2023-10-02 01:35:27,303 INFO [train.py:1114] (1/4) Training started 2023-10-02 01:35:27,304 INFO [train.py:1124] (1/4) Device: cuda:1 2023-10-02 01:35:27,335 INFO [train.py:1136] (1/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '4897f2c0-dirty', 'icefall-git-date': 'Thu Sep 28 11:38:28 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-7-1218101249-5d97868c7c-tp8w2', 'IP address': '10.177.6.147'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 50, 'start_epoch': 21, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 
'use_fp16': True, 'use_tal_csasr': True, 'use_librispeech': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000} 2023-10-02 01:35:27,335 INFO [train.py:1138] (1/4) About to create model 2023-10-02 01:35:28,008 INFO [train.py:1142] (1/4) Number of model parameters: 68625511 2023-10-02 01:35:28,009 INFO [checkpoint.py:112] (1/4) Loading checkpoint from zipformer/exp-w-tal-csasr/epoch-20.pt 2023-10-02 01:35:37,320 INFO [train.py:1157] (1/4) Using DDP 2023-10-02 01:35:37,826 INFO [train.py:1169] (1/4) Loading optimizer state dict 2023-10-02 01:35:38,599 INFO [train.py:1177] (1/4) Loading scheduler state dict 2023-10-02 01:35:38,600 INFO [multi_dataset.py:40] (1/4) About to get multidataset train cuts 2023-10-02 01:35:38,600 INFO [multi_dataset.py:43] (1/4) Loading Aishell-2 in lazy mode 2023-10-02 01:35:38,663 INFO [multi_dataset.py:50] (1/4) Loading TAL-CSASR in lazy mode 2023-10-02 01:35:38,664 INFO [multi_dataset.py:57] (1/4) Loading LibriSpeech in lazy mode 2023-10-02 01:35:38,664 INFO [multi_dataset.py:161] (1/4) About to get train-clean-100 cuts 2023-10-02 
01:35:38,665 INFO [multi_dataset.py:168] (1/4) About to get train-clean-360 cuts 2023-10-02 01:35:38,680 INFO [multi_dataset.py:175] (1/4) About to get train-other-500 cuts 2023-10-02 01:35:48,671 INFO [asr_datamodule.py:218] (1/4) Enable MUSAN 2023-10-02 01:35:48,672 INFO [asr_datamodule.py:219] (1/4) About to get Musan cuts 2023-10-02 01:35:51,350 INFO [asr_datamodule.py:243] (1/4) Enable SpecAugment 2023-10-02 01:35:51,350 INFO [asr_datamodule.py:244] (1/4) Time warp factor: 80 2023-10-02 01:35:51,351 INFO [asr_datamodule.py:254] (1/4) Num frame mask: 10 2023-10-02 01:35:51,351 INFO [asr_datamodule.py:267] (1/4) About to create train dataset 2023-10-02 01:35:51,351 INFO [asr_datamodule.py:294] (1/4) Using DynamicBucketingSampler. 2023-10-02 01:35:51,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:35:51,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:35:51,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:35:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:51,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:51,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:51,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:35:51,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:51,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:35:51,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:51,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:35:52,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:35:52,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:35:52,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:35:52,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:35:52,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:35:52,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:35:52,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:35:53,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:35:53,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:53,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:53,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:53,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:35:53,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:53,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:53,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:53,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:53,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:53,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:35:53,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:35:53,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:35:54,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:35:54,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:54,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:54,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:35:54,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:35:54,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:35:54,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:35:54,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:35:55,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:35:55,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:35:55,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 01:35:55,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:55,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:35:55,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:35:55,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:35:55,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:56,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:35:56,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:35:56,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96
2023-10-02 01:35:56,303 INFO [asr_datamodule.py:309] (1/4) About to create train dataloader
2023-10-02 01:35:56,304 INFO [multi_dataset.py:103] (1/4) About to get multidataset dev cuts
2023-10-02 01:35:56,304 INFO [multi_dataset.py:106] (1/4) Loading Aishell-2 DEV set in lazy mode
2023-10-02 01:35:56,305 INFO [multi_dataset.py:182] (1/4) About to get dev-clean cuts
2023-10-02 01:35:56,306 INFO [multi_dataset.py:189] (1/4) About to get dev-other cuts
2023-10-02 01:35:56,332 INFO [asr_datamodule.py:340] (1/4) About to create dev dataset
2023-10-02 01:35:56,759 INFO [asr_datamodule.py:357] (1/4) About to create dev dataloader
2023-10-02 01:35:56,759 INFO [train.py:1358] (1/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM.
2023-10-02 01:35:56,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625
2023-10-02 01:35:56,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9
2023-10-02 01:35:56,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8
2023-10-02 01:35:56,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875
2023-10-02 01:35:57,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275
2023-10-02 01:35:57,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275
2023-10-02 01:35:57,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training.
Duration: 0.9545625 2023-10-02 01:35:57,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:57,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:35:57,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:57,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:35:57,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:35:57,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:35:58,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:35:58,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:35:58,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:35:58,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:35:58,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:35:58,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:35:58,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:58,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:58,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:58,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:35:59,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:59,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:59,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:59,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:59,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:59,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:35:59,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:35:59,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:00,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:36:00,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:00,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:00,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:36:00,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:36:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:00,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:00,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:36:01,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:36:01,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:01,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 01:36:01,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:01,689 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:36:01,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:36:01,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:01,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:02,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:36:02,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:36:02,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:36:03,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:03,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:36:03,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:36:03,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:36:03,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:03,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:03,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:03,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:03,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:03,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:03,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:36:03,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:03,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:36:04,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:36:04,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:36:04,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 01:36:04,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:36:04,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:36:04,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:04,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:04,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:04,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:04,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:05,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:05,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:05,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:05,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:05,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:36:05,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:05,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:05,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:06,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:36:06,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:06,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:06,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:36:06,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:36:06,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:06,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:06,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:36:07,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. 
Duration: 0.71 2023-10-02 01:36:07,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:07,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:07,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:07,696 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:36:07,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:36:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:07,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:07,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:36:08,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:36:08,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:36:08,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:09,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-10-02 01:36:09,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:09,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:10,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:10,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:10,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:10,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 01:36:10,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 01:36:10,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:10,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:11,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:11,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:11,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 01:36:11,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:11,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 01:36:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:12,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:36:12,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:12,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 01:36:12,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:12,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:36:12,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:13,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:14,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:14,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-02 01:36:14,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 01:36:14,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:14,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:14,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:15,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 01:36:15,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:15,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:15,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:16,018 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 01:36:16,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:36:16,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:36:16,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:16,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 01:36:16,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:36:16,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:16,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:16,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:17,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:17,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 01:36:17,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:17,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:36:18,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 01:36:18,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-10-02 01:36:19,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:36:19,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:19,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:19,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:19,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:36:19,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:36:19,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:19,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:20,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:20,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:36:20,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 01:36:20,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 01:36:20,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:36:20,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 01:36:20,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:20,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 01:36:21,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:21,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:21,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:21,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:21,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 01:36:21,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 01:36:21,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:36:21,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:21,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:21,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:21,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 01:36:22,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 01:36:22,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:36:22,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:22,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:22,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 01:36:22,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 01:36:22,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:22,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 01:36:23,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:36:23,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:23,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:23,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:23,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:24,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 01:36:24,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:24,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:24,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:24,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:24,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:24,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 01:36:24,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 01:36:24,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:24,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:24,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:24,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:25,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 01:36:25,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:25,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:25,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:36:25,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:36:25,851 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 01:36:25,873 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. 
Duration: 0.896 2023-10-02 01:36:26,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:26,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:36:26,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:36:26,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:26,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:27,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:27,074 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 01:36:27,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:36:28,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:36:28,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:28,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:28,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:36:28,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:29,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:29,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:29,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:29,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:29,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:29,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:29,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 01:36:29,618 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 01:36:29,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:29,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:29,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:36:29,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:29,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:36:29,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:36:29,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:36:29,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:29,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:30,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:30,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:30,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:30,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:30,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:36:30,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:30,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:30,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:36:31,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:31,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 01:36:31,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 01:36:31,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 01:36:31,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:31,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:36:31,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:32,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:36:32,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:32,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:32,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:32,919 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 01:36:33,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:33,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:33,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:33,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 01:36:34,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:36:34,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:34,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:34,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 01:36:34,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:34,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:34,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:34,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 01:36:35,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:35,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:35,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:35,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:35,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:35,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:36:35,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:36:35,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 01:36:35,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:36,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:36:36,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 01:36:36,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:36,181 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 01:36:36,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:37,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:37,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 01:36:37,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:37,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:37,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. 
Duration: 0.95 2023-10-02 01:36:38,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:36:38,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:38,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:38,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:38,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:38,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:40,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:40,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:40,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:36:40,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:40,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:36:40,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 01:36:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:40,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:36:40,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:40,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:40,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 01:36:40,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:36:40,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:41,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:42,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:42,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:42,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:43,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:36:43,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 01:36:43,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:43,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:43,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:43,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:36:43,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 01:36:43,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:43,706 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 01:36:43,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:43,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:36:44,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:44,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:36:44,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:44,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:44,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:44,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:36:46,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:46,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:46,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:36:46,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:36:47,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:36:47,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:47,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:47,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 01:36:47,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:36:47,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:47,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:47,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 01:36:47,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:47,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:36:47,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:36:47,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:47,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:48,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:36:48,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:36:48,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:36:48,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:36:48,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:48,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:48,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:49,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:36:49,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:49,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:49,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 01:36:50,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:50,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:50,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 01:36:50,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 01:36:50,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:50,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 01:36:51,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:51,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:51,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:51,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 01:36:51,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:51,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:36:51,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 01:36:51,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:36:52,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 01:36:52,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 01:36:53,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 01:36:53,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:53,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:53,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:53,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 01:36:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:53,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:53,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:53,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:54,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:36:54,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-02 01:36:54,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:54,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:55,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 01:36:55,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:55,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:36:55,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:55,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 01:36:56,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:56,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:36:56,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:56,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:56,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. 
Duration: 0.64 2023-10-02 01:36:56,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:36:56,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:56,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 01:36:57,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:57,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:57,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:57,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:57,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:57,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:57,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:36:57,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:58,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:36:58,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:58,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:58,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 01:36:59,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:59,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 01:36:59,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:59,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 01:36:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:00,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 01:37:00,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:00,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 01:37:00,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:00,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:00,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:00,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:00,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:00,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:00,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:01,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:01,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:37:01,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:01,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:02,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. 
Duration: 0.91 2023-10-02 01:37:02,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:02,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:02,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:02,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:02,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 01:37:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:02,858 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 01:37:02,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 01:37:02,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:03,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:03,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 01:37:03,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:37:03,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:37:03,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:03,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:03,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:04,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:04,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:05,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:37:05,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 01:37:05,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:05,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:05,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:05,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:37:05,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:05,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:05,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 01:37:06,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 01:37:06,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:06,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 01:37:06,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:06,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:37:06,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:06,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 01:37:06,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:06,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:37:06,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:06,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:06,765 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 01:37:06,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 01:37:07,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:07,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:07,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 01:37:07,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 01:37:07,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:09,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 01:37:09,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 01:37:09,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 01:37:09,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:09,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:37:09,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 01:37:10,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:10,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:37:10,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:10,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:10,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 01:37:10,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:37:10,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 01:37:11,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 01:37:11,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:11,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 01:37:11,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:37:11,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:11,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 01:37:11,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:11,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:37:11,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 01:37:11,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:12,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 01:37:12,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:37:12,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:13,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 01:37:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 01:37:14,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:14,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:14,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:14,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:14,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:14,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 01:37:15,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. 
Duration: 0.97 2023-10-02 01:37:15,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 01:37:15,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:15,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:15,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:37:15,546 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 01:37:15,567 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 01:37:15,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:15,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:15,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:37:16,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:37:16,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:37:16,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 01:37:16,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 01:37:16,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:16,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:37:16,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:37:16,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 01:37:17,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:37:17,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 01:37:17,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 01:37:17,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:18,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:18,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 01:37:18,618 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 01:37:18,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:18,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:37:19,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:19,051 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 01:37:19,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 01:37:19,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:19,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:37:19,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:37:19,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:37:20,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:20,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:37:20,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:20,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:20,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:37:20,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:37:21,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:21,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 01:37:21,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:37:21,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:21,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:37:21,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:37:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:21,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-02 01:37:21,666 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 01:37:21,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:22,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:22,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:22,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:37:23,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 01:37:23,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:37:23,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:23,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:23,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:24,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:37:24,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 01:37:24,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:24,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:24,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 01:37:24,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:24,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:25,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 01:37:25,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 01:37:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:25,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 01:37:25,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:25,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:37:25,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:25,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:25,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:25,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:37:26,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:26,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 01:37:26,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:37:26,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:26,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:27,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:27,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:27,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. 
Duration: 0.95 2023-10-02 01:37:27,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 01:37:28,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:28,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:28,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:37:28,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:28,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:28,803 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 01:37:28,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:28,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:37:29,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:37:29,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:37:29,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 01:37:29,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:29,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 01:37:29,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 01:37:29,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:29,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:29,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:30,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:30,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:37:30,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:30,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:30,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-02 01:37:30,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:37:30,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:30,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:30,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:30,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:37:30,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:37:32,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 01:37:32,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 01:37:32,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:32,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:37:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:33,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:37:33,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:37:33,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 01:37:33,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:37:33,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:33,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:33,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 01:37:33,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 01:37:34,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:34,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:34,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:35,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:37:35,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:35,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:36,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:37:36,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:36,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:36,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:37,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 01:37:37,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:37:37,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 01:37:38,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:37:38,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. 
Duration: 0.92 2023-10-02 01:37:38,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:37:38,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:37:38,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:37:39,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:37:39,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:37:39,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:39,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:39,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 01:37:39,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:40,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:40,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:41,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:37:41,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 01:37:41,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:41,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:41,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:41,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:37:42,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:42,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:37:42,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:37:42,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:42,564 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-10-02 01:37:42,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:42,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:42,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:43,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:43,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:37:43,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 01:37:43,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:43,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:37:43,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:37:43,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:43,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 01:37:43,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 01:37:43,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 01:37:43,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:43,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:43,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:43,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:43,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:44,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:44,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:44,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:45,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:45,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 01:37:45,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:45,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:37:45,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:46,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:46,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:46,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 01:37:46,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 01:37:46,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 01:37:46,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:46,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:47,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 01:37:47,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 01:37:47,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:47,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:47,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:37:47,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:48,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:48,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:37:48,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:48,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 01:37:48,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 01:37:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:37:49,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 01:37:50,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:50,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 01:37:50,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:50,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:37:50,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 01:37:51,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:51,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:51,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:51,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:51,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 01:37:52,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 01:37:52,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. 
Duration: 0.9 2023-10-02 01:37:52,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:52,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:52,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:52,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:52,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 01:37:53,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 01:37:53,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 01:37:53,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 01:37:53,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 01:37:53,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 01:37:53,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:53,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. 
Duration: 0.89 2023-10-02 01:37:53,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:53,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:53,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:54,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:54,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:37:54,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:54,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:54,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:37:54,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:55,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:55,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:55,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-10-02 01:37:55,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:37:55,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:55,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:55,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:55,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 01:37:55,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:55,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 01:37:56,013 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 01:37:56,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 01:37:56,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:56,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:37:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 01:37:56,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:56,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:56,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:37:57,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:57,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:57,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 01:37:57,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:37:57,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:37:57,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:57,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:57,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 01:37:57,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:37:58,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:58,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:37:58,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:59,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:37:59,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 01:37:59,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:59,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:59,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:59,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:00,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:00,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:38:00,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:38:00,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:00,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:00,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:00,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:00,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:01,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:01,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:01,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:01,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 01:38:01,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:01,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:01,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 01:38:01,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:01,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 01:38:01,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:02,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 01:38:02,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:02,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:02,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:38:02,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:03,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:03,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:03,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:03,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 01:38:03,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 01:38:04,074 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 01:38:04,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 01:38:04,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:38:04,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:04,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:04,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:04,633 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 01:38:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 01:38:04,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:38:04,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:38:05,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 01:38:05,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:05,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 01:38:05,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:38:05,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 01:38:06,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:06,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:06,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 01:38:06,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:06,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:06,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 01:38:06,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:38:07,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:07,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:38:07,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:08,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 01:38:08,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 01:38:08,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 01:38:08,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:08,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:08,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:08,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:08,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:38:08,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:38:08,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:08,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 01:38:09,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 01:38:09,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:09,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 01:38:09,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 01:38:09,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 01:38:10,186 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 01:38:10,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:38:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:10,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:38:10,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:38:10,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:10,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 01:38:10,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:38:10,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:10,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:10,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:38:11,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:38:11,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:38:11,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 01:38:11,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:38:11,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:11,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 01:38:11,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:11,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:12,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:12,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:38:12,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:12,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:12,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:38:13,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:38:13,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:13,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 01:38:13,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:13,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:38:13,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 01:38:14,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:14,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:14,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 01:38:14,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:14,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 01:38:14,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:38:15,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:15,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:15,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:38:15,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:15,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:38:15,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:16,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:38:16,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:17,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 01:38:17,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:17,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:38:17,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:38:17,803 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 01:38:17,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 01:38:18,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:38:18,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:18,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 01:38:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:18,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:19,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 01:38:19,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:19,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 01:38:19,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:19,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:19,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:19,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:19,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 01:38:19,816 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 01:38:19,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 01:38:19,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 01:38:20,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:20,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 01:38:21,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:21,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:21,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:21,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:38:21,940 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 01:38:22,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:22,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:22,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 01:38:22,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 01:38:22,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:38:22,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:22,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 01:38:22,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:23,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:23,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:23,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:23,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 01:38:23,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:38:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:23,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 01:38:24,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:24,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:24,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 01:38:24,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 01:38:24,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:38:24,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:24,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:24,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:24,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 01:38:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:38:25,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:25,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:38:25,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 01:38:25,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:25,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:38:25,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 01:38:26,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:38:26,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:26,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:26,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 01:38:26,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 01:38:27,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:27,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 01:38:27,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:38:27,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:28,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 01:38:28,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 01:38:28,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:28,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:28,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:28,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 01:38:29,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 01:38:29,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 01:38:29,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:29,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 01:38:29,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. 
Duration: 0.5 2023-10-02 01:38:29,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 01:38:30,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:30,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:31,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:31,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:31,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:31,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:31,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 01:38:31,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:31,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:38:31,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:31,457 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. 
Duration: 0.982 2023-10-02 01:38:31,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 01:38:31,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 01:38:31,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 01:38:32,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:32,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:32,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:38:32,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:32,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:32,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 01:38:33,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:33,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 01:38:33,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. 
Duration: 0.66 2023-10-02 01:38:33,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:33,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:33,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:33,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:34,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:34,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:34,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:38:35,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:38:35,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:35,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 01:38:35,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:35,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:35,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:38:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:38:35,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 01:38:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:35,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 01:38:36,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:36,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 01:38:36,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:38:36,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:36,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 01:38:36,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:37,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 01:38:37,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 01:38:37,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:38:37,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 01:38:37,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 01:38:37,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:37,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:38:38,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:38:38,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:38,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:38:38,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 01:38:39,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 01:38:39,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 01:38:39,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 01:38:39,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:39,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:38:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 01:38:40,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:40,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:40,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:38:40,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:40,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:38:40,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 01:38:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:38:40,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 01:38:40,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 01:38:40,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:41,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:41,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:41,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:38:42,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:42,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:42,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 01:38:42,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:38:42,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:38:42,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:42,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:38:42,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 01:38:42,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:38:43,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:43,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:43,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:38:44,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:44,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 01:38:44,752 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. 
Duration: 0.896 2023-10-02 01:38:44,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:45,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:45,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:38:45,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:45,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 01:38:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:45,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:38:45,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:45,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:45,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 01:38:45,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:45,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. 
Duration: 0.91 2023-10-02 01:38:46,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:38:46,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:38:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 01:38:46,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:38:46,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:47,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:47,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:47,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 01:38:47,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:47,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:47,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 01:38:47,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 01:38:47,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 01:38:47,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:47,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:48,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:38:48,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:48,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:49,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:49,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:49,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 01:38:49,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:49,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 01:38:49,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:38:49,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:49,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 01:38:50,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:50,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:50,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:50,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 01:38:50,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:50,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:50,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 01:38:51,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:51,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:51,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:38:52,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:52,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 01:38:53,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:53,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:53,518 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 01:38:53,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:54,163 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 01:38:54,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:54,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:38:54,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:38:54,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:54,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:38:55,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:55,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:38:55,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:55,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:55,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:55,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:55,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:55,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:55,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:56,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 01:38:56,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 01:38:56,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 01:38:57,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:57,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:57,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:57,711 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 01:38:57,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:58,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:38:58,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:58,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 01:38:58,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:58,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 01:38:59,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 01:38:59,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:38:59,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:59,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:59,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:38:59,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:59,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:38:59,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:59,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 01:38:59,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:38:59,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:38:59,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:39:00,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:39:00,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:39:00,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:39:00,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 01:39:01,036 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 01:39:01,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:01,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:02,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:02,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 01:39:02,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:02,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:02,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. 
Duration: 0.47 2023-10-02 01:39:02,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:03,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:39:03,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:39:03,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:03,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:39:03,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:03,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:39:04,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:39:04,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:39:04,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:04,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:04,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:39:04,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:04,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:39:04,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 01:39:05,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:39:05,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:05,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 01:39:05,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:05,316 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 01:39:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:05,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:05,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:06,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:39:06,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:06,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 01:39:06,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 01:39:06,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 01:39:06,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:07,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 01:39:07,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:07,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 01:39:07,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:07,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 01:39:07,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:39:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 01:39:07,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:39:07,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:07,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 01:39:08,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:08,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:08,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:39:08,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:39:08,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:08,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 01:39:09,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:09,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:39:09,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:39:09,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:09,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:39:09,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 01:39:10,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:10,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:10,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 01:39:11,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:39:11,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:11,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:11,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:11,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:11,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 01:39:11,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:39:11,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 01:39:12,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:12,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:39:12,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 01:39:12,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:39:12,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:12,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:12,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 01:39:12,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:12,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 01:39:13,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:39:13,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:13,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:13,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 01:39:13,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 01:39:13,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 01:39:14,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:14,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 01:39:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:15,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 01:39:15,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:16,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:16,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:39:16,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:16,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:16,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:39:16,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:39:16,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 01:39:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:39:17,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:17,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 01:39:17,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:17,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:17,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 01:39:17,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. 
Duration: 0.86 2023-10-02 01:39:17,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 01:39:17,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:17,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 01:39:18,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:20,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:20,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:20,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 01:39:20,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:20,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 01:39:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:39:20,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:20,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:39:20,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 01:39:21,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:39:21,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 01:39:21,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 01:39:22,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 01:39:22,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:22,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:22,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 01:39:23,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 01:39:24,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:24,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:39:24,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:24,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:39:25,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:25,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 01:39:25,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:25,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:26,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 01:39:26,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:26,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:26,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:26,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:26,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 01:39:26,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:26,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 01:39:26,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:27,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:27,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:28,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 01:39:28,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:39:28,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:29,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:39:29,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:29,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 01:39:29,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:39:29,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:29,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:29,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 01:39:30,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:39:30,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:30,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 01:39:30,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:39:30,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 01:39:30,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:30,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 01:39:30,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 01:39:30,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:30,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:39:31,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:39:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:31,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:31,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:31,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:39:31,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:31,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:31,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:39:32,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:32,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:39:32,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:32,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:33,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 01:39:33,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:33,584 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 01:39:33,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:33,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:39:33,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:34,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 01:39:34,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:39:34,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 01:39:34,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 01:39:34,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:34,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:35,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:35,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 01:39:35,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 01:39:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 01:39:35,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:35,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:39:36,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 01:39:36,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 01:39:36,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:36,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:36,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:36,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 01:39:36,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:39:36,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:39:37,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:37,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:37,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:37,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:38,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:39:38,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 01:39:38,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:38,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:38,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:38,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 01:39:39,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 01:39:39,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:39,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 01:39:39,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:39:39,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:39:39,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:39,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:39:39,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 01:39:39,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:40,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:40,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 01:39:40,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:40,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:39:40,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 01:39:40,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:39:41,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:39:41,483 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 01:39:41,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:41,545 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. 
Duration: 0.896 2023-10-02 01:39:41,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:42,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:42,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 01:39:42,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:42,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 01:39:42,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:43,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:43,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:43,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:43,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:43,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:39:43,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. 
Duration: 0.56 2023-10-02 01:39:43,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 01:39:43,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:43,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 01:39:43,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 01:39:43,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:43,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:43,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:43,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:44,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:44,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:44,368 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 01:39:44,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:39:44,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:44,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:39:44,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:39:44,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 01:39:44,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:44,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 01:39:45,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 01:39:45,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 01:39:45,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:45,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:45,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:45,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. 
Duration: 0.68 2023-10-02 01:39:45,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 01:39:46,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:47,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:39:47,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:47,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 01:39:47,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:39:47,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:48,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:48,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:48,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:48,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. 
Duration: 0.83 2023-10-02 01:39:48,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:48,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:39:48,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:48,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 01:39:48,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 01:39:48,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 01:39:49,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:49,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:49,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 01:39:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:50,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:39:50,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:39:50,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:50,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:50,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:50,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 01:39:50,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 01:39:50,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 01:39:51,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:51,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 01:39:51,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:51,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:52,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 01:39:52,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 01:39:52,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:52,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 01:39:52,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:52,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 01:39:53,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 01:39:53,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:53,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 01:39:53,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:53,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:53,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:54,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-02 01:39:54,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:39:54,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:54,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:54,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:54,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:54,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:54,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:39:55,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:39:56,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:56,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:56,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 01:39:56,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 01:39:56,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 01:39:56,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:57,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:57,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:57,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 01:39:57,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 01:39:57,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 01:39:57,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 01:39:57,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:57,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:57,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:57,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 01:39:57,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:57,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 01:39:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:58,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:58,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:58,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:39:58,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 01:39:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 01:39:58,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:39:58,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:39:59,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 01:40:00,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:40:00,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 01:40:00,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:00,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:00,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:00,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:01,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:01,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:01,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:01,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:01,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:01,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:01,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:40:01,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:40:01,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:01,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 01:40:02,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:02,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 01:40:02,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 01:40:02,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 01:40:02,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:02,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:02,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:02,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:02,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. 
Duration: 0.97 2023-10-02 01:40:02,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:02,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:02,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:03,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 01:40:03,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:03,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:03,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 01:40:03,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:03,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:03,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:03,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:03,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 01:40:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 01:40:04,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:40:05,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:05,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:05,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:40:05,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:05,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:05,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:06,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 01:40:06,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:06,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:06,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 01:40:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:40:06,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 01:40:06,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 01:40:06,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:07,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 01:40:07,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:07,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:07,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:08,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:40:08,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 01:40:08,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:40:08,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:08,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 01:40:08,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:08,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:08,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:08,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:09,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:09,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:09,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:09,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:09,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:09,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:40:09,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 01:40:10,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:10,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:10,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 01:40:10,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:10,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:11,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:40:11,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 01:40:11,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:11,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:11,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:11,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. 
Duration: 0.43 2023-10-02 01:40:11,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:12,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 01:40:12,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:12,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:12,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:40:12,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 01:40:12,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:12,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 01:40:13,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:14,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:14,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:14,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:40:14,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:14,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:14,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:14,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:15,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:15,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 01:40:15,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:15,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 01:40:15,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:15,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:16,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 01:40:16,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:40:16,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:16,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:16,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:16,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:16,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:17,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 01:40:17,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:17,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:40:17,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:17,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:40:17,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 01:40:17,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:17,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:18,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:18,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:40:18,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:18,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:40:18,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:19,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:19,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:19,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:19,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 01:40:19,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 01:40:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:20,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 01:40:20,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 01:40:20,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 01:40:20,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:20,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:20,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:20,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:21,093 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 01:40:21,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 01:40:21,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:21,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 01:40:21,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 01:40:21,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:40:21,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:21,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:22,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 01:40:23,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:23,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 01:40:23,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:23,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:23,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 01:40:23,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 01:40:23,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:23,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:23,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 01:40:23,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:24,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:24,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:24,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:24,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:24,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:40:24,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:24,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:40:24,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:24,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:25,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:25,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 01:40:25,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 01:40:25,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 01:40:26,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:26,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 01:40:26,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:40:27,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:27,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 01:40:27,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:40:27,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:28,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 01:40:28,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:28,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:28,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:28,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:28,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:28,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:40:29,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:29,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:40:29,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:29,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:40:29,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:29,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 01:40:29,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:29,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:29,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:40:29,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 01:40:29,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 01:40:30,112 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 01:40:30,190 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 01:40:30,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:40:30,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:30,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 01:40:30,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:30,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 01:40:30,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:30,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:30,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:40:30,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:30,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:30,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 01:40:31,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:31,617 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 01:40:31,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:40:31,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:40:32,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:32,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:32,417 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 01:40:32,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 01:40:32,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:32,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:32,676 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 01:40:32,714 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 01:40:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 01:40:33,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:33,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 01:40:33,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 01:40:34,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. 
Duration: 0.78 2023-10-02 01:40:34,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 01:40:34,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:34,666 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 01:40:34,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 01:40:34,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 01:40:34,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 01:40:34,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:35,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 01:40:35,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:36,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:36,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 01:40:36,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 01:40:36,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 01:40:36,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:37,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:40:37,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:37,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:37,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:37,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:40:37,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:40:37,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:37,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:37,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 01:40:37,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:37,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:37,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:38,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:38,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:38,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:40:38,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 01:40:38,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:40:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:38,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:39,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 01:40:39,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:39,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:39,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:39,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:40:39,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:39,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:40,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:40,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:40,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:40:40,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-10-02 01:40:40,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:40,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:40,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:41,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:41,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:41,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:40:41,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:41,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:40:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 01:40:41,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:42,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:42,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 01:40:42,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:42,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:43,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:43,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:43,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:43,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:43,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 01:40:43,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:40:43,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:43,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 01:40:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 01:40:44,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:44,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:40:45,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:45,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:45,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:45,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 01:40:45,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:40:45,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:45,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 01:40:46,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:40:46,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:46,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:40:46,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 01:40:46,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 01:40:46,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:46,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 01:40:46,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:47,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:47,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:40:47,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:47,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:40:47,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:47,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 01:40:47,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 01:40:47,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:47,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:48,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:48,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:48,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:48,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:48,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:48,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:48,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 01:40:48,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:49,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:49,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 01:40:50,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:40:50,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:50,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:50,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:50,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:50,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:50,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 01:40:50,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:40:51,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:51,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:51,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:51,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:51,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:52,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:52,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:52,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:52,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 01:40:52,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:52,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:40:52,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:40:53,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:53,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:54,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 01:40:54,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:54,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 01:40:54,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:54,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:54,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:54,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:40:54,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:54,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:40:54,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:55,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:55,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:55,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:40:55,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:40:55,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:56,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:56,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:56,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 01:40:56,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:57,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:40:57,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 01:40:57,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 01:40:58,278 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 01:40:58,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:58,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:58,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:58,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:58,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 01:40:58,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 01:40:58,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:58,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:58,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:59,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:40:59,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:59,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 01:40:59,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 01:40:59,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 01:40:59,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:59,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:59,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 01:40:59,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:41:00,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 01:41:00,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:41:00,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:41:00,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:00,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:00,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 01:41:00,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:41:00,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 01:41:00,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:00,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 01:41:00,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 01:41:00,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 01:41:01,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:41:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 01:41:01,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:41:01,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:41:01,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:02,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:02,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 01:41:02,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:02,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:02,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:02,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 01:41:02,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 01:41:02,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 01:41:03,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 01:41:03,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:03,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 01:41:03,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:04,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:04,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:04,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 01:41:04,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:04,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:04,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:41:04,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:04,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:41:04,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 01:41:04,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 01:41:04,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:04,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:04,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:04,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:05,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:05,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:41:05,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:05,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:05,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:05,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:41:05,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:06,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:06,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:06,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:41:07,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:07,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 01:41:07,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:07,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:07,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:07,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:07,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:07,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 01:41:07,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:07,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:07,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:08,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 01:41:08,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:41:08,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:08,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:08,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:08,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:08,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:08,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:08,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:41:08,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 01:41:08,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:41:09,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:09,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:09,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:09,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:41:09,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:09,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:09,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 01:41:09,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 01:41:09,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:41:09,750 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. 
Duration: 0.768 2023-10-02 01:41:09,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:09,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:41:09,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 01:41:09,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:09,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 01:41:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 01:41:09,977 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 01:41:10,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 01:41:10,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:10,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:10,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:10,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:41:10,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:41:10,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:10,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:11,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:11,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 01:41:12,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:12,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:12,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:12,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:12,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:41:12,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:13,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:41:13,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 01:41:13,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 01:41:13,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:41:14,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 01:41:14,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:14,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:14,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:41:14,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:14,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 01:41:14,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:41:15,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:15,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 01:41:15,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:16,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:16,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:16,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:16,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 01:41:16,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:16,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 01:41:16,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:16,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:41:17,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:17,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:17,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:41:17,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:17,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:17,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:41:17,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:17,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:41:17,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:41:17,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:18,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:41:18,117 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 01:41:18,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:18,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 01:41:18,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 01:41:18,463 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 01:41:18,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:18,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:18,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:18,891 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 01:41:19,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:19,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:19,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:19,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:41:20,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:41:20,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 01:41:20,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 01:41:20,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:21,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:21,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 01:41:21,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:21,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:21,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:41:21,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:21,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:41:21,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:41:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 01:41:22,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:41:22,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:22,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:22,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:22,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:22,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:23,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:23,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:23,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:41:23,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:41:23,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:41:24,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 01:41:25,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:41:25,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:41:25,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 01:41:25,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:25,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:41:25,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 01:41:25,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:25,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:26,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:26,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:26,477 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 01:41:26,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:41:27,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:27,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:41:27,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:27,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:27,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 01:41:27,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:27,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:27,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:27,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:41:28,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:28,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:28,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 01:41:28,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:29,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:41:29,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:30,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:30,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:41:30,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:30,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 01:41:30,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:41:30,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:30,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:30,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:30,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:41:30,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:41:30,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:41:30,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 01:41:30,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:30,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:30,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 01:41:31,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:31,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:31,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:31,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:32,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:41:32,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 01:41:32,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:32,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:32,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 01:41:32,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:32,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 01:41:33,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 01:41:34,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:34,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:34,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:34,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:34,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-10-02 01:41:34,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:35,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 01:41:35,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:35,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:35,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:35,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:41:35,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 01:41:35,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:35,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:35,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:35,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:36,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 01:41:36,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 01:41:36,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:36,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:36,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:36,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 01:41:36,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:41:37,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 01:41:37,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:41:37,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 01:41:38,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 01:41:38,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:38,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 01:41:38,504 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 01:41:38,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 01:41:38,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 01:41:38,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:39,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:39,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:39,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:39,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 01:41:39,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 01:41:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:41:40,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:40,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. 
Duration: 0.83 2023-10-02 01:41:40,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:40,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:40,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 01:41:41,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:41,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 01:41:41,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:41:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 01:41:43,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:43,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:43,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:43,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 01:41:43,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 01:41:44,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:44,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:44,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:44,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:44,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:41:44,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:44,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:44,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:44,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:41:45,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:45,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:41:45,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. 
Duration: 0.74 2023-10-02 01:41:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 01:41:45,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:45,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:45,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 01:41:45,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 01:41:45,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 01:41:45,528 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 01:41:45,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 01:41:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:45,975 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-10-02 01:41:46,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:46,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:41:46,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:46,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:46,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:46,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:47,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 01:41:47,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:47,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:48,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:41:48,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:41:48,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 01:41:48,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 01:41:48,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:48,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:41:48,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:49,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:41:49,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:49,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:49,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:49,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 01:41:49,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:50,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:50,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:41:50,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:50,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:50,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:50,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:50,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 01:41:50,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:50,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:51,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:51,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:52,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:41:52,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 01:41:52,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 01:41:52,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:52,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 01:41:52,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:52,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:41:52,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:41:53,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:53,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:53,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 01:41:53,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:53,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:54,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:41:54,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:41:54,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:54,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 01:41:55,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:41:55,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:55,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:41:55,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:41:55,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 01:41:55,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:55,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:55,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 01:41:55,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:55,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-10-02 01:41:55,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:56,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:56,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:57,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:41:57,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 01:41:57,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:57,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:57,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:57,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:58,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:58,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:41:58,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. 
Duration: 0.91 2023-10-02 01:41:58,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:58,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:58,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:58,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:41:59,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 01:41:59,054 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 01:41:59,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 01:41:59,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 01:41:59,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 01:41:59,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:59,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. 
Duration: 0.76 2023-10-02 01:41:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 01:42:00,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:00,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:42:01,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:01,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 01:42:01,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:01,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 01:42:01,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:01,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:02,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:02,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. 
Duration: 0.4555625 2023-10-02 01:42:02,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:42:02,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:02,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:02,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:42:02,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 01:42:02,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:42:02,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:02,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 01:42:03,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:03,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:03,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:03,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:42:03,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:42:04,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:04,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:42:04,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:04,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:04,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:42:04,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:42:05,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:05,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:06,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:06,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 01:42:06,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:42:06,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:06,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:42:06,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:42:06,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:07,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:07,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:07,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 01:42:07,629 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 01:42:07,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:07,681 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 01:42:07,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 01:42:07,788 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 01:42:08,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:42:08,037 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 01:42:08,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 01:42:08,286 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 01:42:08,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:08,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 01:42:08,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 01:42:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:42:08,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 01:42:09,037 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 01:42:09,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 01:42:10,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:10,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:10,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:42:10,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 01:42:10,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:42:10,868 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 01:42:11,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:11,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 01:42:11,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:11,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:11,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 01:42:11,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:11,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:12,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:42:12,343 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 01:42:12,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:12,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:42:12,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:12,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:42:12,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 01:42:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:13,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:13,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:13,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 01:42:13,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:13,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 01:42:14,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 01:42:14,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:14,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:42:14,994 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 01:42:15,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:15,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:15,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:42:15,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:15,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:15,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 01:42:15,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:42:15,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 01:42:16,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 01:42:16,121 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 01:42:16,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:16,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 01:42:16,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:16,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 01:42:16,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:16,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:42:16,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:17,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:17,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 01:42:17,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. 
Duration: 0.76 2023-10-02 01:42:17,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:17,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 01:42:17,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:17,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:18,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:42:18,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:18,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:18,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:18,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:18,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:18,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:18,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 01:42:19,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:19,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:19,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:19,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:19,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:42:20,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:20,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:20,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:20,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 01:42:20,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:20,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:42:20,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:20,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:42:20,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:21,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:21,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 01:42:21,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:21,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:42:21,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:21,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:21,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:21,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:21,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:42:21,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:42:21,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:42:21,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 01:42:21,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:22,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:22,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:42:22,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:22,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:42:22,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 01:42:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:23,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:42:23,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:42:24,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:24,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:24,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:24,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:42:24,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:24,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:24,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:24,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:24,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:24,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:42:25,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 01:42:25,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:25,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:25,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:26,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:26,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:26,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:26,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:26,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:26,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:26,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:42:26,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:27,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 01:42:27,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:27,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:27,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 01:42:27,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 01:42:27,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:28,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:28,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:28,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:28,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:42:29,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:29,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:42:29,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:29,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:29,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 01:42:29,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:29,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:29,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 01:42:29,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:29,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:29,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:29,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 01:42:30,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:30,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:30,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:30,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:30,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:42:30,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:42:30,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:30,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:30,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:42:31,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:31,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 01:42:31,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:31,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:31,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:42:33,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:33,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:33,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 01:42:33,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:33,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 01:42:33,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:42:34,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:34,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. 
Duration: 0.52 2023-10-02 01:42:34,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:34,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:34,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 01:42:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:42:34,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:42:34,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:34,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:34,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 01:42:34,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:34,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:35,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:35,061 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. 
Duration: 0.708 2023-10-02 01:42:35,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 01:42:35,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:35,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:42:35,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:35,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 01:42:36,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:42:36,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 01:42:36,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:36,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:37,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 01:42:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:37,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:37,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:37,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:42:38,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:38,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:38,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:38,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:38,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:38,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 01:42:38,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:39,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 01:42:39,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:39,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:39,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:39,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:39,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:39,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:39,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:42:39,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:42:39,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:42:40,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:40,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 01:42:40,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:42:40,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:40,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:40,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 01:42:40,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:40,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:40,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:40,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 01:42:41,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:41,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:42:41,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:41,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:42,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 01:42:42,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:42,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:42,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:42,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:42,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:42,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 01:42:43,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 01:42:43,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:43,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 01:42:43,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:43,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 01:42:43,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-02 01:42:43,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:44,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:44,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:44,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:45,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:42:45,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:42:45,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:42:45,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:42:45,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 01:42:45,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:45,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:46,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:42:46,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:46,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:46,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:46,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:46,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:46,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:46,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:47,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:47,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:47,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:47,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 01:42:47,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. 
Duration: 0.84 2023-10-02 01:42:47,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:42:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:47,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 01:42:47,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:48,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:48,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:48,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:42:48,097 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 01:42:48,137 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 01:42:48,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:48,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:48,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 01:42:48,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:48,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:48,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 01:42:49,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:49,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 01:42:49,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 01:42:49,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:49,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:49,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:49,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:49,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:42:50,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:42:50,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 01:42:50,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:42:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:51,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 01:42:51,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 01:42:51,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:51,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 01:42:51,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:51,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:51,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:51,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:42:51,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:52,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:52,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:52,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 01:42:52,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 01:42:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:52,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:42:53,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 01:42:53,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:53,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:54,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:54,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 01:42:54,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 01:42:55,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:55,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 01:42:55,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:42:55,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:55,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 01:42:56,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:56,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:56,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:56,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. 
Duration: 0.94 2023-10-02 01:42:56,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 01:42:56,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:42:56,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:57,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:57,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:57,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:57,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:57,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:57,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:42:57,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:57,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:57,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 01:42:57,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 01:42:58,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 01:42:58,333 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 01:42:58,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:58,519 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 01:42:58,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 01:42:58,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 01:42:58,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:42:58,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 01:42:59,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 01:42:59,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:59,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:00,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:00,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:00,059 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 01:43:00,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:00,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 01:43:00,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:00,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 01:43:00,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:00,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. 
Duration: 0.49 2023-10-02 01:43:01,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:01,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:01,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:01,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:01,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:43:01,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:01,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:01,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:43:01,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:43:01,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:01,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:01,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:43:01,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 01:43:02,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:02,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:02,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:43:02,524 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 01:43:02,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 01:43:02,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:02,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:43:02,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 01:43:02,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:03,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:43:04,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 01:43:05,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 01:43:05,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:05,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:43:05,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:05,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:05,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:05,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 01:43:05,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 01:43:05,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:06,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:43:06,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:06,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 01:43:06,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:06,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:43:06,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:06,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:06,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:43:06,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 01:43:07,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:43:07,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:07,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:43:07,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:43:07,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:43:07,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 01:43:07,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:07,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 01:43:07,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:43:08,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 01:43:08,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:43:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:43:08,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 01:43:08,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 01:43:08,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:43:08,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:43:09,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:09,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:43:09,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:09,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:09,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 01:43:09,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:09,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:09,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:10,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:10,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 01:43:10,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 01:43:10,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. 
Duration: 0.94 2023-10-02 01:43:10,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:10,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:43:11,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:11,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:11,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:11,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:43:11,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:43:11,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:11,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:11,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:11,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:11,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:43:12,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:12,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 01:43:12,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:12,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:43:13,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:13,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:43:13,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:13,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:13,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:13,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:14,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:14,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:43:14,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:14,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:14,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:43:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:43:14,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 01:43:14,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:14,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:14,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 01:43:14,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:15,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:15,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:43:15,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 01:43:15,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 01:43:15,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 01:43:16,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 01:43:16,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:43:16,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:16,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:16,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:43:17,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:17,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 01:43:18,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:43:18,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:18,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:43:18,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:18,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:43:18,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:18,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 01:43:18,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:18,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:18,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 01:43:18,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:19,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:43:19,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 01:43:19,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 01:43:19,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:43:19,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:19,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:19,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:19,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:19,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:19,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:20,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:20,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:20,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:20,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:43:20,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:20,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. 
Duration: 0.78 2023-10-02 01:43:20,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:20,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 01:43:20,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:20,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:20,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 01:43:22,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 01:43:22,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:22,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:22,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:22,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 01:43:22,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:43:22,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:43:23,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 01:43:23,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:23,310 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 01:43:23,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 01:43:23,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:23,689 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 01:43:23,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:43:23,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 01:43:23,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 01:43:23,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 01:43:23,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:43:23,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:24,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:24,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 01:43:24,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:24,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:24,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:24,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:43:24,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 01:43:24,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:25,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:43:25,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:25,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-10-02 01:43:25,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 01:43:26,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:26,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:43:26,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:43:26,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:26,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:43:26,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:43:26,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:43:26,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 01:43:26,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:43:26,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:26,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:43:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:26,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 01:43:26,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:26,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 01:43:27,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:27,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 01:43:27,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 01:43:27,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:27,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:27,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 01:43:27,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:43:27,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:43:27,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:27,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:28,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:28,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:28,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 01:43:29,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:29,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:43:29,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:29,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:29,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 01:43:29,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:43:30,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:30,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:31,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:32,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 01:43:32,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:32,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 01:43:32,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:43:33,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:33,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:43:33,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:33,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 01:43:33,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. 
Duration: 0.5 2023-10-02 01:43:33,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 01:43:33,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 01:43:34,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:43:35,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:35,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:35,343 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 01:43:35,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:43:35,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 01:43:35,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 01:43:35,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. 
Duration: 0.55 2023-10-02 01:43:35,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 01:43:36,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:36,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:43:36,345 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 01:43:36,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:36,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:36,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 01:43:36,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:43:37,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:37,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 01:43:37,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:43:37,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:37,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:37,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:37,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:43:38,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:43:38,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:38,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:38,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:38,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 01:43:38,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:38,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:39,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:39,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:40,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 01:43:40,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:40,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:40,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:40,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:40,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:43:41,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 01:43:41,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:41,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 01:43:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:43:41,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:43:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:41,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 01:43:41,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 01:43:41,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:41,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:41,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:41,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:43:41,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:43:42,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:42,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:42,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 01:43:42,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:42,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:42,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 01:43:42,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:42,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 01:43:42,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 01:43:43,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 01:43:43,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:43,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:43:43,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:44,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:44,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:43:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:43:44,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:44,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:44,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 01:43:45,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:45,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:45,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:45,234 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 01:43:45,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:43:45,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:43:45,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:43:45,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:45,515 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 01:43:45,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:45,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:43:46,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:46,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 01:43:46,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 01:43:46,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:46,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:43:46,402 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. 
Duration: 0.768 2023-10-02 01:43:46,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 01:43:46,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:46,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 01:43:46,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:47,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:43:47,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:47,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:47,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:47,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:47,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:43:47,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:47,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:43:47,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:43:47,827 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 01:43:47,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 01:43:48,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:43:48,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:48,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:48,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:48,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:49,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:43:49,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:49,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:43:49,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:43:49,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:43:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 01:43:49,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:49,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:49,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:43:49,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:50,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:50,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:50,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 01:43:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:50,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:51,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:51,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:51,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 01:43:51,352 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 01:43:51,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:51,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 01:43:51,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 01:43:51,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:51,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:51,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:43:51,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 01:43:51,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:51,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:43:52,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:52,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:52,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:52,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:53,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:53,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:53,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:53,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:43:54,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:54,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:54,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:54,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 01:43:54,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:43:54,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 01:43:54,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:54,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 01:43:54,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:54,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:55,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:55,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. 
Duration: 0.66 2023-10-02 01:43:55,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:43:55,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:55,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:56,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 01:43:56,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:43:56,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:43:56,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 01:43:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:56,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 01:43:56,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:43:56,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:56,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:43:56,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:56,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 01:43:57,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 01:43:57,968 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 01:43:57,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:58,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:58,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:43:58,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:58,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:43:58,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:43:58,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 01:43:59,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:59,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:59,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:59,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:44:00,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:44:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 01:44:00,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:00,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:00,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 01:44:00,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:00,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:44:00,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:01,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:01,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:01,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:01,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:02,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:02,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 01:44:03,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:44:03,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 01:44:03,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 01:44:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:03,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 01:44:03,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 01:44:04,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:04,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:44:04,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:44:04,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:04,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:04,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:05,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:05,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 01:44:05,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 01:44:05,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:44:05,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 01:44:06,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:07,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 01:44:07,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:44:07,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:07,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:44:07,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:07,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 01:44:07,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:07,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:08,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:08,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-10-02 01:44:08,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:09,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:09,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:09,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:09,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:09,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:09,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:10,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:11,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:44:11,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 01:44:11,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 01:44:11,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:44:11,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:11,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 01:44:12,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:12,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:12,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:12,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:12,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:44:12,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:12,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:12,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 01:44:12,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:44:12,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:44:12,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:13,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:13,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 01:44:13,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:13,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:13,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:44:13,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:13,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:13,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:14,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 01:44:14,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. 
Duration: 0.61 2023-10-02 01:44:14,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 01:44:14,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:14,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:14,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:14,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:44:14,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:44:14,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:44:15,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:15,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 01:44:15,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 01:44:15,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:16,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:44:16,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:16,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 01:44:16,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:16,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:16,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 01:44:16,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 01:44:17,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:17,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:17,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:17,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:17,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 01:44:18,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:18,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:44:18,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:18,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:44:18,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:18,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:18,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:44:19,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:19,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:19,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:19,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 01:44:20,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:44:20,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:20,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:44:20,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:20,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:20,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 01:44:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:20,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:44:21,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:21,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:21,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:44:21,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 01:44:21,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:44:21,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 01:44:21,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:44:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:21,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:22,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 01:44:22,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 01:44:22,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:23,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:44:23,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 01:44:23,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 01:44:23,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:24,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:44:24,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:24,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:24,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:25,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:25,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:25,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:25,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:44:25,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:44:25,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 01:44:25,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:44:25,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:25,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:26,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:26,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:44:26,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:26,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 01:44:26,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:26,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:26,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:44:26,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:44:27,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:27,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:27,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 01:44:27,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:27,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:44:27,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 01:44:28,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:29,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:29,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:29,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:44:29,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:44:29,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. 
Duration: 0.78 2023-10-02 01:44:30,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 01:44:30,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 01:44:30,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:30,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:30,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 01:44:30,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:30,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:30,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:30,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 01:44:30,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 01:44:31,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:31,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. 
Duration: 0.69 2023-10-02 01:44:31,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 01:44:31,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:31,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 01:44:32,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 01:44:32,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:32,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:44:32,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:44:32,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:44:32,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:33,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 01:44:33,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:44:33,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:44:33,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 01:44:33,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:33,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:33,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:33,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:34,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 01:44:34,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 01:44:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:34,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 01:44:34,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:34,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:34,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 01:44:35,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:35,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:44:35,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:44:35,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:35,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:35,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:35,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:35,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:35,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:44:36,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:36,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:36,572 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. 
Duration: 0.896 2023-10-02 01:44:36,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:36,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:36,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:44:36,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:37,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:44:37,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 01:44:37,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:38,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:44:38,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:38,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:38,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:44:38,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 01:44:38,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:38,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:44:38,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:44:38,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:44:39,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:39,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:39,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:39,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:39,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:39,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:39,570 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-10-02 01:44:40,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:40,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:44:40,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:44:40,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 01:44:40,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:40,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:40,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 01:44:40,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:40,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:41,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:41,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:44:41,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 01:44:42,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:42,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 01:44:42,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:42,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 01:44:42,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:42,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:43,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:43,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 01:44:43,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:43,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:44:43,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:43,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:44:43,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:43,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 01:44:43,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 01:44:43,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:43,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:43,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:44,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:44:44,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:44,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:44,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:44,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 01:44:44,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. 
Duration: 0.4181875 2023-10-02 01:44:44,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:44:44,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 01:44:45,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:45,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:45,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:45,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:45,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:45,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:45,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:46,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:46,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:46,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. 
Duration: 0.96 2023-10-02 01:44:47,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:47,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:47,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:47,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 01:44:47,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 01:44:47,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:47,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:48,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:48,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:48,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:44:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 01:44:49,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:44:49,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:49,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:49,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:44:49,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:50,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:44:50,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:44:50,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:51,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:44:51,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:51,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:52,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:44:52,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. 
Duration: 0.99 2023-10-02 01:44:52,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:52,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:52,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:44:52,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:52,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:44:52,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:52,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:52,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:53,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 01:44:53,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:44:53,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:44:53,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:53,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:53,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:44:53,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:44:53,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:54,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:54,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:54,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:44:54,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 01:44:54,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:56,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:44:56,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 01:44:56,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 01:44:56,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:56,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:56,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:57,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 01:44:57,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 01:44:57,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 01:44:57,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:57,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:57,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:58,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 01:44:58,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:58,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 01:44:58,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:44:58,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:58,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:44:59,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:59,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:45:00,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 01:45:00,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:00,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:00,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:00,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 01:45:01,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:01,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:01,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:45:01,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:01,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:01,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:01,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:45:01,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 01:45:01,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 01:45:01,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:01,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:45:02,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:02,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:02,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 01:45:02,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 01:45:02,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:02,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 01:45:02,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:45:03,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:03,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:03,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:03,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 01:45:04,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-10-02 01:45:04,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:04,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:04,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:45:04,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:45:05,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:05,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 01:45:05,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:05,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 01:45:05,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:45:05,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:05,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:05,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:05,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:05,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:05,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 01:45:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:06,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:06,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:06,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:45:06,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:45:06,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:06,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:06,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:45:06,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 01:45:06,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:06,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 01:45:07,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:07,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 01:45:07,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 01:45:07,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:07,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:07,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 01:45:07,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:45:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:08,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:09,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:09,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:09,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:09,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:09,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:10,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:45:10,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:11,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:45:11,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:45:11,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 01:45:11,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 01:45:11,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:45:11,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 01:45:11,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:11,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 01:45:12,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:12,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 01:45:12,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:12,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:12,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:45:13,793 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 01:45:13,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:45:13,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 01:45:13,952 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 01:45:13,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:14,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:14,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:45:14,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:14,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 01:45:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:14,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:45:14,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 01:45:14,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:45:14,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:45:15,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:15,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:16,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 01:45:16,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 01:45:16,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 01:45:16,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:16,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:45:17,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:17,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:45:17,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:45:17,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:45:17,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 01:45:18,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:18,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:18,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 01:45:19,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:20,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:20,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:20,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:20,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:20,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 01:45:20,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 01:45:20,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 01:45:20,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:20,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 01:45:20,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:21,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:21,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:21,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:21,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:21,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:45:21,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:45:21,352 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 01:45:21,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 01:45:21,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:22,352 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 01:45:22,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:45:22,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:22,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 01:45:22,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:23,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:23,181 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 01:45:23,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:23,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 01:45:23,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:23,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:45:23,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:45:23,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:23,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:45:23,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:23,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 01:45:23,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:24,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 01:45:24,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:45:24,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 01:45:24,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:45:24,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:24,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:45:25,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:25,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:25,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:45:25,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 01:45:25,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:25,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:25,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:45:26,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:26,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:27,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:27,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:27,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 01:45:27,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:27,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:27,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:27,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:28,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:45:28,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 01:45:28,379 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 01:45:28,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:29,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 01:45:29,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:29,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:29,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:45:29,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:29,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:30,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:30,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 01:45:30,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:45:30,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:30,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 01:45:31,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:32,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 01:45:32,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:32,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:32,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-10-02 01:45:32,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 01:45:32,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:32,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:32,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:32,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:33,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 01:45:33,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 01:45:33,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 01:45:33,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 01:45:33,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:33,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:33,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:45:33,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:33,779 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 01:45:34,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:34,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:34,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:34,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:45:34,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:45:34,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:34,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:34,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 01:45:34,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:34,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 01:45:34,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:34,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:34,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 01:45:35,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 01:45:36,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:36,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:36,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 01:45:36,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:36,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:45:36,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:45:36,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. 
Duration: 0.96 2023-10-02 01:45:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:45:36,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:45:37,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 01:45:37,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:37,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:37,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:37,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:38,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:38,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:38,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:38,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:38,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 01:45:39,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:45:39,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:39,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:39,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:39,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:40,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 01:45:40,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:40,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 01:45:40,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 01:45:40,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 01:45:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:41,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 01:45:41,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:41,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:41,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:41,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:45:41,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:45:41,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:41,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:45:42,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:42,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:42,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 01:45:42,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 01:45:42,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 01:45:42,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 01:45:42,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:42,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:43,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:43,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:43,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 01:45:43,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:43,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:43,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 01:45:43,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:44,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 01:45:44,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 01:45:44,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:45,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:45,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 01:45:45,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:45,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:45:45,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:45:45,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 01:45:45,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:45,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:45:45,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:45:45,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 01:45:45,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:45:45,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:45,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:45,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:45,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 01:45:45,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:46,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:46,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 01:45:46,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:46,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:46,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:45:46,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:46,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:45:47,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 01:45:47,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 01:45:47,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:47,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:47,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:47,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:45:48,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:48,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:48,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 01:45:48,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:48,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:48,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:45:48,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:49,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:45:49,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 01:45:49,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:49,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:45:49,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:49,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:49,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:50,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:50,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 01:45:50,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:50,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 01:45:50,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:45:50,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:51,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:51,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 01:45:51,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:52,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:45:52,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:52,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 01:45:52,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:53,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:53,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:45:53,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:45:54,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:45:54,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 01:45:54,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:54,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:54,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:55,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:55,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:55,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:55,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:55,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:55,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:55,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:45:55,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:55,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:56,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 01:45:56,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 01:45:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:56,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:56,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:56,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:45:56,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:56,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:56,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:45:56,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 01:45:57,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:57,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:58,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 01:45:58,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:45:58,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 01:45:58,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:58,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:45:58,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:58,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:58,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 01:45:58,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:45:58,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 01:45:59,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:59,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:59,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:59,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:59,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:59,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:59,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 01:46:00,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:00,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:00,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:00,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:46:01,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:01,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 01:46:01,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:01,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:01,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:46:01,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 01:46:02,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:46:02,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:46:02,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 01:46:02,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:46:02,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 01:46:03,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 01:46:03,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:46:03,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:46:03,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:03,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:03,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:03,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:03,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 01:46:03,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 01:46:04,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:04,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:04,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:46:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:46:04,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:04,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 01:46:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 01:46:04,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 01:46:04,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:04,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 01:46:04,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 01:46:04,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:05,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 01:46:05,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:05,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:05,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:46:05,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 01:46:05,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:46:05,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:05,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:05,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:05,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:05,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:06,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:06,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:06,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:07,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 01:46:07,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 01:46:07,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:07,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:08,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:46:08,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:46:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:08,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:46:08,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:09,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:09,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:46:09,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-10-02 01:46:09,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:10,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:10,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 01:46:10,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:11,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:46:12,049 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 01:46:12,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:46:12,309 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 01:46:12,362 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 01:46:12,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:46:12,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:12,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:12,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:12,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:12,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:12,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 01:46:12,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:12,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:12,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:12,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 01:46:12,981 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 01:46:12,985 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-10-02 01:46:12,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 01:46:13,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:13,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:13,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:13,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:13,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 01:46:13,747 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 01:46:13,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:14,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:14,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:14,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:14,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-10-02 01:46:14,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 01:46:14,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 01:46:14,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 01:46:14,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:46:14,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 01:46:14,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:15,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:15,031 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 01:46:15,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:15,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 01:46:15,347 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 01:46:16,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. 
Duration: 0.68 2023-10-02 01:46:16,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 01:46:16,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 01:46:16,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:16,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:16,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:16,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:16,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 01:46:16,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 01:46:16,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:16,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:46:16,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:17,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:46:17,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:17,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 01:46:17,212 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 01:46:17,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:18,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:18,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 01:46:18,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:18,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:18,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:46:18,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 01:46:18,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:46:18,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 01:46:18,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:46:19,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:46:19,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 01:46:19,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 01:46:19,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 01:46:19,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:19,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 01:46:19,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:46:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:20,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 01:46:21,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:21,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:46:21,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:46:21,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:21,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:22,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:22,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:22,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:22,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:22,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 01:46:22,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:22,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:22,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 01:46:22,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:46:23,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:23,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:23,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:23,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:23,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:23,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:46:23,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 01:46:23,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 01:46:23,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:46:24,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:24,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. 
Duration: 0.66 2023-10-02 01:46:25,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:46:25,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:25,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 01:46:25,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:25,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:25,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:25,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:25,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:26,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:46:26,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 01:46:26,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:46:26,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:46:26,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:26,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:26,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:26,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:46:27,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 01:46:27,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:27,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:27,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 01:46:27,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 01:46:27,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:28,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:28,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 01:46:28,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:46:28,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:28,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:28,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:30,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:30,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:30,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:30,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:30,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:30,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:31,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:46:31,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 01:46:31,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:31,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 01:46:31,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:31,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:32,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:32,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:32,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:32,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 01:46:32,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:46:32,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:32,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:32,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 01:46:32,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:33,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:46:33,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:33,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 01:46:33,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 01:46:33,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 01:46:34,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 01:46:34,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 01:46:34,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:34,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:34,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:34,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:46:35,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:35,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:35,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:35,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:35,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:35,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:35,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:36,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:36,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 01:46:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 01:46:36,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:46:36,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-10-02 01:46:36,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 01:46:36,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:37,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 01:46:37,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:37,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:38,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:38,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:46:38,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 01:46:38,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:38,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:38,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:38,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 01:46:39,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 01:46:39,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 01:46:39,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:39,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 01:46:39,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 01:46:39,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:39,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:39,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:39,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:39,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:46:39,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-10-02 01:46:39,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:39,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:46:40,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 01:46:40,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:40,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 01:46:40,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:46:40,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:40,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:40,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:40,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:46:40,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:40,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 01:46:41,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:41,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:41,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:46:41,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:46:41,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:41,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 01:46:41,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:41,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:42,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:42,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:43,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 01:46:43,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:46:43,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:44,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:44,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:44,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 01:46:44,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:44,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:44,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:44,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:45,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:46:45,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 01:46:45,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:46:45,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:46:45,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:46,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:46,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 01:46:46,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:46,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 01:46:46,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:46,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:46,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:46,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:46,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 01:46:47,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. 
Duration: 0.81 2023-10-02 01:46:47,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 01:46:47,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:47,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:47,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:47,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:48,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:48,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:48,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:48,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:48,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:48,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:46:49,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 01:46:49,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:49,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 01:46:49,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:49,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 01:46:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:49,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 01:46:49,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 01:46:49,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:49,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:46:50,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:46:50,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:46:50,243 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 01:46:50,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:50,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 01:46:50,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:50,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:50,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 01:46:51,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:51,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:46:52,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:52,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:52,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:46:52,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:53,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 01:46:53,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 01:46:53,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:46:53,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 01:46:53,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:53,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:46:53,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:53,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 01:46:53,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:54,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:54,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 01:46:54,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:46:54,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:54,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:54,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:54,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:54,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:54,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:54,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 01:46:54,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:46:55,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 01:46:55,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:56,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:46:56,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:56,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:56,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:46:57,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 01:46:57,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 01:46:57,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:57,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:57,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:57,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:58,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:46:58,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:58,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:46:58,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 01:46:58,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:46:59,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:59,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 01:46:59,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:59,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:59,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 01:47:00,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:00,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:00,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:01,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:01,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. 
Duration: 0.67 2023-10-02 01:47:01,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 01:47:01,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:01,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:01,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:01,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 01:47:01,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:01,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 01:47:02,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:02,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:02,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:02,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:02,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-10-02 01:47:02,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:02,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 01:47:03,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:03,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:03,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:03,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 01:47:03,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:04,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 01:47:04,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:04,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:04,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:04,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:47:04,487 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 01:47:04,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 01:47:05,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 01:47:05,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:05,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:06,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:47:06,103 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 01:47:06,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:06,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:47:06,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:47:06,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 01:47:06,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. 
Duration: 0.87 2023-10-02 01:47:06,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:06,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:06,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:06,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:47:06,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 01:47:07,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 01:47:07,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:07,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:07,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 01:47:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:07,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:07,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 01:47:07,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:08,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:47:08,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:08,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 01:47:08,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 01:47:08,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 01:47:08,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:08,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:08,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 01:47:09,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:09,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:10,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 01:47:10,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:10,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:10,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 01:47:10,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:10,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:47:10,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:47:11,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:11,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:11,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:47:11,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:11,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:47:11,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:47:11,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:11,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:11,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:12,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:12,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:12,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:47:12,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:12,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 01:47:12,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 01:47:12,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:12,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:12,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 01:47:12,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:12,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:47:12,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:12,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:12,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:13,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:13,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:13,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 01:47:13,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:13,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:47:13,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:47:14,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:14,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:14,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:14,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:15,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:47:15,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:47:15,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:15,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:15,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:15,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 01:47:15,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 01:47:15,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:47:15,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:15,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:15,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:16,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:16,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 01:47:16,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:17,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:17,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:47:17,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:17,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:17,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:47:17,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 01:47:17,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 01:47:18,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:18,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:47:18,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:47:18,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 01:47:18,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:18,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:19,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 01:47:19,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 01:47:19,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 01:47:20,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:20,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 01:47:20,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:20,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:47:20,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:47:20,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 01:47:20,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:20,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 01:47:20,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:20,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:20,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 01:47:21,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:47:22,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:47:22,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:22,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 01:47:22,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:22,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:22,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:22,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:23,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 01:47:24,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 01:47:24,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 01:47:24,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 01:47:24,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 01:47:24,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:24,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:24,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:24,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:47:24,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 01:47:25,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 01:47:25,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:25,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:47:25,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:25,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:25,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:25,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:47:25,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 01:47:25,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:25,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:25,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 01:47:25,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 01:47:26,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 01:47:26,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:47:26,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:26,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:26,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:26,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:47:26,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 01:47:26,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 01:47:26,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:26,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:47:27,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:27,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 01:47:27,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:28,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:47:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 01:47:28,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:28,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:28,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 01:47:28,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-02 01:47:28,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:29,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:29,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:29,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:29,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:29,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:47:29,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:47:29,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:29,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:47:29,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 01:47:30,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 01:47:30,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 01:47:30,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 01:47:30,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:30,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:30,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:47:30,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:30,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:30,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:30,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:30,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:30,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:30,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:31,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 01:47:31,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:31,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:31,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:32,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:32,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:32,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 01:47:32,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 01:47:32,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:33,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:47:33,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:33,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:33,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:47:33,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:47:33,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:33,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:47:33,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:33,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:33,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:33,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 01:47:34,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:34,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:47:34,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:34,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:34,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 01:47:34,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:34,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:34,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:47:34,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:35,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:47:35,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:35,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 01:47:35,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:35,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 01:47:35,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:47:36,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:36,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 01:47:37,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 01:47:37,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 01:47:37,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:37,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 01:47:37,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:37,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:47:37,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 01:47:37,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:37,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:47:37,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 01:47:37,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:37,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:47:37,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 01:47:38,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 01:47:38,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:38,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 01:47:38,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:38,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:38,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:38,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 01:47:38,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 01:47:38,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 01:47:38,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:38,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:47:38,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 01:47:38,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:38,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:38,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:38,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:47:39,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 01:47:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:39,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:47:39,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 01:47:39,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:47:39,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:39,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:47:40,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 01:47:40,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:40,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:40,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:40,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 01:47:40,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:41,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:41,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:41,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 01:47:42,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:42,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:42,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:47:42,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:43,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:43,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:47:43,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:43,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:43,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:43,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 01:47:43,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:43,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:43,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:43,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 01:47:43,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:47:44,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:44,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:47:44,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:47:45,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 01:47:45,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:47:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:45,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 01:47:45,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:47:46,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:46,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:46,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 01:47:46,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 01:47:46,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 01:47:46,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:46,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:46,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:46,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 01:47:46,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:47,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 01:47:47,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:47,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:47:47,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:47,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 01:47:47,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:47,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:47,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:47,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:47,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:47,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 01:47:48,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:47:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:47:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:48,437 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 01:47:48,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:47:48,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:47:48,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:48,662 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 01:47:48,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:48,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 01:47:48,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:49,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:49,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:49,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 01:47:49,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 01:47:49,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:49,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:49,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 01:47:49,859 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 01:47:50,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:50,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 01:47:50,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 01:47:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:51,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:51,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:51,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 01:47:51,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 01:47:51,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:51,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:52,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:47:52,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 01:47:52,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:52,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:52,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:47:52,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:52,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:52,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 01:47:53,066 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 01:47:53,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:53,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 01:47:53,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 01:47:53,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:47:54,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:54,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 01:47:54,278 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 01:47:54,293 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 01:47:55,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 01:47:55,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:55,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 01:47:55,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 01:47:55,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:47:55,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:56,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 01:47:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 01:47:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 01:47:56,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:56,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:56,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:56,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:56,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:47:56,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:56,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 01:47:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 01:47:56,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 01:47:56,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:57,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. 
Duration: 0.97 2023-10-02 01:47:57,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:57,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:47:57,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:57,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:57,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:47:57,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 01:47:57,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:57,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:58,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:58,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:58,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:58,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 01:47:58,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:58,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 01:47:58,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:47:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:47:58,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:58,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 01:47:59,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:47:59,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:48:00,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 01:48:00,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:00,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:00,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:48:01,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:01,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:01,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 01:48:01,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:01,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:48:01,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:48:01,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:02,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:02,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 01:48:02,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:02,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:48:02,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:48:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:48:02,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:48:02,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:48:02,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:48:02,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:48:03,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:48:04,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:04,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 01:48:04,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:04,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:04,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:48:04,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:48:04,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:04,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 01:48:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:48:05,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:05,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 01:48:05,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:48:05,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:48:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 01:48:05,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 01:48:05,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 01:48:06,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:48:06,006 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 01:48:06,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:06,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:06,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:48:06,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 01:48:06,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:06,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:06,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 01:48:06,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 01:48:06,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 01:48:07,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 01:48:07,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 01:48:08,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:48:08,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:08,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 01:48:08,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:08,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:48:08,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:08,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:48:08,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:48:08,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:48:09,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:09,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:09,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:48:09,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:09,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 01:48:09,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:48:09,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:09,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:09,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:48:09,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:48:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:10,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:10,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:10,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:48:11,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:48:11,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:48:11,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:48:11,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:48:11,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 01:48:11,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:11,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:14,341 INFO [scaling.py:1022] (1/4) Whitening: name=None, num_groups=1, num_channels=256, metric=5.97 vs. limit=7.5 2023-10-02 01:48:14,638 INFO [scaling.py:1022] (1/4) Whitening: name=None, num_groups=1, num_channels=512, metric=2.93 vs. limit=7.5 2023-10-02 01:48:15,930 INFO [train.py:1386] (1/4) Maximum memory allocated so far is 20006MB 2023-10-02 01:48:18,277 INFO [train.py:1386] (1/4) Maximum memory allocated so far is 20147MB 2023-10-02 01:48:21,313 INFO [train.py:1386] (1/4) Maximum memory allocated so far is 20147MB 2023-10-02 01:48:23,307 INFO [train.py:1386] (1/4) Maximum memory allocated so far is 20147MB 2023-10-02 01:48:29,317 INFO [train.py:1386] (1/4) Maximum memory allocated so far is 20147MB 2023-10-02 01:48:31,442 INFO [scaling.py:1022] (1/4) Whitening: name=None, num_groups=1, num_channels=384, metric=5.84 vs. 
limit=7.5 2023-10-02 01:48:32,205 INFO [train.py:1386] (1/4) Maximum memory allocated so far is 20147MB 2023-10-02 01:48:32,220 INFO [train.py:1267] (1/4) Loading grad scaler state dict 2023-10-02 01:48:49,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:48:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:48:49,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:48:49,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:49,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:49,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:49,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:49,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:49,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:49,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 01:48:50,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:48:50,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:48:50,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:48:50,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:48:50,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:48:50,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:48:50,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:48:50,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:50,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:50,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:51,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:51,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:48:51,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:48:51,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:51,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:51,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:51,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:51,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:48:51,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:51,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:48:52,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:48:52,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:52,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:52,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. 
Duration: 0.86 2023-10-02 01:48:52,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:48:52,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:48:52,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:52,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:48:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:48:53,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:48:54,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:48:54,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:48:54,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:48:54,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:48:54,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:48:54,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:48:54,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:48:54,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:48:54,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:48:54,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:48:57,873 INFO [train.py:1046] (1/4) Epoch 21, batch 0, loss[loss=0.1684, simple_loss=0.2486, pruned_loss=0.04415, over 24630.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2486, pruned_loss=0.04415, over 24630.00 frames. ], batch size: 65, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:48:57,873 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 01:49:10,086 INFO [train.py:1078] (1/4) Epoch 21, validation: loss=0.2779, simple_loss=0.2712, pruned_loss=0.1423, over 1125622.00 frames. 2023-10-02 01:49:10,087 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20147MB 2023-10-02 01:49:13,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 01:49:13,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:49:16,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:49:19,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:49:20,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:49:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:21,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 01:49:23,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 01:49:25,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:25,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:28,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:28,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:28,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:49:29,183 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.76 vs. 
limit=15.0 2023-10-02 01:49:29,770 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.013e+02 2.311e+02 4.182e+02, threshold=4.026e+02, percent-clipped=1.0 2023-10-02 01:49:29,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:49:31,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 01:49:34,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:49:37,286 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:49:40,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=708413.3333333334, ans=0.0 2023-10-02 01:49:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:49:41,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:43,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 01:49:47,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:49:47,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:49:50,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:49:54,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:49:57,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:01,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 01:50:02,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=708480.0, ans=0.5 2023-10-02 01:50:04,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 01:50:04,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:50:04,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:06,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:50:06,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:50:07,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 01:50:10,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:10,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 01:50:11,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.88 vs. limit=22.5 2023-10-02 01:50:15,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:50:18,022 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 01:50:19,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:50:22,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:50:24,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:50:26,156 INFO [train.py:1046] (1/4) Epoch 21, batch 50, loss[loss=0.1846, simple_loss=0.2671, pruned_loss=0.05106, over 23999.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2511, pruned_loss=0.05218, over 1064299.06 frames. ], batch size: 86, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:50:26,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 01:50:26,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:50:26,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:50:28,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:50:29,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:50:30,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=708613.3333333334, ans=0.1 2023-10-02 01:50:30,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=708613.3333333334, ans=0.125 2023-10-02 01:50:31,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:50:36,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 01:50:36,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:36,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=708613.3333333334, ans=0.0 2023-10-02 01:50:42,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:50:43,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 01:50:45,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 01:50:47,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:50:48,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 01:50:48,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:50,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:50:50,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:50:51,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:50:51,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:51:02,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:51:03,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:03,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:51:05,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 01:51:06,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:51:08,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:51:08,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. 
Duration: 0.98 2023-10-02 01:51:09,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:51:09,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=708813.3333333334, ans=0.0 2023-10-02 01:51:10,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 01:51:17,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:51:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:51:18,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:18,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:51:18,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:51:23,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 01:51:23,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 01:51:25,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:25,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 01:51:27,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:51:27,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=708880.0, ans=0.0 2023-10-02 01:51:28,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:51:28,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 01:51:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 01:51:30,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:51:32,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:51:34,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:51:35,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 01:51:35,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 01:51:36,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:51:38,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 01:51:39,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:51:39,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:51:40,806 INFO [train.py:1046] (1/4) Epoch 21, batch 100, loss[loss=0.1764, simple_loss=0.2581, pruned_loss=0.04735, over 24458.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2518, pruned_loss=0.05025, over 1878600.90 frames. ], batch size: 63, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:51:42,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:51:43,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:51:48,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:51:49,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 01:51:49,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:54,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:51:54,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:51:54,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 01:51:54,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:51:54,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:51:57,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 01:51:58,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:52:00,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.895e+02 2.091e+02 2.365e+02 3.412e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 01:52:00,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:00,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:52:02,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 01:52:04,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:04,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:04,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 01:52:08,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:52:11,437 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 01:52:11,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 01:52:12,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:12,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:52:17,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:52:19,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:20,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:25,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:26,661 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 01:52:26,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=709146.6666666666, ans=0.1 2023-10-02 01:52:27,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 01:52:30,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:52:32,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:52:32,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=709146.6666666666, ans=0.125 2023-10-02 01:52:33,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:36,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:38,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:52:39,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=709146.6666666666, ans=0.125 2023-10-02 01:52:40,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:52:41,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:43,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:44,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:52:44,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:52:44,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:44,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 01:52:44,590 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 01:52:44,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:45,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:52:46,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:46,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:46,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:52:46,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:52:48,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:52:48,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:52:49,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:50,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:52,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:52:52,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:52:54,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:54,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=709213.3333333334, ans=0.0 2023-10-02 01:52:56,952 INFO [train.py:1046] (1/4) Epoch 21, batch 150, loss[loss=0.1525, simple_loss=0.2306, pruned_loss=0.03718, over 23129.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2537, pruned_loss=0.05101, over 2506552.48 frames. ], batch size: 51, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 01:52:57,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:52:57,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:52:57,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 01:52:59,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:01,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:53:02,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:07,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 01:53:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 01:53:07,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 01:53:12,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:53:12,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:53:12,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:53:13,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:53:13,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:53:14,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:53:15,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:18,143 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 01:53:19,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:53:26,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:53:29,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:53:30,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 01:53:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:53:33,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:53:33,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:53:36,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:53:36,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:53:38,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 01:53:41,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:41,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 01:53:46,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:46,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:53:48,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:53:48,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:53:49,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:51,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:53:53,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:53:54,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:53:55,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:53:57,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 01:53:57,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 01:53:59,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:53:59,152 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 01:54:03,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:54:06,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:54:06,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:54:09,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 01:54:09,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:54:11,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:12,624 INFO [train.py:1046] (1/4) Epoch 21, batch 200, loss[loss=0.1737, simple_loss=0.2579, pruned_loss=0.0447, over 24573.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2535, pruned_loss=0.04993, over 3009588.13 frames. ], batch size: 71, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:54:12,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. 
Duration: 0.95 2023-10-02 01:54:12,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:54:14,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:15,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:54:16,166 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=15.0 2023-10-02 01:54:20,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:54:21,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:54:21,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:30,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=709680.0, ans=0.1 2023-10-02 01:54:32,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.906e+02 2.104e+02 2.322e+02 3.848e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 01:54:40,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:54:40,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 01:54:43,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:54:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:54:45,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:54:45,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:54:46,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:54:47,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:54:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:54:47,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:54:49,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 01:54:49,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:54:50,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:54:50,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=709746.6666666666, ans=0.0 2023-10-02 01:54:53,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:55:01,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:55:01,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=709813.3333333334, ans=0.125 2023-10-02 01:55:09,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:09,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:55:18,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:20,361 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.35 vs. limit=15.0 2023-10-02 01:55:20,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 01:55:20,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:55:22,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 01:55:22,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:55:23,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:55:26,254 INFO [train.py:1046] (1/4) Epoch 21, batch 250, loss[loss=0.1773, simple_loss=0.2577, pruned_loss=0.04842, over 24450.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2527, pruned_loss=0.04865, over 3395549.31 frames. ], batch size: 63, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:55:26,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 01:55:26,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:55:26,406 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 01:55:29,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:29,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:55:29,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=709946.6666666666, ans=0.125 2023-10-02 01:55:29,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=709946.6666666666, ans=0.125 2023-10-02 01:55:30,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 01:55:30,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=709946.6666666666, ans=0.125 2023-10-02 01:55:31,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:55:35,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:55:35,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:36,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:55:41,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:55:45,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=710013.3333333334, ans=0.125 2023-10-02 01:55:50,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:55:52,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:55:53,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 01:55:53,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=710013.3333333334, ans=0.1 2023-10-02 01:55:59,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:56:00,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:56:00,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:56:00,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:56:02,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:56:02,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:56:03,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:56:05,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:56:06,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 01:56:08,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:56:10,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 01:56:11,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:56:11,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:56:11,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:56:14,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:56:14,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:56:15,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:17,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:56:18,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:23,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:56:27,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:56:28,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=710213.3333333334, ans=0.125 2023-10-02 01:56:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:56:36,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:37,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:56:39,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 01:56:40,518 INFO [train.py:1046] (1/4) Epoch 21, batch 300, loss[loss=0.1734, simple_loss=0.2555, pruned_loss=0.0457, over 24340.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2519, pruned_loss=0.0491, over 3679561.75 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:56:40,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:56:40,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:56:43,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 01:56:43,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:56:44,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 01:56:44,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 01:56:49,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:51,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:56:54,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:56:56,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 01:56:56,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:57,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:56:57,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 01:56:57,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:02,047 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.853e+02 2.062e+02 2.397e+02 3.479e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 01:57:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:57:04,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 01:57:04,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 01:57:09,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 01:57:09,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:12,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:14,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:14,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 01:57:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:57:17,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:57:17,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=710413.3333333334, ans=0.2 2023-10-02 01:57:20,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:57:20,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:57:25,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 01:57:25,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 01:57:26,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:57:28,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:29,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 01:57:29,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:57:29,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=710480.0, ans=0.125 2023-10-02 01:57:31,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710480.0, ans=0.1 2023-10-02 01:57:32,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:57:35,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:57:35,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 01:57:35,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=710480.0, ans=0.125 2023-10-02 01:57:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 01:57:39,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:57:42,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:43,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=710546.6666666666, ans=0.0 2023-10-02 01:57:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:57:44,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710546.6666666666, ans=0.1 2023-10-02 01:57:45,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 01:57:45,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:57:46,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:57:48,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 01:57:48,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:48,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:57:49,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 01:57:51,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:57:51,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:57:52,016 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-10-02 01:57:55,493 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:57:56,577 INFO [train.py:1046] (1/4) Epoch 21, batch 350, loss[loss=0.1696, simple_loss=0.237, pruned_loss=0.05107, over 23845.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2487, pruned_loss=0.04871, over 3901723.33 frames. ], batch size: 195, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:57:56,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:57:56,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:58:00,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:00,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=710613.3333333334, ans=0.0 2023-10-02 01:58:05,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:58:08,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 01:58:08,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:11,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 01:58:11,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:58:11,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=710680.0, ans=0.125 2023-10-02 01:58:13,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 01:58:15,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=710680.0, ans=0.125 2023-10-02 01:58:16,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:16,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 01:58:16,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:58:20,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 01:58:22,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:58:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 01:58:24,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:58:26,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:26,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:26,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:58:27,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:58:29,850 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.98 vs. limit=15.0 2023-10-02 01:58:30,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:58:30,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:35,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:58:35,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 01:58:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:58:38,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:42,449 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.78 vs. limit=15.0 2023-10-02 01:58:42,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 01:58:42,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:47,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:47,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:58:47,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:58:49,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 01:58:53,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:58:55,153 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 01:58:56,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. 
Duration: 0.81 2023-10-02 01:58:56,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:00,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:59:00,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 01:59:02,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:03,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=710880.0, ans=0.0 2023-10-02 01:59:04,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:59:04,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=710880.0, ans=0.0 2023-10-02 01:59:07,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:07,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:59:07,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=710880.0, ans=0.0 2023-10-02 01:59:10,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 01:59:11,356 INFO [train.py:1046] (1/4) Epoch 21, batch 400, loss[loss=0.1827, simple_loss=0.2554, pruned_loss=0.055, over 23676.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2495, pruned_loss=0.04897, over 4087986.81 frames. ], batch size: 232, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 01:59:12,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:59:13,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=710946.6666666666, ans=0.125 2023-10-02 01:59:14,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:59:14,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 01:59:15,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:15,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:19,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:59:19,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:21,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:23,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:59:24,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 01:59:24,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=711013.3333333334, ans=0.09899494936611666 2023-10-02 01:59:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 01:59:26,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:26,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 01:59:26,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:28,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=711013.3333333334, ans=0.1 2023-10-02 01:59:31,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:59:31,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:59:31,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 01:59:31,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:59:31,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 01:59:32,632 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.816e+02 1.987e+02 2.321e+02 3.446e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 01:59:32,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:59:32,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:34,199 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 01:59:34,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 01:59:36,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.40 vs. limit=22.5 2023-10-02 01:59:38,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:39,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:41,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 01:59:41,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 01:59:41,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=711080.0, ans=0.1 2023-10-02 01:59:44,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 01:59:48,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:59:54,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 01:59:55,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=711146.6666666666, ans=0.125 2023-10-02 01:59:56,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=711146.6666666666, ans=0.2 2023-10-02 01:59:57,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:59:59,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 02:00:00,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:00:03,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:00:03,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 02:00:08,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:00:09,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:00:10,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:00:13,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:13,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 02:00:17,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:00:17,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 02:00:18,008 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.28 vs. limit=15.0 2023-10-02 02:00:19,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:00:19,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:00:21,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 02:00:21,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.88 vs. limit=15.0 2023-10-02 02:00:22,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:00:24,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 02:00:24,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:00:25,795 INFO [train.py:1046] (1/4) Epoch 21, batch 450, loss[loss=0.1677, simple_loss=0.2496, pruned_loss=0.04293, over 24326.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2504, pruned_loss=0.04932, over 4233936.73 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 02:00:25,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 02:00:25,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:00:26,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:00:27,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:00:27,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 02:00:28,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:00:30,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:00:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 02:00:31,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=711280.0, ans=0.125 2023-10-02 02:00:34,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.31 vs. limit=15.0 2023-10-02 02:00:41,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:42,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:00:44,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 02:00:44,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 02:00:47,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:00:50,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:51,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:00:54,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:00:56,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:00:57,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=711413.3333333334, ans=0.125 2023-10-02 02:00:58,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 02:00:58,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 02:01:00,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 02:01:00,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:01,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:03,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:01:03,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=711413.3333333334, ans=0.125 2023-10-02 02:01:04,658 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 02:01:04,669 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 02:01:04,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:01:06,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 02:01:07,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:01:12,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:01:13,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:01:13,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:01:13,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 02:01:18,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:01:20,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:01:20,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:01:23,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 02:01:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:01:26,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=711546.6666666666, ans=0.125 2023-10-02 02:01:27,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-10-02 02:01:29,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 02:01:29,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:01:33,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=711546.6666666666, ans=0.125 2023-10-02 02:01:35,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:01:36,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:01:37,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:01:37,821 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 02:01:40,464 INFO [train.py:1046] (1/4) Epoch 21, batch 500, loss[loss=0.15, simple_loss=0.2268, pruned_loss=0.03658, over 24437.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.251, pruned_loss=0.04941, over 4338719.52 frames. ], batch size: 58, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 02:01:40,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=711613.3333333334, ans=0.0 2023-10-02 02:01:40,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=711613.3333333334, ans=0.125 2023-10-02 02:01:41,152 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.28 vs. 
limit=15.0 2023-10-02 02:01:41,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:43,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:01:43,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:43,567 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 02:01:45,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 02:01:45,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:47,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:01:50,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:01:53,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:01:56,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:01:56,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:01:58,876 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.20 vs. limit=15.0 2023-10-02 02:02:02,242 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.454e+02 2.030e+02 2.224e+02 2.686e+02 4.005e+02, threshold=4.448e+02, percent-clipped=1.0 2023-10-02 02:02:06,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:06,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:02:06,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:02:06,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:08,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 02:02:08,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:02:09,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:02:10,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:02:12,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 02:02:12,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:12,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 02:02:15,747 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 02:02:15,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=711746.6666666666, ans=0.125 2023-10-02 02:02:19,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:19,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:20,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:21,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:22,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:02:22,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=711746.6666666666, ans=0.025 2023-10-02 02:02:25,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. 
Duration: 0.62 2023-10-02 02:02:26,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=711813.3333333334, ans=0.1 2023-10-02 02:02:27,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.26 vs. limit=15.0 2023-10-02 02:02:28,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:02:29,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:32,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:35,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:39,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:41,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 02:02:41,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:41,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:44,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. 
Duration: 0.74 2023-10-02 02:02:44,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=711880.0, ans=0.125 2023-10-02 02:02:46,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:02:47,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:53,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 02:02:55,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 02:02:55,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:55,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 02:02:55,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:02:55,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:56,816 INFO [train.py:1046] (1/4) Epoch 21, batch 550, loss[loss=0.209, simple_loss=0.2752, pruned_loss=0.07143, over 22661.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2529, pruned_loss=0.05056, over 4419346.45 frames. ], batch size: 322, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 02:02:56,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:02:56,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:56,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:02:58,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:03:01,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:03:03,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 02:03:03,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:03:07,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:07,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:10,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:03:13,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:13,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=712013.3333333334, ans=0.2 2023-10-02 02:03:18,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. 
Duration: 0.95 2023-10-02 02:03:18,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 02:03:19,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:03:21,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=712013.3333333334, ans=0.0 2023-10-02 02:03:26,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:03:26,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:03:29,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:03:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:32,281 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 02:03:32,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:32,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=712080.0, ans=0.2 2023-10-02 02:03:33,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-02 02:03:34,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=712080.0, ans=0.0 2023-10-02 02:03:36,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:03:36,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:03:37,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:03:37,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:39,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 02:03:41,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 02:03:43,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:03:43,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:03:44,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:03:44,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:03:44,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=712146.6666666666, ans=0.0 2023-10-02 02:03:47,946 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.58 vs. limit=15.0 2023-10-02 02:03:48,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:03:49,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:03:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:03:52,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:53,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 02:03:55,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:03:56,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:03:58,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:04:00,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:04:01,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:04:01,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 02:04:03,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.61 vs. limit=15.0 2023-10-02 02:04:06,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 02:04:07,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 02:04:10,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:04:10,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:04:10,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:10,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=712280.0, ans=0.09899494936611666 2023-10-02 02:04:11,860 INFO [train.py:1046] (1/4) Epoch 21, batch 600, loss[loss=0.1644, simple_loss=0.2512, pruned_loss=0.03882, over 24437.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2533, pruned_loss=0.05063, over 4484516.78 frames. ], batch size: 69, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:04:17,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:04:19,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:04:20,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 02:04:22,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:04:24,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:04:26,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:28,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=712346.6666666666, ans=0.0 2023-10-02 02:04:29,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 02:04:29,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:04:35,558 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.808e+02 2.033e+02 2.469e+02 3.913e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 02:04:35,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 02:04:38,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:04:38,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:04:38,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:04:43,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:04:43,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:04:43,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:49,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:04:50,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-02 02:04:53,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:53,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:04:53,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:05:02,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 02:05:08,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 02:05:08,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:05:11,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 02:05:12,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:05:15,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 02:05:15,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:05:15,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:05:20,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 02:05:21,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:05:23,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:05:24,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:05:26,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:27,960 INFO [train.py:1046] (1/4) Epoch 21, batch 650, loss[loss=0.1643, simple_loss=0.2337, pruned_loss=0.04747, over 17215.00 frames. 
], tot_loss[loss=0.1766, simple_loss=0.2521, pruned_loss=0.05061, over 4519276.49 frames. ], batch size: 37, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:05:29,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 02:05:30,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:05:35,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:05:35,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:05:39,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:05:43,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 02:05:46,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:05:46,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:05:50,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:05:52,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-02 02:05:53,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=712680.0, ans=0.0 2023-10-02 02:05:55,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:05:55,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:56,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:05:58,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:00,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:06:01,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:06:01,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 02:06:01,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:06:01,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:06:04,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:06,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:06:07,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:07,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:06:09,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 02:06:09,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:06:09,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:06:09,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=712746.6666666666, ans=0.125 2023-10-02 02:06:10,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:06:10,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:06:11,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.80 vs. limit=10.0 2023-10-02 02:06:13,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:06:13,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. 
Duration: 0.83 2023-10-02 02:06:14,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 02:06:15,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:15,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:06:15,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:06:15,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:06:17,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:06:24,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:24,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:06:25,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-10-02 02:06:27,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:06:30,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:06:30,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:06:31,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:34,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=712880.0, ans=0.2 2023-10-02 02:06:35,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=712880.0, ans=0.125 2023-10-02 02:06:38,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:06:38,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:06:40,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:06:40,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:06:41,694 INFO [train.py:1046] (1/4) Epoch 21, batch 700, loss[loss=0.183, simple_loss=0.2659, pruned_loss=0.05003, over 24531.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2506, pruned_loss=0.05009, over 4557398.95 frames. ], batch size: 71, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:06:43,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 02:06:44,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. 
Duration: 0.97 2023-10-02 02:06:47,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 02:06:47,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:48,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:06:50,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 02:06:54,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:06:57,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:06:59,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:07:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:07:00,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=713013.3333333334, ans=0.0 2023-10-02 02:07:00,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=713013.3333333334, ans=0.125 2023-10-02 02:07:01,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:07:03,241 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.955e+02 2.291e+02 2.559e+02 6.231e+02, threshold=4.583e+02, percent-clipped=1.0 2023-10-02 02:07:04,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:07:06,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 02:07:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:07:07,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 02:07:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 02:07:12,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=713080.0, ans=0.1 2023-10-02 02:07:15,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:07:15,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:07:16,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:07:18,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=713080.0, ans=0.0 2023-10-02 02:07:19,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 02:07:19,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 02:07:23,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:07:23,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:07:25,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 02:07:27,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:07:29,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:07:30,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:07:35,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:07:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 02:07:35,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=713146.6666666666, ans=0.05 2023-10-02 02:07:39,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 02:07:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. 
Duration: 0.9 2023-10-02 02:07:42,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:07:42,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.63 vs. limit=22.5 2023-10-02 02:07:43,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:07:45,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:07:47,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:07:47,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 02:07:52,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 02:07:52,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 02:07:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 02:07:53,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 02:07:54,832 INFO [train.py:1046] (1/4) Epoch 21, batch 750, loss[loss=0.1932, simple_loss=0.2693, pruned_loss=0.05853, over 23960.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2503, pruned_loss=0.04989, over 4588052.28 frames. 
], batch size: 80, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:07:54,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 02:07:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:07:57,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 02:07:59,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:08:00,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:08:00,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:03,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:04,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:08:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:08:07,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:08:07,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 02:08:09,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:08:11,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:12,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:13,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 02:08:14,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:08:14,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:08:16,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:08:19,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:08:20,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 02:08:20,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:08:23,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 02:08:23,217 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. 
Duration: 0.768 2023-10-02 02:08:24,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 02:08:24,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:08:24,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:08:27,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:08:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:08:35,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:08:35,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:08:36,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:36,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:08:36,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 02:08:38,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:08:40,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. 
Duration: 0.488875 2023-10-02 02:08:40,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:08:44,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:08:45,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 02:08:46,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:08:51,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:08:52,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:08:52,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:54,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:08:54,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=713546.6666666666, ans=0.1 2023-10-02 02:08:58,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 02:08:58,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:08:58,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 02:09:03,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:03,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:05,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:05,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:09:07,706 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-10-02 02:09:09,595 INFO [train.py:1046] (1/4) Epoch 21, batch 800, loss[loss=0.1916, simple_loss=0.2648, pruned_loss=0.05923, over 23856.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2517, pruned_loss=0.04969, over 4632446.25 frames. ], batch size: 195, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:09:15,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:15,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:18,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:09:18,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:18,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:09:19,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:20,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:24,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:24,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:09:27,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 02:09:27,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:29,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:29,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:09:29,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:09:30,649 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.851e+02 2.052e+02 2.465e+02 3.868e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-02 02:09:30,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 02:09:30,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:09:30,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 02:09:35,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:37,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:40,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:09:42,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:42,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:47,311 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.18 vs. limit=5.0 2023-10-02 02:09:47,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:09:47,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:09:47,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 02:09:49,299 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-02 02:09:49,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 02:09:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:09:49,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:52,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:52,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:09:56,561 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 02:09:57,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 02:09:59,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:10:00,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:10:05,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:10:08,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:10:10,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. 
Duration: 0.91 2023-10-02 02:10:10,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:10:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 02:10:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:10:23,521 INFO [train.py:1046] (1/4) Epoch 21, batch 850, loss[loss=0.185, simple_loss=0.2673, pruned_loss=0.05139, over 24060.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2524, pruned_loss=0.04985, over 4657591.32 frames. ], batch size: 80, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:10:23,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:10:23,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 02:10:25,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:10:25,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:10:26,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 02:10:26,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:10:28,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:10:28,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=713946.6666666666, ans=0.125 2023-10-02 02:10:29,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:10:30,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:10:31,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=713946.6666666666, ans=0.125 2023-10-02 02:10:33,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 02:10:33,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 02:10:33,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 02:10:35,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:10:35,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:10:37,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 02:10:39,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:10:39,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:10:41,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=15.0 2023-10-02 02:10:44,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:44,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:10:44,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 02:10:48,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 02:10:51,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:52,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 02:10:55,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 02:10:57,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 02:11:00,018 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-10-02 02:11:00,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:11:00,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:11:00,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:11:02,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:04,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:04,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 02:11:06,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:11:08,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:11:08,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:11:10,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:11:10,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:11:12,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 02:11:12,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 02:11:17,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:11:17,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:11:18,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:11:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:11:18,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:11:20,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:21,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:11:23,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:11:24,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:11:26,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 02:11:31,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=714213.3333333334, ans=0.125 2023-10-02 02:11:33,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:11:34,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:11:34,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 02:11:35,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:11:35,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:11:37,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 02:11:37,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=714280.0, ans=0.125 2023-10-02 02:11:38,615 INFO [train.py:1046] (1/4) Epoch 21, batch 900, loss[loss=0.1539, simple_loss=0.2303, pruned_loss=0.0387, over 24302.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2531, pruned_loss=0.05044, over 4674791.36 frames. ], batch size: 56, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:11:42,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:11:45,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:11:47,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 02:11:51,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:11:51,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 02:11:51,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=714280.0, ans=0.0 2023-10-02 02:11:51,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=714280.0, ans=0.125 2023-10-02 02:11:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:11:54,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:11:54,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:11:54,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:11:54,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:11:56,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.26 vs. 
limit=6.0 2023-10-02 02:11:59,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=714346.6666666666, ans=0.125 2023-10-02 02:12:00,924 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.833e+02 2.065e+02 2.368e+02 4.209e+02, threshold=4.129e+02, percent-clipped=1.0 2023-10-02 02:12:02,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:02,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:12:02,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:12:04,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:12:11,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 02:12:13,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:12:20,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:12:20,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:12:21,441 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 02:12:21,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-10-02 02:12:27,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:12:27,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:12:28,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:12:28,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=714480.0, ans=0.125 2023-10-02 02:12:35,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:35,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:12:38,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 02:12:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:12:41,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 02:12:42,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=714546.6666666666, ans=0.1 2023-10-02 02:12:43,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:12:43,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:12:44,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:12:44,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=714546.6666666666, ans=0.0 2023-10-02 02:12:46,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:12:51,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 02:12:51,067 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 02:12:52,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:12:52,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 02:12:54,285 INFO [train.py:1046] (1/4) Epoch 21, batch 950, loss[loss=0.1649, simple_loss=0.255, pruned_loss=0.03745, over 24002.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2531, pruned_loss=0.05021, over 4688014.60 frames. ], batch size: 80, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:12:55,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:59,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 02:13:02,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 02:13:06,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:06,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:07,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:13:09,899 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 02:13:14,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:14,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:13:14,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:14,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:13:16,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 02:13:16,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:13:17,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:20,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-10-02 02:13:20,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:13:24,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:24,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714746.6666666666, ans=0.1 2023-10-02 02:13:26,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:13:26,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:13:27,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 02:13:28,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 02:13:30,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:13:31,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:13:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:13:35,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 02:13:36,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=714746.6666666666, ans=0.125 2023-10-02 02:13:38,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 02:13:39,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.42 vs. limit=10.0 2023-10-02 02:13:40,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 02:13:40,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:13:41,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=714813.3333333334, ans=0.125 2023-10-02 02:13:42,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:13:43,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:43,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:13:46,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 02:13:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 02:13:49,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:13:51,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:51,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 02:13:51,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:51,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:13:51,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 02:13:56,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:13:58,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:14:00,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=714880.0, ans=0.09899494936611666 2023-10-02 02:14:01,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:14:03,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 02:14:03,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-10-02 02:14:06,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:14:08,797 INFO [train.py:1046] (1/4) Epoch 21, batch 1000, loss[loss=0.1827, simple_loss=0.2651, pruned_loss=0.05015, over 24069.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2528, pruned_loss=0.05058, over 4699258.70 frames. ], batch size: 86, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:14:08,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 02:14:09,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:15,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:14:16,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 02:14:16,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 02:14:20,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=714946.6666666666, ans=0.0 2023-10-02 02:14:22,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:22,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:14:24,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:14:27,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 02:14:30,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 02:14:31,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.852e+02 2.059e+02 2.423e+02 3.876e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-02 02:14:31,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 02:14:33,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:14:34,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 02:14:37,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 02:14:37,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 02:14:37,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:38,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:41,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=715080.0, ans=0.0 2023-10-02 02:14:47,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:14:47,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:14:49,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:49,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=715080.0, ans=0.125 2023-10-02 02:14:51,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:51,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 02:14:51,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:14:52,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:14:54,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:54,342 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 02:14:57,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 02:14:58,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 02:15:01,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. 
Duration: 0.82 2023-10-02 02:15:02,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:15:08,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:08,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:15:09,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:09,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:15:10,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 02:15:12,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:15:12,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 02:15:12,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 02:15:13,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:15:13,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:15:16,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 02:15:20,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:15:21,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:15:23,788 INFO [train.py:1046] (1/4) Epoch 21, batch 1050, loss[loss=0.178, simple_loss=0.2612, pruned_loss=0.04741, over 24037.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2519, pruned_loss=0.04997, over 4719749.92 frames. ], batch size: 80, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:15:23,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:15:25,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:15:26,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:15:28,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:29,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:15:32,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:15:33,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:15:36,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:15:36,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:15:36,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:15:37,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:15:38,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.11 vs. limit=22.5 2023-10-02 02:15:39,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 02:15:39,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:15:39,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 02:15:40,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:15:40,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 02:15:42,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:15:48,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:48,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 02:15:48,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:15:53,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 02:15:53,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 02:15:54,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:15:56,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 02:15:59,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 02:16:00,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:03,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:16:04,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:16:05,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:16:05,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:16:08,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 02:16:11,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 02:16:13,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 02:16:13,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 02:16:14,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:16:15,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:16:16,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 02:16:20,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:16:23,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:16:23,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:16:24,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:16:24,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:28,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 02:16:28,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 02:16:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:16:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 02:16:31,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 02:16:31,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:16:33,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:16:36,668 INFO [train.py:1046] (1/4) Epoch 21, batch 1100, loss[loss=0.1705, simple_loss=0.2422, pruned_loss=0.04941, over 23782.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2504, pruned_loss=0.04961, over 4710019.35 frames. ], batch size: 212, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:16:39,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:16:44,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:16:47,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:16:47,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:16:48,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 02:16:50,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:16:53,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:16:54,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=12.0 2023-10-02 02:16:54,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:16:57,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:16:57,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 02:16:59,259 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.784e+02 1.995e+02 2.356e+02 3.579e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-02 02:16:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:17:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:17:00,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 02:17:01,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.06 vs. limit=10.0 2023-10-02 02:17:02,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=715680.0, ans=0.07 2023-10-02 02:17:03,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:17:04,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:17:07,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-10-02 02:17:09,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:17:13,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 02:17:13,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 02:17:14,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:17,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:17:19,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:17:22,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 02:17:22,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:17:22,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:17:22,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:17:22,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:24,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 02:17:30,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:17:30,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 02:17:33,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:17:33,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=715813.3333333334, ans=0.125 2023-10-02 02:17:35,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=715880.0, ans=0.0 2023-10-02 02:17:36,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 02:17:39,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 02:17:39,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:17:40,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:42,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:17:42,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:17:44,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 02:17:46,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:17:46,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:17:47,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 02:17:47,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:17:48,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 02:17:48,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:17:48,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:17:50,273 INFO [train.py:1046] (1/4) Epoch 21, batch 1150, loss[loss=0.1618, simple_loss=0.2409, pruned_loss=0.04139, over 24640.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2508, pruned_loss=0.04956, over 4718484.52 frames. ], batch size: 65, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:17:50,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:17:57,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:17:57,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=715946.6666666666, ans=0.07 2023-10-02 02:17:59,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:18:00,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=715946.6666666666, ans=0.125 2023-10-02 02:18:01,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:18:01,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:18:01,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 02:18:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:18:05,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 02:18:06,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:18:06,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:18:10,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 02:18:12,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:18:14,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=716013.3333333334, ans=22.5 2023-10-02 02:18:16,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:18:16,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:16,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 02:18:16,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:18:16,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:18:22,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-10-02 02:18:24,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:18:25,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:18:32,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:37,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 02:18:39,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:39,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:47,937 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 02:18:49,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:57,518 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 02:19:00,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:01,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 02:19:01,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:19:01,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:19:05,847 INFO [train.py:1046] (1/4) Epoch 21, batch 1200, loss[loss=0.1673, simple_loss=0.254, pruned_loss=0.04031, over 24298.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04984, over 4712167.62 frames. ], batch size: 74, lr: 4.93e-03, grad_scale: 32.0 2023-10-02 02:19:05,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:19:10,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:19:10,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:19:13,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:13,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:13,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:19:14,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:19:16,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 02:19:16,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=716280.0, ans=0.05 2023-10-02 02:19:19,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:19:19,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:19:19,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=716346.6666666666, ans=0.0 2023-10-02 02:19:21,709 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 02:19:23,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 02:19:27,890 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.837e+02 2.069e+02 2.341e+02 3.988e+02, threshold=4.139e+02, percent-clipped=0.0 2023-10-02 02:19:28,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:19:30,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:19:32,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:33,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:19:33,682 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. 
Duration: 0.896 2023-10-02 02:19:35,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:35,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=716413.3333333334, ans=0.2 2023-10-02 02:19:38,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=716413.3333333334, ans=0.1 2023-10-02 02:19:39,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=716413.3333333334, ans=0.125 2023-10-02 02:19:41,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:19:41,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:19:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 02:19:42,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=716413.3333333334, ans=0.2 2023-10-02 02:19:44,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:19:44,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=716413.3333333334, ans=0.125 2023-10-02 02:19:48,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. 
Duration: 0.72 2023-10-02 02:19:52,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 02:19:52,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:53,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:19:55,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:19:57,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:19:58,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:58,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:19:58,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=716480.0, ans=0.2 2023-10-02 02:20:00,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:20:00,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 02:20:02,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:20:02,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 02:20:02,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:20:05,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:20:05,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:20:08,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:20:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:20:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 02:20:16,251 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 02:20:18,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:20:20,217 INFO [train.py:1046] (1/4) Epoch 21, batch 1250, loss[loss=0.2308, simple_loss=0.2871, pruned_loss=0.08723, over 19265.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2522, pruned_loss=0.0501, over 4716234.48 frames. ], batch size: 388, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:20:20,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:20:22,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 02:20:24,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:20:28,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 02:20:30,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:20:32,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:20:32,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 02:20:32,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=716613.3333333334, ans=0.0 2023-10-02 02:20:35,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:20:37,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:20:40,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:20:41,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:20:41,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:20:41,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:20:44,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:20:49,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 02:20:49,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:20:49,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:20:50,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:20:51,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:20:54,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:20:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:20:59,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=716746.6666666666, ans=0.0 2023-10-02 02:21:00,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 02:21:02,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:21:05,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:21:06,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 02:21:07,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:21:07,391 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 02:21:07,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:07,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:07,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=716813.3333333334, ans=0.0 2023-10-02 02:21:11,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:21:13,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:21:13,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:21:14,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 02:21:14,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. 
Duration: 0.77 2023-10-02 02:21:14,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=716813.3333333334, ans=0.0 2023-10-02 02:21:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 02:21:18,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:21:20,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 02:21:20,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:23,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 02:21:23,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:21:25,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 02:21:25,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:21:26,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:21:26,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:21:26,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:21:29,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 02:21:31,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:21:33,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:21:33,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:21:36,633 INFO [train.py:1046] (1/4) Epoch 21, batch 1300, loss[loss=0.1613, simple_loss=0.2419, pruned_loss=0.04038, over 24552.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2527, pruned_loss=0.05053, over 4708850.81 frames. ], batch size: 60, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:21:36,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:21:39,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:21:40,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 02:21:44,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:21:46,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:21:47,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:21:48,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:48,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:21:49,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=717013.3333333334, ans=0.125 2023-10-02 02:21:50,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 02:21:55,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:21:55,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=717013.3333333334, ans=0.125 2023-10-02 02:21:56,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:21:56,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 02:21:59,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 1.796e+02 1.993e+02 2.237e+02 3.308e+02, threshold=3.985e+02, percent-clipped=0.0 2023-10-02 02:22:01,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:22:05,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:22:05,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:22:07,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:22:08,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:10,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:22:10,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=717080.0, ans=0.0 2023-10-02 02:22:11,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:22:11,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 02:22:15,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:22:15,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:22:18,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 02:22:18,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:22:20,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 02:22:21,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:22:23,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 02:22:23,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:22:24,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 02:22:25,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:22:28,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:22:28,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:22:34,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 02:22:34,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 02:22:37,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 02:22:40,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:22:43,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. 
Duration: 0.72 2023-10-02 02:22:44,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:50,632 INFO [train.py:1046] (1/4) Epoch 21, batch 1350, loss[loss=0.1717, simple_loss=0.2384, pruned_loss=0.05246, over 23729.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.252, pruned_loss=0.05011, over 4709446.26 frames. ], batch size: 164, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:22:50,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 02:22:53,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:22:55,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:22:56,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:58,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:22:59,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:23:01,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:23:04,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:23:06,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. 
Duration: 0.79 2023-10-02 02:23:08,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:23:09,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:23:12,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 02:23:13,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:23:13,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:23:13,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 02:23:15,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 02:23:17,195 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.93 vs. limit=15.0 2023-10-02 02:23:18,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 02:23:20,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:20,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-02 02:23:28,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=717413.3333333334, ans=0.1 2023-10-02 02:23:32,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:41,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:42,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:23:42,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 02:23:42,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=717480.0, ans=0.0 2023-10-02 02:23:45,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:23:48,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 02:23:48,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:23:48,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:23:51,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:23:51,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=717546.6666666666, ans=0.0 2023-10-02 02:23:52,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 02:23:53,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:23:57,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=717546.6666666666, ans=0.0 2023-10-02 02:23:59,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 02:24:00,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 02:24:04,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=717546.6666666666, ans=0.0 2023-10-02 02:24:06,662 INFO [train.py:1046] (1/4) Epoch 21, batch 1400, loss[loss=0.1756, simple_loss=0.2653, pruned_loss=0.043, over 24537.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2507, pruned_loss=0.04959, over 4710125.29 frames. ], batch size: 71, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:24:06,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 02:24:08,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:24:11,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:24:12,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:24:15,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 02:24:16,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=717613.3333333334, ans=0.125 2023-10-02 02:24:17,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 02:24:26,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:24:30,121 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.074e+02 2.449e+02 3.328e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 02:24:30,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:24:32,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:24:32,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:24:37,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:24:38,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-02 02:24:42,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=717746.6666666666, ans=0.125 2023-10-02 02:24:43,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=717746.6666666666, ans=0.0 2023-10-02 02:24:48,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:24:48,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:24:54,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 02:24:54,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:24:54,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:24:55,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:24:55,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:24:57,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:24:57,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 02:24:57,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:24:59,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 02:24:59,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:25:05,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:08,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:25:14,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 02:25:15,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:25:16,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:25:18,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 02:25:19,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:21,076 INFO [train.py:1046] (1/4) Epoch 21, batch 1450, loss[loss=0.1716, simple_loss=0.2557, pruned_loss=0.04368, over 24619.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.25, pruned_loss=0.04972, over 4707596.70 frames. 
], batch size: 68, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:25:21,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:25:23,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:25:26,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:25:26,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:26,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 02:25:29,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=717946.6666666666, ans=0.0 2023-10-02 02:25:31,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=717946.6666666666, ans=0.125 2023-10-02 02:25:32,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:34,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:25:36,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:25:36,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-10-02 02:25:37,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:25:39,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 02:25:40,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:40,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:40,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 02:25:42,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:25:43,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:25:43,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 02:25:43,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:45,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:25:46,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:49,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 02:25:49,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=718080.0, ans=0.125 2023-10-02 02:25:53,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:25:53,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:25:55,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:55,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:56,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:56,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:25:56,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:58,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:02,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 02:26:03,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:26:09,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. 
Duration: 0.896 2023-10-02 02:26:10,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:26:11,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:26:13,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:13,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 02:26:18,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:18,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 02:26:20,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 02:26:20,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:22,760 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.00 vs. limit=15.0 2023-10-02 02:26:24,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:26:24,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:26:26,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. 
Duration: 0.9 2023-10-02 02:26:29,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 02:26:29,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 02:26:30,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:30,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=718213.3333333334, ans=0.125 2023-10-02 02:26:31,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:26:34,539 INFO [train.py:1046] (1/4) Epoch 21, batch 1500, loss[loss=0.1686, simple_loss=0.2491, pruned_loss=0.04404, over 24509.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2507, pruned_loss=0.04991, over 4717084.07 frames. ], batch size: 66, lr: 4.92e-03, grad_scale: 8.0 2023-10-02 02:26:40,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=718280.0, ans=0.125 2023-10-02 02:26:40,795 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.15 vs. limit=15.0 2023-10-02 02:26:42,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 02:26:42,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:26:42,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 02:26:44,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:44,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:26:46,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:26:46,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 02:26:47,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:26:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:26:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:26:47,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=718280.0, ans=0.0 2023-10-02 02:26:49,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:26:50,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:26:51,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:26:51,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=718346.6666666666, ans=0.125 2023-10-02 02:26:57,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:26:57,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 02:26:58,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:27:00,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.376e+02 1.883e+02 2.099e+02 2.535e+02 4.584e+02, threshold=4.198e+02, percent-clipped=1.0 2023-10-02 02:27:00,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:27:01,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:27:04,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 02:27:04,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=718413.3333333334, ans=0.2 2023-10-02 02:27:07,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 02:27:09,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:27:09,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-10-02 02:27:12,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:27:13,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:27:14,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:27:15,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:27:15,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 02:27:17,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:27:17,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:27:18,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 02:27:18,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:27:22,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:27:22,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 02:27:27,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 02:27:29,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:27:31,582 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.91 vs. limit=15.0 2023-10-02 02:27:32,168 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 02:27:32,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:32,211 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 02:27:34,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:27:35,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:27:36,831 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 02:27:37,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718546.6666666666, ans=0.1 2023-10-02 02:27:38,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:27:40,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 02:27:42,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:27:44,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:27:46,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:47,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:27:48,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:48,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:27:48,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 02:27:48,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.11 vs. limit=22.5 2023-10-02 02:27:49,583 INFO [train.py:1046] (1/4) Epoch 21, batch 1550, loss[loss=0.1683, simple_loss=0.2508, pruned_loss=0.04289, over 24484.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.251, pruned_loss=0.04986, over 4712402.31 frames. ], batch size: 63, lr: 4.92e-03, grad_scale: 8.0 2023-10-02 02:27:49,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 02:27:49,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:27:50,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. 
Duration: 0.84 2023-10-02 02:27:52,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 02:27:53,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:27:55,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:27:56,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:27:56,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:27:57,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=718613.3333333334, ans=0.125 2023-10-02 02:27:58,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:27:59,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:28:00,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=12.0 2023-10-02 02:28:01,607 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 02:28:01,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:28:01,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=718613.3333333334, ans=0.0 2023-10-02 02:28:02,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:28:04,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:28:06,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:28:06,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 02:28:07,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:28:09,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 02:28:09,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 02:28:11,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 02:28:11,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:13,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:16,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:28:16,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=718680.0, ans=0.0 2023-10-02 02:28:19,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 02:28:19,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 02:28:28,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:31,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:28:31,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:28:31,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:28:32,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 02:28:36,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:28:38,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:39,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=718813.3333333334, ans=0.125 2023-10-02 02:28:42,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:28:46,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:28:46,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:47,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 02:28:47,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:28:47,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:28:49,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:49,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 02:28:49,229 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 02:28:49,496 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:28:53,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:28:58,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 02:29:02,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:29:03,927 INFO [train.py:1046] (1/4) Epoch 21, batch 1600, loss[loss=0.2053, simple_loss=0.2655, pruned_loss=0.07253, over 22730.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2516, pruned_loss=0.05018, over 4712462.40 frames. ], batch size: 322, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:29:03,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:04,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 02:29:05,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:29:06,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:06,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:29:07,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:29:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:29:09,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:11,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 02:29:12,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. 
Duration: 0.92 2023-10-02 02:29:12,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=718946.6666666666, ans=0.035 2023-10-02 02:29:14,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=718946.6666666666, ans=0.05 2023-10-02 02:29:15,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 02:29:19,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:29:20,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 02:29:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:29:24,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:29:27,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:29:28,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 02:29:29,736 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.788e+02 1.991e+02 2.210e+02 3.444e+02, threshold=3.981e+02, percent-clipped=0.0 2023-10-02 02:29:30,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.24 vs. 
limit=15.0 2023-10-02 02:29:31,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:29:32,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 02:29:32,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:32,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 02:29:32,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=719080.0, ans=0.2 2023-10-02 02:29:36,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 02:29:44,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 02:29:46,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:48,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:29:48,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:29:49,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-02 02:29:53,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 02:29:54,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:54,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:55,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:57,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:29:57,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:29:58,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:30:00,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:30:05,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:30:05,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:30:08,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 02:30:08,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 02:30:10,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 02:30:14,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:30:17,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:30:17,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:30:17,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 02:30:17,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 02:30:17,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 02:30:17,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 02:30:19,685 INFO [train.py:1046] (1/4) Epoch 21, batch 1650, loss[loss=0.1702, simple_loss=0.2556, pruned_loss=0.04239, over 24664.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2522, pruned_loss=0.05053, over 4714964.78 frames. ], batch size: 65, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:30:22,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:30:22,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 02:30:22,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:30:23,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:30:26,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:30:28,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 02:30:29,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.46 vs. limit=15.0 2023-10-02 02:30:32,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:30:32,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:30:32,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:30:32,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:30:33,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 02:30:33,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 02:30:39,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 02:30:42,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:30:51,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 02:30:52,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:30:54,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 02:30:57,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:30:59,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:31:00,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:31:02,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:03,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:31:03,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:03,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:05,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:31:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:31:06,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:31:07,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:31:07,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:31:10,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:31:12,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 02:31:15,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:31:15,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 02:31:17,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 02:31:17,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 02:31:17,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:31:18,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 02:31:18,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:31:19,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:19,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 02:31:23,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:31:26,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:31:26,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:28,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 02:31:33,176 INFO [train.py:1046] (1/4) Epoch 21, batch 1700, loss[loss=0.1751, simple_loss=0.2465, pruned_loss=0.05192, over 18577.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2516, pruned_loss=0.05036, over 4712364.92 frames. ], batch size: 40, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:31:33,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:31:33,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. 
Duration: 0.96 2023-10-02 02:31:33,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:31:33,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:31:33,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:36,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:31:36,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:31:37,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 02:31:39,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:31:48,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:51,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:31:56,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:31:56,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:31:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 02:31:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:31:59,514 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.914e+02 2.162e+02 2.469e+02 4.106e+02, threshold=4.325e+02, percent-clipped=1.0 2023-10-02 02:32:00,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 02:32:02,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:32:02,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:05,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:32:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:32:05,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=719746.6666666666, ans=0.0 2023-10-02 02:32:07,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 02:32:07,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 02:32:09,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:10,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. 
Duration: 0.68 2023-10-02 02:32:12,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:32:19,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.31 vs. limit=6.0 2023-10-02 02:32:22,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:24,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:24,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:32:26,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:32:26,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 02:32:26,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:32:29,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:29,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 02:32:29,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:32:29,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:32:31,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:32:32,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:32:32,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:32:34,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:34,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:32:34,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:39,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:32:40,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 02:32:42,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:43,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:32:45,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=719880.0, ans=0.2 2023-10-02 02:32:47,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 02:32:48,777 INFO [train.py:1046] (1/4) Epoch 21, batch 1750, loss[loss=0.1645, simple_loss=0.2288, pruned_loss=0.05008, over 23556.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2504, pruned_loss=0.04915, over 4717032.43 frames. ], batch size: 256, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:32:51,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:53,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:32:54,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:32:54,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 02:32:54,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=719946.6666666666, ans=0.125 2023-10-02 02:32:56,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:59,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:32:59,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:33:06,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 02:33:08,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:09,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 02:33:09,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:33:11,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:33:13,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:33:15,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 02:33:17,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:33:18,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 02:33:18,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=720013.3333333334, ans=0.125 2023-10-02 02:33:26,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:33:28,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.18 vs. 
limit=15.0 2023-10-02 02:33:28,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:33:28,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:33:32,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:32,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:33:34,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:33:34,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=720080.0, ans=0.1 2023-10-02 02:33:35,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:37,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:33:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:33:38,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 02:33:39,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:33:41,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. 
Duration: 0.8 2023-10-02 02:33:41,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:33:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:42,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=720146.6666666666, ans=0.0 2023-10-02 02:33:44,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:33:47,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:33:47,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:33:47,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:51,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:33:54,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=720213.3333333334, ans=0.0 2023-10-02 02:33:55,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:58,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:33:59,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 02:34:00,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 02:34:00,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:34:01,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:34:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:01,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:34:01,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:34:03,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:34:05,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=720213.3333333334, ans=0.2 2023-10-02 02:34:07,462 INFO [train.py:1046] (1/4) Epoch 21, batch 1800, loss[loss=0.173, simple_loss=0.2559, pruned_loss=0.04505, over 24454.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2493, pruned_loss=0.04863, over 4718717.25 frames. ], batch size: 66, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:34:07,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:34:07,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:34:08,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:34:10,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:34:14,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:34:14,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:34:16,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=720280.0, ans=0.125 2023-10-02 02:34:17,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:34:19,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=720280.0, ans=0.2 2023-10-02 02:34:20,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:20,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:22,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:34:23,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:34:23,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. 
Duration: 0.9 2023-10-02 02:34:25,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:28,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:32,580 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.823e+02 2.052e+02 2.264e+02 3.155e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 02:34:32,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 02:34:34,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 02:34:35,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 02:34:35,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:34:35,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:35,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:34:37,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:34:44,501 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 02:34:47,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 02:34:48,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:50,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 02:34:50,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 02:34:52,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:34:53,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:34:55,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:34:59,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 02:35:04,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=720480.0, ans=0.125 2023-10-02 02:35:05,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:35:05,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 02:35:06,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:35:06,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:35:08,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:35:08,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 02:35:11,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:35:11,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:35:12,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=720546.6666666666, ans=15.0 2023-10-02 02:35:14,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 02:35:14,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:35:15,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:35:16,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:35:16,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:35:17,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:35:18,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 02:35:20,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:35:20,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:35:22,596 INFO [train.py:1046] (1/4) Epoch 21, batch 1850, loss[loss=0.1845, simple_loss=0.2521, pruned_loss=0.05846, over 23866.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2502, pruned_loss=0.04844, over 4743268.39 frames. ], batch size: 179, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:35:24,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:35:25,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:35:27,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=720613.3333333334, ans=0.2 2023-10-02 02:35:34,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:35:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 02:35:38,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 02:35:42,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 02:35:45,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:35:47,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 02:35:47,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 02:35:50,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=720746.6666666666, ans=0.0 2023-10-02 02:35:53,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:35:55,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 02:35:57,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:35:57,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:36:02,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 02:36:02,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:02,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 02:36:02,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=720746.6666666666, ans=0.125 2023-10-02 02:36:02,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=720746.6666666666, ans=0.125 2023-10-02 02:36:03,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:36:05,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:36:08,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:36:10,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:36:10,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:10,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:36:10,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:13,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:36:14,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=720813.3333333334, ans=0.0 2023-10-02 02:36:15,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:36:18,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 02:36:18,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:36:23,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:36:23,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:36:23,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 02:36:23,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 02:36:25,039 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:36:26,043 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 02:36:26,113 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 02:36:28,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 02:36:28,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:36:28,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:36:28,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:30,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 02:36:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:36:30,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=720880.0, ans=0.125 2023-10-02 02:36:31,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:31,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:36:33,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:36:34,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=720880.0, ans=10.0 2023-10-02 02:36:34,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:36:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. 
Duration: 0.99 2023-10-02 02:36:36,478 INFO [train.py:1046] (1/4) Epoch 21, batch 1900, loss[loss=0.1803, simple_loss=0.2656, pruned_loss=0.04755, over 24641.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2504, pruned_loss=0.04841, over 4738166.79 frames. ], batch size: 68, lr: 4.91e-03, grad_scale: 16.0 2023-10-02 02:36:38,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:38,079 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 02:36:38,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:36:39,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:41,952 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.84 vs. limit=8.0 2023-10-02 02:36:43,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:47,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:36:47,146 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 02:36:48,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 02:36:49,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 02:36:50,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:36:51,255 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 02:36:51,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 02:36:54,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 02:36:57,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:37:00,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 02:37:01,999 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.796e+02 1.986e+02 2.247e+02 3.290e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 02:37:02,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 02:37:02,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=721013.3333333334, ans=0.1 2023-10-02 02:37:09,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=721080.0, ans=0.1 2023-10-02 02:37:12,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 02:37:16,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-10-02 02:37:16,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:37:17,753 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 02:37:17,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 02:37:17,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 02:37:17,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 02:37:17,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:37:18,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.69 vs. limit=15.0 2023-10-02 02:37:22,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 02:37:24,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=721146.6666666666, ans=0.015 2023-10-02 02:37:25,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:37:30,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:37:30,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. 
Duration: 0.92 2023-10-02 02:37:30,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:37:34,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 02:37:34,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:37:36,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=721213.3333333334, ans=0.0 2023-10-02 02:37:42,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:37:42,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:37:42,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:37:43,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:37:43,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:37:43,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:37:45,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:37:48,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:37:48,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:37:49,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:37:49,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:37:49,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:37:50,873 INFO [train.py:1046] (1/4) Epoch 21, batch 1950, loss[loss=0.1774, simple_loss=0.2475, pruned_loss=0.05363, over 23415.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2507, pruned_loss=0.04846, over 4731985.52 frames. ], batch size: 285, lr: 4.91e-03, grad_scale: 16.0 2023-10-02 02:37:51,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:37:54,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:37:57,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:37:57,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:37:57,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 02:37:59,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=721280.0, ans=0.125 2023-10-02 02:38:00,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.19 vs. limit=15.0 2023-10-02 02:38:00,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 02:38:00,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:38:00,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:02,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:05,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:38:05,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:08,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:38:12,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:38:12,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 02:38:12,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:38:12,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:18,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:38:18,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:18,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:38:18,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 02:38:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:38:20,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:38:20,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:24,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:25,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:38:30,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:38:33,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:38:33,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:38:33,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 02:38:33,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:38:40,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:38:40,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:38:41,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:38:48,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=721480.0, ans=0.125 2023-10-02 02:38:49,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:50,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:52,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:38:56,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:59,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:39:00,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:39:01,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 02:39:01,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:39:02,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:39:04,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 02:39:06,063 INFO [train.py:1046] (1/4) Epoch 21, batch 2000, loss[loss=0.1706, simple_loss=0.2428, pruned_loss=0.0492, over 17949.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2521, pruned_loss=0.04884, over 4729184.64 frames. ], batch size: 39, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:39:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:39:09,490 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.54 vs. limit=22.5 2023-10-02 02:39:10,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 02:39:10,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:39:10,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:39:11,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:39:14,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:17,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 02:39:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:39:20,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:39:22,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 02:39:23,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:39:23,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:39:25,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:39:26,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. 
Duration: 0.93 2023-10-02 02:39:26,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:28,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:28,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:30,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 02:39:31,583 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.854e+02 2.083e+02 2.376e+02 3.627e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 02:39:31,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:39:34,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 02:39:34,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:39:37,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:39:37,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:39:38,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:39,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:39:40,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:39:40,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 02:39:43,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 02:39:43,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:39:43,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:39:48,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:49,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:39:51,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:39:51,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:39:53,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:39:53,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:55,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 02:39:55,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:57,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:00,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:40:00,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 02:40:04,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=721880.0, ans=0.1 2023-10-02 02:40:05,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:40:05,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:10,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:10,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:40:13,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:14,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:40:14,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:40:15,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.87 vs. limit=15.0 2023-10-02 02:40:16,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:40:16,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:40:19,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:20,613 INFO [train.py:1046] (1/4) Epoch 21, batch 2050, loss[loss=0.1634, simple_loss=0.2379, pruned_loss=0.04442, over 23328.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2523, pruned_loss=0.04951, over 4715448.53 frames. ], batch size: 134, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:40:20,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:23,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:40:24,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:28,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:40:30,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:40:31,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:40:33,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:40:34,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 02:40:34,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:40:37,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:40:37,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:40:40,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=722013.3333333334, ans=0.0 2023-10-02 02:40:43,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=722013.3333333334, ans=0.125 2023-10-02 02:40:46,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:40:46,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:49,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 02:40:50,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:52,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. 
Duration: 0.82 2023-10-02 02:40:53,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:40:56,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:40:58,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:40:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:41:00,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:41:01,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:41:02,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:41:04,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:41:05,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:41:07,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:41:09,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:41:10,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:41:14,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:41:20,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:41:22,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 02:41:27,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:41:27,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:41:30,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:41:30,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=722213.3333333334, ans=0.125 2023-10-02 02:41:32,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 02:41:35,361 INFO [train.py:1046] (1/4) Epoch 21, batch 2100, loss[loss=0.1802, simple_loss=0.2545, pruned_loss=0.0529, over 23354.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2509, pruned_loss=0.04929, over 4711517.28 frames. ], batch size: 119, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:41:35,465 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 02:41:35,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 02:41:35,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:41:35,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=722280.0, ans=0.0 2023-10-02 02:41:36,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:41:36,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:41:36,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 02:41:38,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 02:41:38,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:41:41,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:41:43,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:41:43,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:41:44,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:41:44,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. 
Duration: 0.85 2023-10-02 02:41:45,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:41:47,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 02:41:47,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 02:41:49,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:41:49,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:41:49,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 02:41:51,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 02:41:54,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 02:41:54,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:41:58,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:41:59,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:42:00,601 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.890e+02 2.103e+02 2.367e+02 4.500e+02, threshold=4.205e+02, percent-clipped=1.0 2023-10-02 02:42:02,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:42:04,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 02:42:05,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:05,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:42:06,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 02:42:08,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:08,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 02:42:08,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 02:42:10,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 02:42:11,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:42:13,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 02:42:15,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:42:16,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:42:17,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:18,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:18,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 02:42:18,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:18,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:20,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:20,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 02:42:22,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 02:42:22,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. 
Duration: 0.8 2023-10-02 02:42:27,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=722480.0, ans=0.125 2023-10-02 02:42:27,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=722480.0, ans=0.09899494936611666 2023-10-02 02:42:28,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:42:31,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:42:32,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 02:42:34,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=722546.6666666666, ans=0.0 2023-10-02 02:42:37,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:40,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:42:40,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:42:40,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:42:40,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 02:42:40,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 02:42:42,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:42,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:42:43,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:42:43,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:45,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 02:42:46,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 02:42:46,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:42:49,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:49,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:42:50,502 INFO [train.py:1046] (1/4) Epoch 21, batch 2150, loss[loss=0.1636, simple_loss=0.2499, pruned_loss=0.03862, over 24667.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2494, pruned_loss=0.04901, over 4708183.18 frames. ], batch size: 68, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:42:50,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 02:42:50,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:42:55,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:42:55,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=722613.3333333334, ans=0.125 2023-10-02 02:42:56,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:42:57,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:00,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:43:00,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:00,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:43:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:05,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:43:05,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 02:43:07,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=722680.0, ans=0.0 2023-10-02 02:43:08,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:08,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 02:43:08,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=722680.0, ans=0.125 2023-10-02 02:43:11,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=722680.0, ans=0.0 2023-10-02 02:43:13,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:13,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:43:14,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:14,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:16,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:16,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:43:16,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:43:17,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:43:17,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:43:19,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 02:43:20,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:43:20,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:20,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:43:24,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:43:25,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:26,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:43:28,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:28,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. 
Duration: 0.89 2023-10-02 02:43:28,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:43:32,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:33,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:36,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:43:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:38,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:38,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 02:43:40,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 02:43:40,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:43:40,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 02:43:40,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:43:41,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=15.0 2023-10-02 02:43:42,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:43:42,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 02:43:42,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:43:42,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 02:43:42,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 02:43:42,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 02:43:43,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 02:43:46,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:47,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:43:47,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:43:47,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:43:49,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:43:50,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:00,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:44:00,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 02:44:02,438 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.37 vs. limit=15.0 2023-10-02 02:44:05,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:44:06,415 INFO [train.py:1046] (1/4) Epoch 21, batch 2200, loss[loss=0.1875, simple_loss=0.2654, pruned_loss=0.05476, over 24008.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2499, pruned_loss=0.04915, over 4706109.67 frames. ], batch size: 80, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:44:09,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:10,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:44:10,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:44:11,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:44:15,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:44:15,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:44:15,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 02:44:19,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 02:44:21,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:44:25,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 02:44:28,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:29,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:44:29,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:44:32,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 02:44:32,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten.whitening_limit, batch_count=723013.3333333334, ans=15.0 2023-10-02 02:44:33,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 02:44:35,307 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.860e+02 2.025e+02 2.303e+02 3.643e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 02:44:36,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:44:38,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:39,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 02:44:42,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:44:43,139 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.46 vs. limit=15.0 2023-10-02 02:44:44,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:44:47,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:44:48,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 02:44:48,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=723080.0, ans=0.0 2023-10-02 02:44:49,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 02:44:51,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:44:52,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 02:44:54,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:54,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:44:55,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:56,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:44:56,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:44:56,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:44:57,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=723146.6666666666, ans=0.07 2023-10-02 02:44:58,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:44:59,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=723146.6666666666, ans=0.125 2023-10-02 02:45:00,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:45:00,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:45:01,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:45:06,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:45:06,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:45:08,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:45:08,407 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 02:45:11,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:45:11,214 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 02:45:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:45:13,759 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-10-02 02:45:15,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:45:15,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:45:17,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:45:19,883 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 02:45:21,297 INFO [train.py:1046] (1/4) Epoch 21, batch 2250, loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03791, over 24439.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2504, pruned_loss=0.0489, over 4721942.18 frames. ], batch size: 58, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:45:21,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:45:22,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:45:28,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:45:28,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:45:32,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:33,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 02:45:35,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:45:38,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 02:45:38,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:45:38,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:45:39,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 02:45:41,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:45:41,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:43,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:45:43,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=723346.6666666666, ans=0.0 2023-10-02 02:45:48,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:45:49,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 02:45:50,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 02:45:51,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 02:45:52,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:54,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:45:55,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=723413.3333333334, ans=0.125 2023-10-02 02:45:59,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:46:02,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:46:03,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:03,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:46:06,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:46:08,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:46:12,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:46:15,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 02:46:20,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=723546.6666666666, ans=0.2 2023-10-02 02:46:20,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.55 vs. limit=15.0 2023-10-02 02:46:21,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:46:21,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:46:21,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:46:25,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:46:25,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=723546.6666666666, ans=0.0 2023-10-02 02:46:28,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:46:28,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 02:46:28,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=723546.6666666666, ans=0.125 2023-10-02 02:46:29,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:29,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 02:46:32,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 02:46:35,181 INFO [train.py:1046] (1/4) Epoch 21, batch 2300, loss[loss=0.1744, simple_loss=0.2468, pruned_loss=0.05103, over 23273.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2511, pruned_loss=0.04915, over 4726719.98 frames. ], batch size: 105, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:46:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:46:35,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:38,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=723613.3333333334, ans=0.0 2023-10-02 02:46:42,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:42,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:46:45,211 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 02:46:46,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:54,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:46:54,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 02:46:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:46:56,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:56,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 02:46:56,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:46:56,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=723680.0, ans=0.125 2023-10-02 02:46:58,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:46:58,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=723680.0, ans=0.125 2023-10-02 02:46:59,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:47:03,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:47:04,816 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.907e+02 2.156e+02 2.525e+02 3.499e+02, threshold=4.312e+02, percent-clipped=0.0 2023-10-02 02:47:06,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 02:47:10,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:47:16,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:47:16,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:47:20,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:47:22,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:47:27,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:47:27,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:47:27,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=723813.3333333334, ans=0.125 2023-10-02 02:47:28,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:47:28,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 02:47:33,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:47:33,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 02:47:33,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:47:33,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:47:34,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:47:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 02:47:34,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:47:35,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 02:47:35,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:47:35,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:47:35,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 02:47:38,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=723880.0, ans=0.2 2023-10-02 02:47:41,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:47:42,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=723880.0, ans=0.0 2023-10-02 02:47:44,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:47:49,014 INFO [train.py:1046] (1/4) Epoch 21, batch 2350, loss[loss=0.1626, simple_loss=0.2502, pruned_loss=0.03747, over 24695.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2528, pruned_loss=0.04997, over 4729359.40 frames. ], batch size: 73, lr: 4.90e-03, grad_scale: 8.0 2023-10-02 02:47:50,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:47:50,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:47:50,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:47:52,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:47:52,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:47:53,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:47:53,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 02:48:01,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 02:48:01,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 02:48:05,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 02:48:09,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:48:10,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:10,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:10,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:48:10,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:48:12,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 02:48:12,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=724013.3333333334, ans=0.0 2023-10-02 02:48:13,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:48:17,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=724080.0, ans=0.125 2023-10-02 02:48:20,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. 
Duration: 0.58 2023-10-02 02:48:21,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:48:24,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:48:24,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:48:26,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:48:28,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 02:48:29,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:48:31,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:48:31,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:48:31,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=724080.0, ans=0.2 2023-10-02 02:48:32,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:48:33,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:48:36,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. 
Duration: 0.74 2023-10-02 02:48:36,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:48:36,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=724146.6666666666, ans=0.0 2023-10-02 02:48:40,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:40,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:48:42,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 02:48:42,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:48:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 02:48:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:48:49,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 02:48:53,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 02:48:53,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:48:53,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 02:48:53,916 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 02:48:55,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 02:48:57,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 02:49:01,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:49:01,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff3.min_abs, batch_count=724213.3333333334, ans=0.2 2023-10-02 02:49:04,170 INFO [train.py:1046] (1/4) Epoch 21, batch 2400, loss[loss=0.1708, simple_loss=0.2349, pruned_loss=0.05337, over 23365.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2518, pruned_loss=0.04992, over 4719851.26 frames. ], batch size: 285, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:49:04,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:49:07,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:49:08,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=724280.0, ans=10.0 2023-10-02 02:49:10,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:49:11,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-10-02 02:49:11,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 02:49:15,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=724280.0, ans=0.0 2023-10-02 02:49:19,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:49:19,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:49:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 02:49:21,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:49:21,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:23,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 02:49:26,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=724346.6666666666, ans=0.1 2023-10-02 02:49:27,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:29,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=724346.6666666666, ans=0.0 2023-10-02 02:49:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-02 02:49:33,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=724413.3333333334, ans=0.1 2023-10-02 02:49:35,035 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.859e+02 2.075e+02 2.399e+02 3.778e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-02 02:49:36,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:49:40,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 02:49:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:49:43,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:45,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=724413.3333333334, ans=0.0 2023-10-02 02:49:48,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:49:48,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 02:49:49,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:49:56,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:49:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:50:01,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:02,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:50:02,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:50:02,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:50:02,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:50:02,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:50:02,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:50:07,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:50:07,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:50:07,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 02:50:09,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 02:50:12,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 02:50:12,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:50:12,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 02:50:14,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 02:50:14,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 02:50:14,107 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 02:50:15,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 02:50:15,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:50:18,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:18,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:50:19,697 INFO [train.py:1046] (1/4) Epoch 21, batch 2450, loss[loss=0.1845, simple_loss=0.2633, pruned_loss=0.05282, over 23244.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2504, pruned_loss=0.04965, over 4716460.74 frames. ], batch size: 93, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:50:19,802 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-10-02 02:50:19,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:20,045 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:50:21,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:50:21,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=724613.3333333334, ans=0.2 2023-10-02 02:50:23,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:50:24,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:50:27,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:27,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:50:29,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 02:50:30,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=724613.3333333334, ans=0.125 2023-10-02 02:50:34,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=724680.0, ans=0.0 2023-10-02 02:50:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:50:35,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:38,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:50:38,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:50:38,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:50:39,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 02:50:44,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:44,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=724680.0, ans=0.2 2023-10-02 02:50:45,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:50:46,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:50:49,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:50:49,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:50:53,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 02:50:53,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:55,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 02:50:56,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:51:04,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:51:05,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:07,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:51:07,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:08,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:51:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 02:51:12,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:51:12,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 02:51:17,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:51:17,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:22,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:51:22,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 02:51:24,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:51:24,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:51:24,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 02:51:24,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:51:24,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=724880.0, ans=0.09899494936611666 2023-10-02 02:51:27,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:51:29,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:51:31,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 02:51:33,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:51:33,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=724946.6666666666, ans=0.125 2023-10-02 02:51:34,476 INFO [train.py:1046] (1/4) Epoch 21, batch 2500, loss[loss=0.169, simple_loss=0.2125, pruned_loss=0.0627, over 19145.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2498, pruned_loss=0.04962, over 4723859.68 frames. ], batch size: 388, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:51:36,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 02:51:36,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:51:42,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:51:48,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=725013.3333333334, ans=0.125 2023-10-02 02:51:49,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:51:51,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:52,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:51:52,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-10-02 02:51:59,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:51:59,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:52:00,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:52:00,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 02:52:01,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 02:52:03,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:04,790 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.878e+02 2.187e+02 2.689e+02 3.332e+02, threshold=4.374e+02, percent-clipped=0.0 2023-10-02 02:52:04,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:52:04,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 02:52:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:06,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 02:52:06,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 02:52:11,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:52:12,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:52:13,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=725080.0, ans=0.125 2023-10-02 02:52:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:52:17,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 02:52:17,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:52:17,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:21,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:25,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:28,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:52:31,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=725146.6666666666, ans=0.2 2023-10-02 02:52:34,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 02:52:35,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 02:52:35,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:52:37,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:52:39,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:52:39,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:52:40,677 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 02:52:40,677 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 02:52:40,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 02:52:43,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:45,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725213.3333333334, ans=0.125 2023-10-02 02:52:46,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 02:52:46,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. 
Duration: 0.92 2023-10-02 02:52:46,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:52:49,409 INFO [train.py:1046] (1/4) Epoch 21, batch 2550, loss[loss=0.1727, simple_loss=0.2481, pruned_loss=0.04866, over 23606.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2501, pruned_loss=0.04942, over 4720240.67 frames. ], batch size: 149, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:52:49,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 02:52:51,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 02:52:53,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:52:55,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:52:55,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:52:55,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=725280.0, ans=0.0 2023-10-02 02:52:57,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:52:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 02:52:59,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 02:53:03,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 02:53:06,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:53:07,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:07,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:53:07,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=725346.6666666666, ans=0.125 2023-10-02 02:53:09,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 02:53:09,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:53:11,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:53:11,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:53:13,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:53:13,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 02:53:13,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. 
Duration: 0.42725 2023-10-02 02:53:13,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:13,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 02:53:25,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:53:25,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=725413.3333333334, ans=0.0 2023-10-02 02:53:32,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:53:32,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:32,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:53:32,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:53:37,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=725480.0, ans=0.07 2023-10-02 02:53:39,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:53:40,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:53:40,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 02:53:42,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:53:42,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:53:42,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:53:45,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:53:45,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=725480.0, ans=0.125 2023-10-02 02:53:46,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:48,076 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.49 vs. limit=6.0 2023-10-02 02:53:50,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:53:50,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 02:53:50,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:53:50,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 02:53:51,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:53:52,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:53:54,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:53:58,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=725546.6666666666, ans=0.1 2023-10-02 02:54:01,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:54:03,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.61 vs. limit=22.5 2023-10-02 02:54:04,387 INFO [train.py:1046] (1/4) Epoch 21, batch 2600, loss[loss=0.1661, simple_loss=0.2432, pruned_loss=0.04456, over 24448.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2512, pruned_loss=0.05008, over 4706411.70 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:54:04,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:05,945 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 02:54:10,008 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 02:54:10,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 02:54:10,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 02:54:10,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 02:54:10,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.44 vs. limit=15.0 2023-10-02 02:54:11,467 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 02:54:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:54:12,997 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 02:54:15,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 02:54:16,236 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 02:54:17,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:54:20,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 02:54:21,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 02:54:21,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:54:23,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-10-02 02:54:24,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 02:54:24,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 02:54:32,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:54:32,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:32,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:54:32,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 02:54:34,848 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.872e+02 2.087e+02 2.390e+02 4.163e+02, threshold=4.174e+02, percent-clipped=0.0 2023-10-02 02:54:36,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:54:42,934 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 02:54:47,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:48,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:54:48,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. 
Duration: 0.88 2023-10-02 02:54:49,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:54:49,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:54:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 02:54:50,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=725813.3333333334, ans=0.035 2023-10-02 02:54:53,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:54:53,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:54:54,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:54:58,626 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 02:54:58,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:54:58,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:55:04,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=725880.0, ans=0.125 2023-10-02 02:55:06,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 02:55:06,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:55:06,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 02:55:07,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:55:09,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:55:10,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:55:12,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725880.0, ans=0.125 2023-10-02 02:55:16,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 02:55:16,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:17,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:55:19,854 INFO [train.py:1046] (1/4) Epoch 21, batch 2650, loss[loss=0.1433, simple_loss=0.2226, pruned_loss=0.03198, over 24355.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2518, pruned_loss=0.05027, over 4709383.86 frames. 
], batch size: 56, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:55:21,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=725946.6666666666, ans=0.0 2023-10-02 02:55:22,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 02:55:22,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:22,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:55:24,121 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 02:55:25,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:55:28,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:29,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:55:32,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:55:34,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:55:35,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 02:55:35,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 02:55:35,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:55:37,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 02:55:40,107 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 02:55:41,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:55:41,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=726013.3333333334, ans=0.125 2023-10-02 02:55:45,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 02:55:45,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:55:46,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 02:55:50,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:55:50,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:55:52,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:55:52,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:55:58,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 02:55:58,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 02:55:58,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=726080.0, ans=0.0 2023-10-02 02:56:01,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:56:04,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 02:56:04,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:56:06,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:06,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:56:07,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:56:07,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:56:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:56:10,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:56:11,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:56:12,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:56:13,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:56:14,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:16,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:56:17,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:20,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:56:20,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:56:21,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=726213.3333333334, ans=0.125 2023-10-02 02:56:21,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=726213.3333333334, ans=0.125 2023-10-02 02:56:23,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 02:56:24,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:56:24,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:24,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 02:56:28,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:56:28,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:30,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:31,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:32,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:56:32,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:33,939 INFO [train.py:1046] (1/4) Epoch 21, batch 2700, loss[loss=0.1668, simple_loss=0.255, pruned_loss=0.03925, over 24643.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2525, pruned_loss=0.05041, over 4713634.10 frames. ], batch size: 68, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:56:36,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 02:56:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 02:56:40,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:56:41,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 02:56:41,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=726280.0, ans=0.0 2023-10-02 02:56:44,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:56:44,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:44,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:45,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:56:45,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:45,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:56:45,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:56:45,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. 
Duration: 0.63 2023-10-02 02:56:47,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:56:51,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:56:51,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:56:51,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:55,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:56:56,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 02:56:56,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:56:58,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=726346.6666666666, ans=0.0 2023-10-02 02:57:01,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:57:01,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 02:57:01,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=726346.6666666666, ans=0.0 2023-10-02 02:57:04,044 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.850e+02 2.012e+02 2.260e+02 2.930e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-02 02:57:04,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=726413.3333333334, ans=0.125 2023-10-02 02:57:05,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:57:05,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:57:05,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:57:05,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:57:08,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:11,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:57:11,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:57:11,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 02:57:16,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:16,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:57:25,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:57:25,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:57:28,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:57:28,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:29,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=726480.0, ans=0.125 2023-10-02 02:57:30,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=726480.0, ans=0.1 2023-10-02 02:57:32,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:34,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:34,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 02:57:34,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:35,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:35,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:57:37,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=726546.6666666666, ans=0.125 2023-10-02 02:57:38,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:57:40,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:40,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:45,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 02:57:45,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=726546.6666666666, ans=0.125 2023-10-02 02:57:46,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:48,967 INFO [train.py:1046] (1/4) Epoch 21, batch 2750, loss[loss=0.1721, simple_loss=0.2377, pruned_loss=0.05318, over 23804.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2515, pruned_loss=0.04989, over 4708376.76 frames. 
], batch size: 212, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:57:49,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:57:49,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 02:57:49,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 02:57:49,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:54,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:57:55,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:56,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:56,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:57:56,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:02,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:02,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 02:58:02,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:58:03,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:03,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 02:58:03,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:58:03,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:58:07,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 02:58:08,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=726680.0, ans=0.0 2023-10-02 02:58:09,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:58:09,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:11,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:58:11,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:58:12,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 02:58:12,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=726680.0, ans=0.125 2023-10-02 02:58:13,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:58:13,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:13,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:18,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:58:18,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:58:18,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=726746.6666666666, ans=0.02 2023-10-02 02:58:20,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:58:20,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=726746.6666666666, ans=0.125 2023-10-02 02:58:21,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 02:58:30,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:33,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:58:33,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:37,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:37,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:58:37,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:58:43,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:58:43,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:58:43,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 02:58:49,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:51,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 02:58:54,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 02:58:57,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:58:57,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 02:58:58,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:59:00,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:59:01,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 02:59:01,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:59:04,158 INFO [train.py:1046] (1/4) Epoch 21, batch 2800, loss[loss=0.1567, simple_loss=0.2354, pruned_loss=0.03904, over 24598.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2504, pruned_loss=0.04897, over 4720355.30 frames. ], batch size: 60, lr: 4.89e-03, grad_scale: 32.0 2023-10-02 02:59:04,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 02:59:06,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:06,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:07,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. 
Duration: 0.83 2023-10-02 02:59:07,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:08,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:10,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:10,440 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 02:59:10,441 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 02:59:13,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:59:15,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:59:17,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:59:20,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 02:59:21,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=727013.3333333334, ans=0.09899494936611666 2023-10-02 02:59:22,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. 
Duration: 0.488875 2023-10-02 02:59:24,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 02:59:25,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:25,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:59:25,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=727013.3333333334, ans=0.0 2023-10-02 02:59:26,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:59:29,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:59:29,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:29,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:59:30,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=727013.3333333334, ans=0.125 2023-10-02 02:59:31,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 02:59:34,440 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.948e+02 2.235e+02 2.690e+02 3.655e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-02 02:59:37,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=727080.0, ans=0.2 2023-10-02 02:59:38,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:59:40,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:40,211 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:59:40,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=727080.0, ans=0.0 2023-10-02 02:59:42,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=727080.0, ans=0.2 2023-10-02 02:59:43,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:44,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:59:46,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:59:49,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:59:49,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. 
Duration: 0.69 2023-10-02 02:59:49,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:50,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:59:50,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:59:55,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:56,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:58,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:00:01,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:00:01,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:00:01,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:00:02,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:00:02,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:00:04,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:00:04,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 03:00:04,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:04,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=727213.3333333334, ans=0.125 2023-10-02 03:00:07,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:00:07,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:07,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 03:00:07,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:07,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:00:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:00:09,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 03:00:14,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:00:14,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 03:00:16,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:00:18,736 INFO [train.py:1046] (1/4) Epoch 21, batch 2850, loss[loss=0.1426, simple_loss=0.2251, pruned_loss=0.03006, over 24266.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2488, pruned_loss=0.04875, over 4720432.71 frames. ], batch size: 61, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:00:18,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:00:23,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:00:23,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:00:23,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:00:26,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:26,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:00:28,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:00:29,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 03:00:37,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. 
Duration: 0.43 2023-10-02 03:00:37,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:00:38,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 03:00:40,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:41,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 03:00:43,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 03:00:44,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:55,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:56,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:00:56,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:00:58,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:00:58,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:00:58,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 03:00:59,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:00:59,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 03:01:01,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:01:01,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:01:03,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:01:03,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:05,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:07,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:01:10,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:01:11,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 03:01:13,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:16,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:01:20,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:01:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 03:01:22,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 03:01:23,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:01:25,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:25,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 03:01:25,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:01:26,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:26,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:01:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 03:01:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 03:01:27,963 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 03:01:27,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:01:28,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:32,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 03:01:32,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:01:33,979 INFO [train.py:1046] (1/4) Epoch 21, batch 2900, loss[loss=0.1726, simple_loss=0.2646, pruned_loss=0.04034, over 24339.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2488, pruned_loss=0.04856, over 4707275.25 frames. ], batch size: 74, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:01:34,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:01:35,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 03:01:36,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=727613.3333333334, ans=0.2 2023-10-02 03:01:39,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 03:01:40,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 03:01:41,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 03:01:43,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:01:43,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:01:44,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:46,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:01:46,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=727613.3333333334, ans=0.125 2023-10-02 03:01:49,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:01:49,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:52,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:01:52,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-02 03:01:52,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=727680.0, ans=0.0 2023-10-02 03:01:53,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:01:55,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:56,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 03:01:57,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 03:01:59,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:59,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 03:01:59,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:02:03,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:02:03,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 03:02:06,015 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 1.860e+02 2.101e+02 2.423e+02 4.328e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-02 03:02:06,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:02:06,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=727746.6666666666, ans=0.125 2023-10-02 03:02:07,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:02:11,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:02:15,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:16,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 03:02:16,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 03:02:16,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:02:19,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:02:20,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 03:02:22,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:02:24,643 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.29 vs. 
limit=15.0 2023-10-02 03:02:26,184 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.58 vs. limit=22.5 2023-10-02 03:02:26,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:02:37,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:02:37,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:02:37,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 03:02:40,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=727880.0, ans=0.95 2023-10-02 03:02:41,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:43,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 03:02:43,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:02:43,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:02:49,274 INFO [train.py:1046] (1/4) Epoch 21, batch 2950, loss[loss=0.1664, simple_loss=0.2424, pruned_loss=0.04515, over 23615.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.25, pruned_loss=0.04941, over 4707303.40 frames. 
], batch size: 149, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:02:49,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:02:50,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 03:02:50,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:02:50,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:51,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-10-02 03:02:52,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:02:55,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:02:55,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 03:02:56,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 03:02:58,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:02:58,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:03:04,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:03:06,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:03:07,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:09,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:03:13,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:03:13,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:03:14,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:03:15,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-02 03:03:16,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:03:16,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:03:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. 
Duration: 0.86 2023-10-02 03:03:18,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.57 vs. limit=15.0 2023-10-02 03:03:24,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 03:03:24,020 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 03:03:25,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:03:26,698 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 03:03:28,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 03:03:28,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:03:28,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:03:28,661 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 03:03:28,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:03:31,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 03:03:32,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 03:03:32,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:03:34,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:03:37,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:03:37,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:37,158 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 03:03:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:03:39,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 03:03:43,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:44,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:03:44,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 03:03:44,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:03:46,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. 
Duration: 0.49 2023-10-02 03:03:49,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:03:49,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:03:50,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:03:52,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:52,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:03:53,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:03:54,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:54,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:03:54,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:03:56,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:03:56,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:03:58,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:03:58,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 03:03:59,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:04:01,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:04:01,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:04:04,058 INFO [train.py:1046] (1/4) Epoch 21, batch 3000, loss[loss=0.1594, simple_loss=0.2352, pruned_loss=0.04183, over 24363.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2516, pruned_loss=0.04996, over 4712437.00 frames. ], batch size: 61, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:04:04,058 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 03:04:18,546 INFO [train.py:1078] (1/4) Epoch 21, validation: loss=0.3071, simple_loss=0.2764, pruned_loss=0.1689, over 1125622.00 frames. 2023-10-02 03:04:18,547 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20364MB 2023-10-02 03:04:18,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=728280.0, ans=0.125 2023-10-02 03:04:20,007 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 03:04:20,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 03:04:24,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 03:04:24,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:04:24,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 03:04:25,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:04:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:04:40,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:04:45,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=728346.6666666666, ans=0.125 2023-10-02 03:04:46,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 03:04:46,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:04:51,185 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.800e+02 1.986e+02 2.235e+02 3.212e+02, threshold=3.971e+02, percent-clipped=0.0 2023-10-02 03:04:51,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:04:51,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:04:51,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:04:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:04:54,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 03:04:54,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=728413.3333333334, ans=0.0 2023-10-02 03:04:55,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 03:04:58,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:04:58,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:05:01,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:05:01,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:05:01,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:01,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:05:07,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 03:05:07,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:05:07,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:05:09,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:05:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 03:05:12,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=728480.0, ans=0.125 2023-10-02 03:05:14,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:05:14,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:14,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:05:15,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:17,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:18,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 03:05:18,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-10-02 03:05:19,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:05:20,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 03:05:20,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:05:22,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 03:05:25,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:05:27,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:05:27,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 03:05:27,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 03:05:27,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:05:27,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:05:28,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:28,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 03:05:28,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:30,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:05:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 03:05:33,387 INFO [train.py:1046] (1/4) Epoch 21, batch 3050, loss[loss=0.2243, simple_loss=0.2888, pruned_loss=0.07986, over 19754.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2524, pruned_loss=0.05017, over 4716989.41 frames. ], batch size: 388, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:05:33,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:05:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:05:36,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=728613.3333333334, ans=0.0 2023-10-02 03:05:37,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:05:42,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:45,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 03:05:49,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-02 03:05:51,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 03:05:51,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:05:55,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:05:59,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.46 vs. limit=15.0 2023-10-02 03:05:59,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:59,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:05:59,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:02,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:06:03,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.32 vs. limit=22.5 2023-10-02 03:06:04,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:06:04,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:06:05,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:06:05,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:07,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:06:08,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:10,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:10,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 03:06:11,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:06:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:06:16,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:06:17,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:06:17,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:06:17,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 03:06:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:22,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=728813.3333333334, ans=0.2 2023-10-02 03:06:23,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:29,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:29,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:06:29,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:31,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:06:31,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=728880.0, ans=0.125 2023-10-02 03:06:32,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:06:32,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:06:34,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 03:06:35,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 03:06:35,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:37,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 03:06:38,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:46,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:47,533 INFO [train.py:1046] (1/4) Epoch 21, batch 3100, loss[loss=0.1765, simple_loss=0.2634, pruned_loss=0.04486, over 24660.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.252, pruned_loss=0.05007, over 4718724.99 frames. ], batch size: 73, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:06:49,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:06:51,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:06:52,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 03:06:53,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 03:06:56,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 03:06:56,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 03:06:59,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:06:59,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:03,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 03:07:07,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:11,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 03:07:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:07:16,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:16,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:07:16,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:07:18,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. 
Duration: 0.5 2023-10-02 03:07:18,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=729080.0, ans=0.1 2023-10-02 03:07:21,274 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.950e+02 2.237e+02 2.681e+02 4.003e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-02 03:07:21,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:07:21,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 03:07:21,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:07:22,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:24,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 03:07:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:07:28,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:07:29,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 03:07:31,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 03:07:31,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:07:33,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:35,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:07:37,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:37,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:07:38,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:07:38,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:07:39,502 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.70 vs. limit=15.0 2023-10-02 03:07:40,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:07:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:07:40,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:40,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 03:07:43,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=729146.6666666666, ans=0.0 2023-10-02 03:07:46,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:07:46,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 03:07:49,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:07:49,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 03:07:49,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:07:49,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=729213.3333333334, ans=0.0 2023-10-02 03:07:50,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:51,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 03:08:01,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 03:08:02,736 INFO [train.py:1046] (1/4) Epoch 21, batch 3150, loss[loss=0.169, simple_loss=0.237, pruned_loss=0.0505, over 23790.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2511, pruned_loss=0.04979, over 4713369.77 frames. 
], batch size: 212, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:08:02,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:04,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:08:06,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:08:07,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:08:07,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 03:08:08,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:08,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:08:10,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 03:08:11,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:13,700 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 03:08:15,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 03:08:16,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:08:17,171 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 03:08:17,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 03:08:18,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 03:08:20,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 03:08:20,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 03:08:20,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:20,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:08:21,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:21,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 03:08:23,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:24,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:24,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:08:27,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:08:31,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 03:08:31,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:08:34,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:08:34,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:08:35,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 03:08:37,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 03:08:39,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:08:39,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:08:39,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:08:40,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:08:40,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 03:08:40,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=729413.3333333334, ans=0.125 2023-10-02 03:08:42,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:08:42,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:08:43,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 03:08:45,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:08:45,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:08:47,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:08:47,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:08:48,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 03:08:50,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:08:51,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 03:08:51,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:08:53,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 03:08:53,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 03:08:54,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:08:55,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:08:57,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 03:08:57,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 03:08:58,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:09:00,809 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.59 vs. limit=15.0 2023-10-02 03:09:01,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:09:02,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:02,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:09:07,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 03:09:07,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:10,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 03:09:16,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:09:16,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 03:09:17,972 INFO [train.py:1046] (1/4) Epoch 21, batch 3200, loss[loss=0.175, simple_loss=0.2533, pruned_loss=0.04832, over 23699.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2499, pruned_loss=0.04941, over 4711885.34 frames. ], batch size: 106, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:09:20,166 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.59 vs. limit=15.0 2023-10-02 03:09:20,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:22,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:09:22,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 03:09:24,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:09:29,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 03:09:32,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:09:41,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=729680.0, ans=0.125 2023-10-02 03:09:49,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 03:09:50,804 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 1.933e+02 2.083e+02 2.478e+02 3.380e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 03:09:50,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:09:54,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 03:09:55,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:09:59,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:09:59,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:10:00,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:10:05,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-10-02 03:10:07,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 03:10:08,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 03:10:08,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=729813.3333333334, ans=0.1 2023-10-02 03:10:10,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=729813.3333333334, ans=0.0 2023-10-02 03:10:11,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 03:10:13,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:10:18,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:20,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:10:20,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:20,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 03:10:20,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:10:23,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:10:25,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 03:10:26,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 03:10:27,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 03:10:29,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 03:10:31,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:10:32,611 INFO [train.py:1046] (1/4) Epoch 21, batch 3250, loss[loss=0.1546, simple_loss=0.2382, pruned_loss=0.03552, over 24665.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2501, pruned_loss=0.04902, over 4722114.61 frames. ], batch size: 65, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:10:34,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:10:34,012 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 03:10:34,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:10:35,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:35,426 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 03:10:38,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 03:10:41,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:10:49,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:10:49,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 03:10:49,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=730013.3333333334, ans=0.0 2023-10-02 03:10:50,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:10:50,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:50,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:10:53,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:10:53,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:10:56,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:56,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:10:56,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:10:57,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:57,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:57,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:11:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:02,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:11:04,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:11:04,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:11:05,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:11:07,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:11:07,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:11:11,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 03:11:12,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:11:12,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:11:13,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:14,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:11:19,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:11:25,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:11:26,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:26,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 03:11:26,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:11:26,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:11:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:11:28,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=730146.6666666666, ans=0.125 2023-10-02 03:11:29,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 03:11:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 03:11:31,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:11:31,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:33,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:11:33,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 03:11:35,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:11:37,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:11:37,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:11:39,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 03:11:39,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:11:41,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:11:41,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 03:11:44,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:11:44,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 03:11:46,119 INFO [train.py:1046] (1/4) Epoch 21, batch 3300, loss[loss=0.165, simple_loss=0.2405, pruned_loss=0.04474, over 23502.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2508, pruned_loss=0.04929, over 4725783.05 frames. ], batch size: 134, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:11:46,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 03:11:47,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 03:11:49,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:54,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:11:55,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:11:55,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:11:58,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 03:11:58,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:12:01,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:03,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:12:07,090 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.23 vs. limit=15.0 2023-10-02 03:12:08,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 03:12:08,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:12:08,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:08,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=730346.6666666666, ans=0.125 2023-10-02 03:12:09,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:09,802 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 03:12:11,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:12:12,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:12:13,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:12:13,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:12:15,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 03:12:17,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:12:17,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:12:19,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.870e+02 2.047e+02 2.284e+02 2.829e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-02 03:12:20,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:20,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 03:12:22,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 03:12:22,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:23,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 03:12:26,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 03:12:27,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 03:12:30,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:12:31,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 03:12:32,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:12:35,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:12:37,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:12:39,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:12:40,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:40,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:12:41,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 03:12:42,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=730480.0, ans=0.0 2023-10-02 03:12:43,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:12:43,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:45,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:12:46,423 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 03:12:47,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 03:12:49,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:12:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:12:49,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:12:50,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:50,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:12:52,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 03:12:52,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:12:53,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:12:54,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:12:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 03:12:58,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:00,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:01,443 INFO [train.py:1046] (1/4) Epoch 21, batch 3350, loss[loss=0.1682, simple_loss=0.2524, pruned_loss=0.04206, over 24452.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2509, pruned_loss=0.04905, over 4729293.56 frames. ], batch size: 66, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:13:02,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:13:02,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:13:05,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:13:05,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:13:05,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:10,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:13:12,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:12,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:13:13,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:16,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:13:17,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:18,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:13:20,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 03:13:21,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 03:13:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:13:24,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 03:13:24,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 03:13:26,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:13:26,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:13:28,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:28,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 03:13:28,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:30,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:13:32,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:33,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:35,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:35,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:13:38,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:38,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=730746.6666666666, ans=0.0 2023-10-02 03:13:39,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:39,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:43,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:13:43,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=730746.6666666666, ans=0.125 2023-10-02 03:13:44,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:47,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:47,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:49,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:51,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-10-02 03:13:51,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:13:51,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 03:13:51,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:13:53,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 03:13:53,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:54,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:59,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=730880.0, ans=0.125 2023-10-02 03:14:02,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:14:04,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 03:14:05,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:14:06,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:14:06,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 03:14:11,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:14:14,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 03:14:14,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:14:14,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:14:15,757 INFO [train.py:1046] (1/4) Epoch 21, batch 3400, loss[loss=0.2355, simple_loss=0.2981, pruned_loss=0.08645, over 19412.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2522, pruned_loss=0.04964, over 4723771.26 frames. ], batch size: 388, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:14:17,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:14:18,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 03:14:19,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:14:19,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 03:14:21,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:14:21,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:14:21,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:14:22,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:14:22,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 03:14:25,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=730946.6666666666, ans=0.2 2023-10-02 03:14:27,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 03:14:27,504 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 03:14:27,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:14:32,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:14:32,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:14:33,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:14:34,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 03:14:35,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=731013.3333333334, ans=0.125 2023-10-02 03:14:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:14:40,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 03:14:48,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:14:48,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:14:49,936 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.878e+02 1.982e+02 2.218e+02 2.946e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 03:14:50,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:14:50,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 03:14:56,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:15:01,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 03:15:05,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:15:05,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:15:06,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 03:15:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:15:07,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:08,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:15:08,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:15:11,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:15:15,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:15:15,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:15:18,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:15:21,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 03:15:26,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:15:30,858 INFO [train.py:1046] (1/4) Epoch 21, batch 3450, loss[loss=0.182, simple_loss=0.2655, pruned_loss=0.04926, over 23450.00 frames. 
], tot_loss[loss=0.1758, simple_loss=0.2527, pruned_loss=0.04943, over 4725060.02 frames. ], batch size: 93, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:15:30,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 03:15:36,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 03:15:36,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:15:37,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:15:37,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 03:15:39,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:15:40,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:15:46,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:15:46,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:15:47,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=731346.6666666666, ans=0.125 2023-10-02 03:15:48,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 03:15:48,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:50,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:55,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 03:16:01,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 03:16:02,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:16:02,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:16:04,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 03:16:09,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:16:14,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:16:15,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:16:15,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 03:16:17,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:16:20,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 03:16:20,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:16:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:16:25,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:16:26,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 03:16:29,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:16:35,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:16:36,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:38,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:42,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:42,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:16:44,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:16:44,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:16:46,184 INFO [train.py:1046] (1/4) Epoch 21, batch 3500, loss[loss=0.162, simple_loss=0.2214, pruned_loss=0.05132, over 23705.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2514, pruned_loss=0.04874, over 4738017.27 frames. ], batch size: 232, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:16:47,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=731613.3333333334, ans=0.125 2023-10-02 03:16:48,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:52,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:16:52,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 03:16:53,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:16:56,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:16:59,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:59,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-10-02 03:17:05,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:17:06,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:17:06,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:17:06,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:06,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:17:08,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:08,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:17:08,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 03:17:08,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=731680.0, ans=0.2 2023-10-02 03:17:11,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:17:14,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:17:17,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:18,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 03:17:18,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:17:19,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=731746.6666666666, ans=0.1 2023-10-02 03:17:20,268 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.924e+02 2.111e+02 2.557e+02 4.190e+02, threshold=4.222e+02, percent-clipped=1.0 2023-10-02 03:17:21,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:17:23,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:17:23,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:25,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:17:26,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:17:26,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. 
Duration: 0.75 2023-10-02 03:17:28,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 03:17:29,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 03:17:30,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:17:31,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:32,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:32,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:17:34,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:17:35,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:17:41,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:17:41,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 03:17:41,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 03:17:41,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 03:17:46,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:17:46,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:17:47,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:50,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 03:17:51,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:17:53,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:53,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 03:17:56,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 03:17:58,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:58,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:18:00,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:00,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:18:01,523 INFO [train.py:1046] (1/4) Epoch 21, batch 3550, loss[loss=0.1617, simple_loss=0.232, pruned_loss=0.04571, over 24329.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2501, pruned_loss=0.04846, over 4731671.10 frames. ], batch size: 56, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:18:03,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:18:13,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:15,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 03:18:16,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:18:20,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:18:21,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:21,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:18:21,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:18:24,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:18:26,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 03:18:27,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:27,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:18:29,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:18:33,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:18:33,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:18:36,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:18:36,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:36,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:18:37,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 03:18:37,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:39,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:40,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-02 03:18:43,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=732080.0, ans=0.0 2023-10-02 03:18:45,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:18:46,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:18:47,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:18:49,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 03:18:49,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:18:51,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 03:18:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:18:54,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:18:54,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:18:57,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 03:18:58,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 03:19:04,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:06,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 03:19:06,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:10,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:19:12,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 03:19:17,025 INFO [train.py:1046] (1/4) Epoch 21, batch 3600, loss[loss=0.1591, simple_loss=0.2369, pruned_loss=0.04072, over 24344.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2498, pruned_loss=0.04874, over 4721628.11 frames. ], batch size: 61, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:19:17,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 03:19:18,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:19:19,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:19:21,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:22,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:19:22,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:19:25,438 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.27 vs. limit=22.5 2023-10-02 03:19:26,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:19:27,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:27,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:19:28,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:19:30,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:30,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 03:19:33,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:19:35,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:39,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 03:19:43,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:19:44,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:19:44,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:19:44,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 03:19:45,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:19:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:48,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:19:49,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=6.0 2023-10-02 03:19:50,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:51,256 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.800e+02 2.043e+02 2.517e+02 4.317e+02, threshold=4.086e+02, percent-clipped=1.0 2023-10-02 03:19:52,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:19:54,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:19:54,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 03:20:00,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:01,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:20:03,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 03:20:08,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:20:11,974 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.26 vs. limit=15.0 2023-10-02 03:20:14,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:18,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:20,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.whiten.whitening_limit, batch_count=732546.6666666666, ans=12.0 2023-10-02 03:20:24,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 03:20:25,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:20:25,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 03:20:25,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 03:20:27,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 03:20:28,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:20:28,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:20:30,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 03:20:31,684 INFO [train.py:1046] (1/4) Epoch 21, batch 3650, loss[loss=0.1847, simple_loss=0.2694, pruned_loss=0.04994, over 24027.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2503, pruned_loss=0.04885, over 4726433.67 frames. ], batch size: 80, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:20:31,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:20:31,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:20:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:20:31,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 03:20:33,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 03:20:36,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:37,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 03:20:42,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 03:20:44,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:20:47,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 03:20:49,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 03:20:51,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.07 vs. limit=15.0 2023-10-02 03:20:53,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:20:53,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:20:53,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 03:20:54,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:20:54,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:56,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 03:20:57,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:20:57,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:20:59,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 03:20:59,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=732680.0, ans=0.0 2023-10-02 03:21:00,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:21:00,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:21:00,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:02,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=732746.6666666666, ans=0.2 2023-10-02 03:21:03,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 03:21:04,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 03:21:06,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 03:21:06,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:21:08,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 03:21:10,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:21:10,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:21:15,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:21:17,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:17,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:21:19,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=732813.3333333334, ans=0.125 2023-10-02 03:21:20,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:21:20,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 03:21:23,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:21:26,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:21:26,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:26,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:21:28,174 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.49 vs. limit=6.0 2023-10-02 03:21:28,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:21:29,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:29,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:21:35,191 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 03:21:38,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:21:38,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:21:38,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 03:21:38,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732880.0, ans=0.1 2023-10-02 03:21:40,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:41,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:21:42,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=732880.0, ans=0.125 2023-10-02 03:21:43,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:45,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 03:21:45,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:47,730 INFO [train.py:1046] (1/4) Epoch 21, batch 3700, loss[loss=0.1512, simple_loss=0.2356, pruned_loss=0.03333, over 24670.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2515, pruned_loss=0.04986, over 4718379.56 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:21:49,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:21:52,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:21:52,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 03:21:55,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:55,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 03:21:55,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:56,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:21:56,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:21:58,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=732946.6666666666, ans=0.125 2023-10-02 03:21:59,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:22:00,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=733013.3333333334, ans=0.125 2023-10-02 03:22:03,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:05,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 03:22:05,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:22:06,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:22:09,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:09,374 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 03:22:17,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:22:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:22:19,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:22:19,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 03:22:19,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:22:22,358 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.917e+02 2.171e+02 2.489e+02 3.807e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-02 03:22:23,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:25,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-10-02 03:22:26,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:26,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:22:29,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:29,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:22:32,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:22:35,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:22:35,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 03:22:35,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:35,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 03:22:38,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=733146.6666666666, ans=0.125 2023-10-02 03:22:41,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:22:42,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 03:22:44,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:44,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 03:22:47,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:22:47,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:22:48,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:22:48,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:51,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:22:51,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 03:22:52,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=733213.3333333334, ans=0.0 2023-10-02 03:22:53,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 03:22:54,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:22:54,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:22:55,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:22:56,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:23:00,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:23:01,489 INFO [train.py:1046] (1/4) Epoch 21, batch 3750, loss[loss=0.1816, simple_loss=0.2569, pruned_loss=0.05308, over 23426.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2525, pruned_loss=0.05053, over 4712170.81 frames. ], batch size: 106, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:23:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:23:01,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:04,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 03:23:06,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 03:23:09,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:23:09,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 03:23:10,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 03:23:10,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:23:12,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:23:14,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:23:17,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:23:20,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:23:20,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:23:21,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=733346.6666666666, ans=0.125 2023-10-02 03:23:23,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:23:26,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:23:26,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 03:23:26,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 03:23:27,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:23:27,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:23:31,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 03:23:35,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 03:23:36,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:23:37,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:23:38,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=733413.3333333334, ans=0.125 2023-10-02 03:23:39,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:23:43,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:46,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 03:23:51,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 03:23:54,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:23:54,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=733480.0, ans=0.125 2023-10-02 03:23:57,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:23:57,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:24:00,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=733546.6666666666, ans=0.125 2023-10-02 03:24:01,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:24:01,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=733546.6666666666, ans=0.125 2023-10-02 03:24:04,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:24:05,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:24:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:24:09,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:24:13,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:24:16,561 INFO [train.py:1046] (1/4) Epoch 21, batch 3800, loss[loss=0.1695, simple_loss=0.2343, pruned_loss=0.05235, over 23698.00 frames. 
], tot_loss[loss=0.177, simple_loss=0.2526, pruned_loss=0.05071, over 4708857.71 frames. ], batch size: 232, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:24:21,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:24:24,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:26,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:24:27,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 03:24:28,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:24:31,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:24:31,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:24:34,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 03:24:34,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:24:35,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:24:37,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:24:37,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:37,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 03:24:39,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=733680.0, ans=0.5 2023-10-02 03:24:40,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 03:24:42,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:24:46,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:24:48,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:24:50,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 03:24:50,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=733746.6666666666, ans=0.1 2023-10-02 03:24:51,633 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.912e+02 2.251e+02 2.632e+02 4.326e+02, threshold=4.502e+02, percent-clipped=0.0 2023-10-02 03:24:51,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:24:51,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:53,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:55,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:56,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=733746.6666666666, ans=0.125 2023-10-02 03:24:59,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:24:59,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 03:25:00,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:25:03,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=733813.3333333334, ans=0.02 2023-10-02 03:25:06,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:25:08,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=733813.3333333334, ans=0.025 2023-10-02 03:25:11,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:25:11,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=733813.3333333334, ans=0.125 2023-10-02 03:25:12,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 03:25:14,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 03:25:15,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:25:18,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:25:20,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:21,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 03:25:24,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. 
Duration: 0.99 2023-10-02 03:25:24,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 03:25:24,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:26,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:25:30,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:25:30,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:25:32,225 INFO [train.py:1046] (1/4) Epoch 21, batch 3850, loss[loss=0.1625, simple_loss=0.2525, pruned_loss=0.03618, over 24286.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2513, pruned_loss=0.05045, over 4710379.08 frames. ], batch size: 74, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:25:36,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:25:38,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 03:25:38,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:25:39,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:25:40,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=733946.6666666666, ans=0.125 2023-10-02 03:25:42,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:25:47,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:25:48,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:25:48,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 03:25:56,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:25:57,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:26:01,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:01,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:26:04,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:04,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:26:06,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:06,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:26:06,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734080.0, ans=0.1 2023-10-02 03:26:07,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:08,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:10,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:10,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:26:10,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 03:26:10,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 03:26:12,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:12,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:15,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:26:15,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:15,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 03:26:18,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 03:26:19,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:23,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 03:26:24,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:26:30,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:30,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:33,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:34,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 03:26:36,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 03:26:40,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:26:40,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:43,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:26:43,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:26:43,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:44,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:44,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:26:45,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 03:26:45,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:46,313 INFO [train.py:1046] (1/4) Epoch 21, batch 3900, loss[loss=0.1588, simple_loss=0.2322, pruned_loss=0.0427, over 23461.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.25, pruned_loss=0.04983, over 4713948.28 frames. ], batch size: 134, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:26:47,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 03:26:47,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:26:47,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:26:49,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:51,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:26:51,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:51,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:53,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:26:53,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 03:26:53,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:54,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:26:56,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:26:56,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 03:26:56,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=734280.0, ans=0.0 2023-10-02 03:26:57,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:27:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:27:02,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:27:03,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:27:04,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 03:27:06,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:27:06,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 03:27:06,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:27:09,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 03:27:09,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 03:27:11,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.20 vs. 
limit=10.0 2023-10-02 03:27:15,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:27:16,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:27:16,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:27:16,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734413.3333333334, ans=0.1 2023-10-02 03:27:18,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:21,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:27:22,591 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.899e+02 2.124e+02 2.634e+02 1.113e+03, threshold=4.247e+02, percent-clipped=1.0 2023-10-02 03:27:24,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:27:26,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:27:26,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:27:27,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:27:33,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:27:33,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:27:40,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:27:40,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:27:51,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:27:53,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:55,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 03:27:55,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 03:27:56,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 03:27:58,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:27:59,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. 
Duration: 0.99 2023-10-02 03:28:02,212 INFO [train.py:1046] (1/4) Epoch 21, batch 3950, loss[loss=0.1858, simple_loss=0.2687, pruned_loss=0.05141, over 24361.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2492, pruned_loss=0.04932, over 4705288.97 frames. ], batch size: 77, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:28:02,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=734613.3333333334, ans=0.0 2023-10-02 03:28:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:28:07,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 03:28:07,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:28:10,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:28:10,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.79 vs. limit=22.5 2023-10-02 03:28:12,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:28:16,176 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 03:28:17,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:28:18,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. 
Duration: 0.72 2023-10-02 03:28:19,291 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 03:28:19,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:28:21,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:28:21,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:28:22,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:28:25,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 03:28:26,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:28:28,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:28:28,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:28:28,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:28:29,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 03:28:34,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=734746.6666666666, ans=0.125 2023-10-02 03:28:35,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=734746.6666666666, ans=0.95 2023-10-02 03:28:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:28:40,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:28:40,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734746.6666666666, ans=0.1 2023-10-02 03:28:44,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 03:28:51,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 03:28:51,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 03:28:51,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:28:54,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:29:02,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 03:29:02,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:29:03,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:29:03,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:29:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 03:29:07,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:29:07,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=734880.0, ans=0.1 2023-10-02 03:29:08,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:29:13,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 03:29:13,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.53 vs. limit=15.0 2023-10-02 03:29:17,017 INFO [train.py:1046] (1/4) Epoch 21, batch 4000, loss[loss=0.1721, simple_loss=0.2469, pruned_loss=0.04865, over 23393.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2498, pruned_loss=0.04965, over 4710050.74 frames. 
], batch size: 119, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:29:19,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734946.6666666666, ans=0.1 2023-10-02 03:29:23,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:29,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:35,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:29:35,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:29:36,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:36,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 03:29:36,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=735013.3333333334, ans=0.0 2023-10-02 03:29:38,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:29:38,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 03:29:38,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 03:29:38,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 03:29:41,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:29:44,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:29:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:29:44,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:29:44,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:29:44,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:29:47,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:29:48,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 03:29:50,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:29:50,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:29:53,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-10-02 03:29:54,650 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.771e+02 2.002e+02 2.200e+02 3.082e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 03:29:54,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:29:54,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:29:59,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 03:30:01,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:30:03,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:30:04,624 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 03:30:05,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:30:06,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 03:30:06,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:30:07,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:30:09,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 03:30:09,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=735146.6666666666, ans=0.1 2023-10-02 03:30:10,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:30:10,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:30:10,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:30:13,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 03:30:13,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:30:16,153 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 03:30:19,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:30:22,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 03:30:25,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:30:25,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:30:26,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:30:26,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=735213.3333333334, ans=0.0 2023-10-02 03:30:28,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:30:30,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=735213.3333333334, ans=0.1 2023-10-02 03:30:33,400 INFO [train.py:1046] (1/4) Epoch 21, batch 4050, loss[loss=0.173, simple_loss=0.2454, pruned_loss=0.05032, over 23449.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2516, pruned_loss=0.04989, over 4702573.78 frames. ], batch size: 119, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:30:34,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:30:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:30:36,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.74 vs. limit=10.0 2023-10-02 03:30:37,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 03:30:39,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:30:40,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:30:40,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 03:30:41,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:30:43,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:30:46,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:30:47,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.78 vs. limit=8.0 2023-10-02 03:30:49,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:30:50,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:30:52,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:30:52,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:30:57,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:30:58,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:31:01,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 03:31:02,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. 
Duration: 0.73 2023-10-02 03:31:02,768 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 03:31:04,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:31:12,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 03:31:12,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:31:15,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:31:18,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:31:19,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:31:19,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:31:22,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:31:27,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 03:31:27,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:31:28,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:31:29,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 03:31:32,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:31:35,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=735546.6666666666, ans=0.0 2023-10-02 03:31:40,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 03:31:42,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:31:42,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:31:43,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 03:31:43,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 03:31:43,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:31:46,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=735546.6666666666, ans=0.125 2023-10-02 03:31:47,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:31:48,519 INFO [train.py:1046] (1/4) Epoch 21, batch 4100, loss[loss=0.1488, simple_loss=0.2255, pruned_loss=0.03608, over 24322.00 frames. 
], tot_loss[loss=0.176, simple_loss=0.2524, pruned_loss=0.04983, over 4717113.66 frames. ], batch size: 56, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:31:48,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:31:48,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:31:54,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 03:31:57,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 03:31:57,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 03:31:59,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 03:31:59,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:31:59,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=735613.3333333334, ans=0.125 2023-10-02 03:32:00,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:00,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:02,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 03:32:02,119 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 03:32:06,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:32:06,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:32:07,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:07,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:32:07,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=735680.0, ans=0.1 2023-10-02 03:32:10,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:32:11,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:32:12,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:32:12,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 03:32:13,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:13,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 03:32:14,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:32:14,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:32:15,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 03:32:17,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:18,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 03:32:20,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:32:23,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:32:23,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 03:32:25,650 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.845e+02 2.063e+02 2.238e+02 3.608e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-02 03:32:25,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:32:25,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:32:27,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 03:32:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 03:32:30,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:32:32,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:32:32,542 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:32:34,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 03:32:35,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:35,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:32:36,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=735813.3333333334, ans=0.125 2023-10-02 03:32:40,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:45,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:32:45,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=735813.3333333334, ans=0.0 2023-10-02 03:32:48,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 03:32:50,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:55,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:32:55,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:58,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:32:59,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.74 vs. limit=12.0 2023-10-02 03:33:00,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:33:01,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=735946.6666666666, ans=0.5 2023-10-02 03:33:03,331 INFO [train.py:1046] (1/4) Epoch 21, batch 4150, loss[loss=0.1556, simple_loss=0.2347, pruned_loss=0.03823, over 24326.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2524, pruned_loss=0.05027, over 4716848.87 frames. ], batch size: 56, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:33:04,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:33:06,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:33:08,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:33:08,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:33:11,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 03:33:11,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:33:12,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 03:33:12,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 03:33:14,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 03:33:14,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:33:18,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:33:18,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:33:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:33:23,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:33:24,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 03:33:26,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:33:26,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:33:27,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:33:27,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=736013.3333333334, ans=0.125 2023-10-02 03:33:31,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:33:34,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:33:37,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 03:33:39,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 03:33:39,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:33:40,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=736080.0, ans=0.125 2023-10-02 03:33:43,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 03:33:43,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 03:33:43,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:33:44,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:33:45,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:33:50,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 03:33:53,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:33:54,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:33:56,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 03:33:56,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:33:57,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 03:33:59,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=736146.6666666666, ans=0.07 2023-10-02 03:34:00,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:34:00,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:34:02,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:03,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 03:34:03,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:03,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:34:04,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=736213.3333333334, ans=0.1 2023-10-02 03:34:05,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:34:07,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 03:34:07,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:07,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:34:07,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:34:09,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 03:34:09,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:34:09,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:34:11,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:34:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:11,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 03:34:12,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:34:14,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.51 vs. limit=15.0 2023-10-02 03:34:16,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:34:18,535 INFO [train.py:1046] (1/4) Epoch 21, batch 4200, loss[loss=0.1891, simple_loss=0.2539, pruned_loss=0.0621, over 23812.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2507, pruned_loss=0.04999, over 4717977.52 frames. ], batch size: 195, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:34:18,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 03:34:18,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=736280.0, ans=0.0 2023-10-02 03:34:20,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 03:34:22,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:34:24,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:34:24,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:34:24,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:34:27,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 03:34:29,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 03:34:31,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:32,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:34:34,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-10-02 03:34:35,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:34:39,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. 
Duration: 0.588875 2023-10-02 03:34:39,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:34:39,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:42,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 03:34:42,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:34:43,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:44,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:34:44,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:34:46,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:34:47,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 03:34:47,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:52,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:34:52,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 03:34:54,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:34:55,340 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.914e+02 2.073e+02 2.283e+02 3.336e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-02 03:34:55,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=736413.3333333334, ans=0.95 2023-10-02 03:34:56,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:34:58,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:34:58,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 03:34:58,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:34:59,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:35:01,743 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:35:02,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:35:04,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 03:35:10,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=736480.0, ans=0.125 2023-10-02 03:35:13,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:35:14,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 03:35:17,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:35:23,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:35:23,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:25,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 03:35:25,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=736546.6666666666, ans=0.125 2023-10-02 03:35:30,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:35:33,678 INFO [train.py:1046] (1/4) Epoch 21, batch 4250, loss[loss=0.1752, simple_loss=0.261, pruned_loss=0.04465, over 24010.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2501, pruned_loss=0.04954, over 4711789.73 frames. ], batch size: 80, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:35:35,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 03:35:35,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:35:39,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:44,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:35:44,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 03:35:44,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:35:49,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:53,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:35:56,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:35:56,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:35:57,237 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=15.0 2023-10-02 03:35:59,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 03:35:59,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:36:00,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:02,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:03,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:36:05,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:06,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 03:36:09,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 03:36:09,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:09,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:36:11,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:11,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 03:36:11,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:11,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:16,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:36:17,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:36:21,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:36:23,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:25,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 03:36:25,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:36:25,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 03:36:26,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:36:26,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=736813.3333333334, ans=0.125 2023-10-02 03:36:28,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 03:36:29,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:29,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:36:32,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 03:36:33,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:36:35,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:36:39,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:42,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:44,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:36:44,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:36:45,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:36:47,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:36:49,218 INFO [train.py:1046] (1/4) Epoch 21, batch 4300, loss[loss=0.1729, simple_loss=0.2632, pruned_loss=0.04131, over 24302.00 frames. 
], tot_loss[loss=0.1738, simple_loss=0.2491, pruned_loss=0.0493, over 4720132.61 frames. ], batch size: 74, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:36:49,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:36:49,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 03:36:51,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:36:52,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=736946.6666666666, ans=0.2 2023-10-02 03:36:54,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:36:56,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:37:00,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:37:02,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=737013.3333333334, ans=0.035 2023-10-02 03:37:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:37:09,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 03:37:09,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 03:37:12,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:37:12,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:37:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 03:37:14,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=737013.3333333334, ans=0.125 2023-10-02 03:37:15,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:37:15,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=737013.3333333334, ans=0.125 2023-10-02 03:37:16,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:37:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 03:37:20,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:37:21,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 03:37:22,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:37:24,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 03:37:25,612 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.828e+02 2.109e+02 2.471e+02 4.007e+02, threshold=4.219e+02, percent-clipped=0.0 2023-10-02 03:37:27,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:37:27,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:37:29,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:37:30,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:37:30,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:37:31,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 03:37:32,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 03:37:34,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:37:37,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:37,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:37:37,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:37:37,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:37:37,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 03:37:38,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 03:37:38,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 03:37:40,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:37:40,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 03:37:41,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 03:37:44,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:37:46,121 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 03:37:46,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:37:49,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:37:49,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:37:50,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 03:37:51,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=737213.3333333334, ans=0.0 2023-10-02 03:37:52,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:37:52,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:54,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:37:54,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:37:55,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:37:56,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:37:56,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=737213.3333333334, ans=0.0 2023-10-02 03:37:59,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:00,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:38:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:38:02,702 INFO [train.py:1046] (1/4) Epoch 21, batch 4350, loss[loss=0.1531, simple_loss=0.2365, pruned_loss=0.03485, over 24470.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2497, pruned_loss=0.04926, over 4702886.68 frames. ], batch size: 63, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:38:04,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=737280.0, ans=0.015 2023-10-02 03:38:04,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=737280.0, ans=0.125 2023-10-02 03:38:05,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 03:38:05,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:38:07,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=737280.0, ans=0.0 2023-10-02 03:38:09,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:38:13,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:16,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:38:16,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:38:21,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:38:25,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:27,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:38:28,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:38:31,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:38:33,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:38:33,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:38:37,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 03:38:38,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:38:38,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:44,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:47,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. 
Duration: 0.99 2023-10-02 03:38:47,958 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:38:51,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:38:52,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:38:57,152 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 03:38:59,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:38:59,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:39:00,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=737480.0, ans=0.1 2023-10-02 03:39:01,196 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 03:39:01,260 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 03:39:01,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:39:02,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:02,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 03:39:04,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:05,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:39:05,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:39:08,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 03:39:08,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:39:08,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 03:39:11,436 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 03:39:11,441 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 03:39:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 03:39:12,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:39:14,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:39:14,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:14,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:39:17,364 INFO [train.py:1046] (1/4) Epoch 21, batch 4400, loss[loss=0.1811, simple_loss=0.25, pruned_loss=0.05606, over 23798.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2498, pruned_loss=0.04884, over 4718822.75 frames. ], batch size: 195, lr: 4.86e-03, grad_scale: 16.0 2023-10-02 03:39:17,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 03:39:18,705 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 03:39:18,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:23,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:39:23,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:25,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:39:25,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=737613.3333333334, ans=0.125 2023-10-02 03:39:26,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 03:39:26,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 03:39:27,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 03:39:27,995 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 03:39:28,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:39:29,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:39:29,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=737613.3333333334, ans=0.0 2023-10-02 03:39:30,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.85 vs. limit=22.5 2023-10-02 03:39:30,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 03:39:32,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:34,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:39:34,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 03:39:38,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:38,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 03:39:38,311 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 03:39:41,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 03:39:42,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 03:39:42,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 03:39:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:42,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:43,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:43,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:39:45,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. 
Duration: 0.71 2023-10-02 03:39:45,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 03:39:45,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=737746.6666666666, ans=0.1 2023-10-02 03:39:47,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:48,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:39:48,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:50,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=737746.6666666666, ans=0.05 2023-10-02 03:39:51,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:51,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:52,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 03:39:53,346 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 03:39:55,156 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.905e+02 2.159e+02 2.393e+02 3.383e+02, threshold=4.317e+02, percent-clipped=0.0 2023-10-02 03:39:56,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:40:04,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:40:05,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 03:40:09,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:40:11,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:40:15,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:40:15,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 03:40:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:40:16,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:40:16,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:40:16,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:40:20,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 03:40:23,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-10-02 03:40:26,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 03:40:26,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:40:26,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 03:40:26,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:40:26,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=737880.0, ans=0.125 2023-10-02 03:40:27,281 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-10-02 03:40:28,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.95 vs. limit=15.0 2023-10-02 03:40:29,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:40:29,562 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:40:30,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. 
Duration: 0.9 2023-10-02 03:40:30,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=737946.6666666666, ans=0.0 2023-10-02 03:40:30,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=737946.6666666666, ans=0.0 2023-10-02 03:40:31,976 INFO [train.py:1046] (1/4) Epoch 21, batch 4450, loss[loss=0.1716, simple_loss=0.2487, pruned_loss=0.04727, over 23247.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.251, pruned_loss=0.04932, over 4712111.59 frames. ], batch size: 119, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:40:33,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:40:33,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=737946.6666666666, ans=0.07 2023-10-02 03:40:34,239 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.57 vs. limit=15.0 2023-10-02 03:40:36,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:36,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:40:41,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:40:41,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:40:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:40:47,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:40:49,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:40:49,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:40:52,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 03:40:52,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:40:53,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:54,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:40:54,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:40:56,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:41:01,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:01,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 03:41:02,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=738080.0, ans=0.0 2023-10-02 03:41:02,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=738080.0, ans=0.125 2023-10-02 03:41:03,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:41:04,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:41:06,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:41:07,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=738080.0, ans=0.0 2023-10-02 03:41:07,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=738080.0, ans=0.0 2023-10-02 03:41:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 03:41:11,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 03:41:13,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 03:41:13,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:41:14,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:41:15,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 03:41:18,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:41:21,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:23,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 03:41:23,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=738146.6666666666, ans=0.05 2023-10-02 03:41:25,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:25,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:41:25,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:41:25,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:41:26,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:31,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:41:31,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. 
Duration: 0.94 2023-10-02 03:41:32,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:41:34,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:41:36,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:41:36,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=738213.3333333334, ans=0.125 2023-10-02 03:41:37,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:37,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:41:40,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:41:41,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 03:41:43,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:41:46,106 INFO [train.py:1046] (1/4) Epoch 21, batch 4500, loss[loss=0.1729, simple_loss=0.2456, pruned_loss=0.05013, over 24464.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2508, pruned_loss=0.04991, over 4713490.28 frames. 
], batch size: 58, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:41:46,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=738280.0, ans=0.0 2023-10-02 03:41:47,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:41:48,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 03:41:48,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 03:41:51,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:41:56,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:56,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738280.0, ans=0.1 2023-10-02 03:41:57,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:41:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:42:00,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:42:01,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:01,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:42:11,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:42:13,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:42:15,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:42:17,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:42:17,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:42:21,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:42:24,906 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.913e+02 2.114e+02 2.419e+02 4.024e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-02 03:42:25,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:42:29,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:42:33,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:42:34,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. 
Duration: 0.98 2023-10-02 03:42:35,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:35,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:42:39,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:42:39,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:42:40,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:40,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 03:42:40,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:42:40,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:44,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:42:44,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:42:47,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:48,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 03:42:48,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:42:51,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 03:42:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 03:42:53,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 03:42:56,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 03:42:56,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738546.6666666666, ans=0.1 2023-10-02 03:43:00,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 03:43:01,962 INFO [train.py:1046] (1/4) Epoch 21, batch 4550, loss[loss=0.154, simple_loss=0.2284, pruned_loss=0.0398, over 24318.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.25, pruned_loss=0.04966, over 4697829.31 frames. ], batch size: 56, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:43:02,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:43:06,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:43:06,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:43:08,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:10,948 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:43:13,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:43:15,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:43:16,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:16,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:43:16,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:20,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:20,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:43:24,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:43:26,156 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.57 vs. limit=22.5 2023-10-02 03:43:26,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. 
Duration: 0.85 2023-10-02 03:43:26,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 03:43:28,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:43:30,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 03:43:34,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 03:43:35,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:43:36,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 03:43:37,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:43:40,362 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.06 vs. limit=15.0 2023-10-02 03:43:40,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:40,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:40,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:43:43,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. 
Duration: 0.77 2023-10-02 03:43:45,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:43:45,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=738813.3333333334, ans=0.04949747468305833 2023-10-02 03:43:45,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=738813.3333333334, ans=0.125 2023-10-02 03:43:46,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:48,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:43:49,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:52,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 03:43:52,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 03:43:52,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:43:54,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 03:43:54,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=738813.3333333334, ans=0.0 2023-10-02 03:43:55,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. 
Duration: 0.54 2023-10-02 03:43:56,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:56,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:56,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:43:58,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:58,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:44:00,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:44:02,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 03:44:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:44:03,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 03:44:03,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 03:44:03,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:44:04,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. 
Duration: 0.96 2023-10-02 03:44:07,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:44:07,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:44:10,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:44:10,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=738880.0, ans=0.0 2023-10-02 03:44:11,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:44:11,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:44:13,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:44:15,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:44:15,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=738946.6666666666, ans=0.125 2023-10-02 03:44:16,624 INFO [train.py:1046] (1/4) Epoch 21, batch 4600, loss[loss=0.1439, simple_loss=0.2294, pruned_loss=0.02917, over 22466.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2489, pruned_loss=0.0492, over 4700169.26 frames. ], batch size: 49, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:44:18,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:44:18,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:44:20,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:44:20,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:44:22,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:23,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 03:44:24,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=738946.6666666666, ans=0.1 2023-10-02 03:44:26,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:44:30,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:44:30,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:33,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:39,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 03:44:40,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:44:42,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:42,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=739013.3333333334, ans=0.0 2023-10-02 03:44:45,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=739080.0, ans=0.04949747468305833 2023-10-02 03:44:46,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:44:46,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:51,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 03:44:51,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:44:52,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:44:54,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=739080.0, ans=0.125 2023-10-02 03:44:55,817 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.844e+02 1.997e+02 2.333e+02 3.874e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 03:44:58,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:44:58,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:44:58,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=739080.0, ans=0.0 2023-10-02 03:45:00,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:45:01,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=739146.6666666666, ans=0.0 2023-10-02 03:45:02,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=739146.6666666666, ans=0.0 2023-10-02 03:45:06,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 03:45:07,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:45:09,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=739146.6666666666, ans=0.1 2023-10-02 03:45:10,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:12,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:45:14,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=739146.6666666666, ans=0.0 2023-10-02 03:45:15,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:45:15,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 03:45:15,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:16,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 03:45:16,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:16,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:18,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:19,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:45:19,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:21,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 03:45:21,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 03:45:22,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 03:45:22,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:45:22,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:45:23,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:32,098 INFO [train.py:1046] (1/4) Epoch 21, batch 4650, loss[loss=0.165, simple_loss=0.2525, pruned_loss=0.03879, over 24616.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2488, pruned_loss=0.04895, over 4712647.03 frames. ], batch size: 68, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:45:35,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:45:38,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:45:38,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:39,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:45:39,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:39,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:45:39,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:45:40,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=15.0 2023-10-02 03:45:43,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 03:45:46,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:45:47,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 03:45:47,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:45:49,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 03:45:49,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:45:51,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 03:45:51,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 03:45:51,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:51,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:45:55,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 03:45:56,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:45:56,848 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 03:46:00,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:00,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 03:46:03,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:03,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:46:03,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 03:46:06,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:46:09,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:46:12,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:17,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:20,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:46:22,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:23,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:46:26,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 03:46:26,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 03:46:28,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 03:46:28,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 03:46:29,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:36,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:46:36,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:46:38,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 03:46:38,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:39,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:46:39,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:46:41,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:46:43,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:46:43,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:46:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:46,504 INFO [train.py:1046] (1/4) Epoch 21, batch 4700, loss[loss=0.1509, simple_loss=0.2322, pruned_loss=0.03484, over 24601.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04855, over 4714689.49 frames. ], batch size: 60, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:46:48,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:49,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:46:49,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:46:49,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 03:46:50,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 03:46:52,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 03:46:58,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:59,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:59,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:01,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:47:03,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:47:07,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=739680.0, ans=0.125 2023-10-02 03:47:08,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 03:47:08,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 03:47:12,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:13,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:47:13,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:47:16,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:16,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=739746.6666666666, ans=0.1 2023-10-02 03:47:22,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:47:22,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:47:22,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=739746.6666666666, ans=0.0 2023-10-02 03:47:25,319 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.835e+02 1.990e+02 2.333e+02 4.154e+02, threshold=3.980e+02, percent-clipped=1.0 2023-10-02 03:47:25,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:47:31,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 03:47:31,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:47:34,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 03:47:36,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=739813.3333333334, ans=0.05 2023-10-02 03:47:38,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 03:47:40,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:47:44,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:47:46,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 03:47:47,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:47,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:48,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=739880.0, ans=0.125 2023-10-02 03:47:49,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:50,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:47:50,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 03:47:52,203 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. 
Duration: 0.768 2023-10-02 03:47:53,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:55,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:55,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:55,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 03:47:56,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:58,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=739880.0, ans=0.0 2023-10-02 03:47:58,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=739880.0, ans=0.1 2023-10-02 03:48:00,877 INFO [train.py:1046] (1/4) Epoch 21, batch 4750, loss[loss=0.1796, simple_loss=0.2449, pruned_loss=0.0572, over 23413.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2504, pruned_loss=0.04907, over 4697192.07 frames. ], batch size: 285, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:48:00,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 03:48:02,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:48:03,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:48:07,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:07,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:48:08,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 03:48:08,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:10,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=739946.6666666666, ans=0.125 2023-10-02 03:48:13,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 03:48:16,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:48:16,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:48:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:48:21,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 03:48:26,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=740013.3333333334, ans=0.125 2023-10-02 03:48:27,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 03:48:28,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 03:48:28,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:48:33,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:48:33,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:48:33,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:35,102 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 03:48:35,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 03:48:41,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 03:48:43,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:45,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:48:47,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:48:47,892 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. 
Duration: 0.512 2023-10-02 03:48:47,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:48:50,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:48:52,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:48:53,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 03:48:53,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 03:48:54,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:54,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:48:56,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:56,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:48:56,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 03:48:57,478 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.84 vs. limit=5.0 2023-10-02 03:48:59,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. 
Duration: 0.65 2023-10-02 03:49:00,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=740213.3333333334, ans=0.1 2023-10-02 03:49:01,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:02,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.06 vs. limit=12.0 2023-10-02 03:49:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:49:05,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 03:49:05,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:49:05,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:06,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:49:09,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:10,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 03:49:11,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=740213.3333333334, ans=0.125 2023-10-02 03:49:12,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:49:12,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 03:49:13,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 03:49:15,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 03:49:16,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.52 vs. limit=6.0 2023-10-02 03:49:17,015 INFO [train.py:1046] (1/4) Epoch 21, batch 4800, loss[loss=0.1603, simple_loss=0.2402, pruned_loss=0.04018, over 24270.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2514, pruned_loss=0.04949, over 4704550.61 frames. ], batch size: 56, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:49:17,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=740280.0, ans=0.1 2023-10-02 03:49:18,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:49:18,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:49:19,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. 
Duration: 0.85 2023-10-02 03:49:20,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=740280.0, ans=0.2 2023-10-02 03:49:22,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=740280.0, ans=0.125 2023-10-02 03:49:24,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:25,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:29,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:49:31,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:49:31,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:31,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 03:49:34,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:49:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:49:34,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 03:49:36,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=740346.6666666666, ans=0.0 2023-10-02 03:49:37,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:49:39,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:40,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:49:41,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 03:49:41,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:43,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:49:45,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:49,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 03:49:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:49:51,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:49:53,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:55,819 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.875e+02 2.035e+02 2.300e+02 3.149e+02, threshold=4.071e+02, percent-clipped=0.0 2023-10-02 03:49:55,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 03:49:55,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 03:49:57,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:57,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:49:57,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:49:57,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:49:57,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:49:58,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 03:50:00,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:50:03,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:50:07,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:07,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=740480.0, ans=0.2 2023-10-02 03:50:08,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:12,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=740480.0, ans=0.07 2023-10-02 03:50:13,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=740480.0, ans=0.0 2023-10-02 03:50:15,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 03:50:15,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:50:15,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:15,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:50:16,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 03:50:20,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:50:21,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:50:21,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:21,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:50:21,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=740546.6666666666, ans=0.125 2023-10-02 03:50:22,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:50:24,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:50:27,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:27,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:27,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:50:28,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 03:50:31,121 INFO [train.py:1046] (1/4) Epoch 21, batch 4850, loss[loss=0.1612, simple_loss=0.2518, pruned_loss=0.0353, over 24629.00 frames. 
], tot_loss[loss=0.1753, simple_loss=0.2514, pruned_loss=0.04957, over 4711217.45 frames. ], batch size: 68, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:50:31,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 03:50:31,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:50:31,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:50:31,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:50:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:34,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:50:34,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=740613.3333333334, ans=0.1 2023-10-02 03:50:43,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 03:50:45,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:48,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=740680.0, ans=0.5 2023-10-02 03:50:49,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 03:50:49,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:50:50,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:54,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:54,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=740680.0, ans=0.125 2023-10-02 03:50:55,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:50:56,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:50:56,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 03:51:01,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:51:02,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:51:02,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:51:03,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:51:03,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-10-02 03:51:07,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:51:07,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:10,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:10,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 03:51:10,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 03:51:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:51:11,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=740746.6666666666, ans=0.125 2023-10-02 03:51:16,458 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.17 vs. limit=22.5 2023-10-02 03:51:19,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:51:20,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.45 vs. limit=22.5 2023-10-02 03:51:21,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-10-02 03:51:21,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:51:21,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:51:21,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=740813.3333333334, ans=0.0 2023-10-02 03:51:22,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:51:25,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 03:51:25,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:27,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 03:51:27,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:51:28,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:51:28,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 03:51:38,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:51:41,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.77 vs. limit=6.0 2023-10-02 03:51:43,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:51:43,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:51:44,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=740880.0, ans=0.05 2023-10-02 03:51:46,617 INFO [train.py:1046] (1/4) Epoch 21, batch 4900, loss[loss=0.1785, simple_loss=0.2481, pruned_loss=0.05444, over 23667.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2504, pruned_loss=0.04928, over 4705509.04 frames. ], batch size: 149, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:51:48,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 03:51:48,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:51:52,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:51:53,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:51:53,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:51:57,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. 
Duration: 0.99 2023-10-02 03:51:57,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=740946.6666666666, ans=0.0 2023-10-02 03:52:01,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 03:52:04,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 03:52:05,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 03:52:05,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:52:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:52:05,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:52:05,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:52:05,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:52:07,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 03:52:09,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=741013.3333333334, ans=0.1 2023-10-02 03:52:12,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-10-02 03:52:14,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:52:15,217 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.83 vs. limit=22.5 2023-10-02 03:52:15,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:52:17,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:52:18,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:52:20,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:52:20,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:52:20,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 03:52:22,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:52:23,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:52:23,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 03:52:23,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-10-02 03:52:25,622 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.942e+02 2.183e+02 2.601e+02 5.042e+02, threshold=4.365e+02, percent-clipped=7.0 2023-10-02 03:52:27,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 03:52:29,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:52:31,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:52:31,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:52:33,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:52:33,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 03:52:33,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:52:33,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 03:52:34,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=741146.6666666666, ans=0.5 2023-10-02 03:52:35,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:52:37,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:52:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:52:42,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 03:52:44,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:52:44,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 03:52:45,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 03:52:53,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:52:54,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:52:56,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 03:52:56,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:52:56,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:52:57,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:53:01,703 INFO [train.py:1046] (1/4) Epoch 21, batch 4950, loss[loss=0.1758, simple_loss=0.2617, pruned_loss=0.04493, over 24519.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2495, pruned_loss=0.04863, over 4704573.79 frames. ], batch size: 71, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:53:01,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:53:01,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:53:03,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:53:03,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 03:53:05,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:53:07,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=15.0 2023-10-02 03:53:07,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:53:07,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:53:11,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 03:53:11,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. 
Duration: 0.8 2023-10-02 03:53:11,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:53:12,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 03:53:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:12,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:53:14,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:53:14,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:16,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:17,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:53:19,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:53:21,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:53:22,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:22,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:53:25,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:53:29,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:29,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=741346.6666666666, ans=0.1 2023-10-02 03:53:30,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:53:32,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:33,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:35,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:53:35,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 03:53:35,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 03:53:38,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:41,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:53:41,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 03:53:43,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:53:43,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:53:44,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:53:47,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:49,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:53:51,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:53:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:53,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:54,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 03:53:54,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:53:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:53:58,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 03:54:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:54:00,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:54:00,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:54:00,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:54:01,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=741546.6666666666, ans=0.2 2023-10-02 03:54:02,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:54:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:54:05,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:54:05,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:54:06,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 03:54:11,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:15,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. 
Duration: 0.78 2023-10-02 03:54:15,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:54:18,269 INFO [train.py:1046] (1/4) Epoch 21, batch 5000, loss[loss=0.1844, simple_loss=0.2353, pruned_loss=0.06673, over 19196.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2491, pruned_loss=0.04927, over 4699751.93 frames. ], batch size: 388, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:54:22,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:54:22,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:54:23,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 03:54:25,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 03:54:26,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:54:29,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 03:54:29,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:54:30,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:54:30,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. 
Duration: 0.94 2023-10-02 03:54:32,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:54:32,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:54:33,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 03:54:33,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:33,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:54:34,250 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.42 vs. limit=6.0 2023-10-02 03:54:36,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 03:54:38,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 03:54:38,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:54:38,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 03:54:38,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:54:39,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 03:54:40,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:54:40,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 03:54:40,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 03:54:43,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 03:54:43,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:54:45,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:45,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 03:54:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:54:47,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:47,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=741746.6666666666, ans=0.2 2023-10-02 03:54:49,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:50,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 03:54:51,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 03:54:53,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:54:53,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:54:55,823 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.891e+02 2.105e+02 2.416e+02 3.579e+02, threshold=4.211e+02, percent-clipped=0.0 2023-10-02 03:54:57,360 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 03:55:00,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:55:01,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:55:01,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:04,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 03:55:05,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:55:05,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:55:05,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:55:08,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 03:55:08,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:55:10,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:55:11,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:55:17,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 03:55:21,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:30,914 INFO [train.py:1046] (1/4) Epoch 21, batch 5050, loss[loss=0.1776, simple_loss=0.2535, pruned_loss=0.0508, over 23750.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.25, pruned_loss=0.04917, over 4713592.78 frames. ], batch size: 149, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:55:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:55:32,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:32,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:55:32,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 03:55:32,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:55:33,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:55:33,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:38,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:38,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 03:55:40,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:55:42,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:55:44,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:55:44,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 03:55:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:55:47,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:55:48,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 03:55:48,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:55:48,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=742013.3333333334, ans=0.2 2023-10-02 03:55:50,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:55:59,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 03:56:01,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:56:01,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:56:02,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 03:56:03,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:56:03,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:04,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:04,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:56:04,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. 
Duration: 0.94 2023-10-02 03:56:04,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=742080.0, ans=0.0 2023-10-02 03:56:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 03:56:06,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:09,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:12,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:12,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 03:56:15,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:56:18,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 03:56:19,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:56:19,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:56:19,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:56:19,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 03:56:21,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:56:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:56:25,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:25,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:56:25,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:56:27,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 03:56:28,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:56:29,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:56:33,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:56:33,901 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 03:56:33,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 03:56:34,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=742213.3333333334, ans=0.04949747468305833 2023-10-02 03:56:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:56:35,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:35,315 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 03:56:36,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:36,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 03:56:36,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:40,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:56:42,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:42,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 03:56:44,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 03:56:45,285 INFO [train.py:1046] (1/4) Epoch 21, batch 5100, loss[loss=0.1745, simple_loss=0.26, pruned_loss=0.04456, over 24383.00 frames. 
], tot_loss[loss=0.1748, simple_loss=0.2509, pruned_loss=0.0494, over 4705177.25 frames. ], batch size: 77, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:56:46,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.52 vs. limit=22.5 2023-10-02 03:56:46,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:46,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:56:46,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:56:49,683 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 03:56:52,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:54,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 03:56:55,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 03:56:55,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:57,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 03:56:59,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=742346.6666666666, ans=0.04949747468305833 2023-10-02 03:57:00,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:57:00,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 03:57:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 03:57:04,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:57:06,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:57:06,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=742346.6666666666, ans=0.0 2023-10-02 03:57:06,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742346.6666666666, ans=0.1 2023-10-02 03:57:08,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:57:12,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 03:57:13,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:57:14,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 03:57:14,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:57:16,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:19,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:19,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 03:57:19,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742413.3333333334, ans=0.1 2023-10-02 03:57:20,592 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 03:57:20,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:21,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 03:57:21,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 03:57:23,361 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.859e+02 2.079e+02 2.348e+02 3.705e+02, threshold=4.159e+02, percent-clipped=0.0 2023-10-02 03:57:25,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.97 vs. limit=6.0 2023-10-02 03:57:25,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 03:57:29,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.34 vs. limit=22.5 2023-10-02 03:57:35,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:57:38,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 03:57:38,326 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 03:57:38,333 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 03:57:39,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 03:57:39,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:41,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 03:57:45,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 03:57:46,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=742546.6666666666, ans=0.125 2023-10-02 03:57:47,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 03:57:47,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=742546.6666666666, ans=0.125 2023-10-02 03:57:48,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:57:51,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 03:57:52,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 03:57:53,381 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=15.0 2023-10-02 03:57:54,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 03:57:54,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=742546.6666666666, ans=0.09899494936611666 2023-10-02 03:57:59,334 INFO [train.py:1046] (1/4) Epoch 21, batch 5150, loss[loss=0.1515, simple_loss=0.2248, pruned_loss=0.0391, over 24476.00 frames. ], tot_loss[loss=0.175, simple_loss=0.251, pruned_loss=0.04945, over 4712427.95 frames. ], batch size: 58, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:57:59,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:57:59,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:57:59,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 03:58:00,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:58:02,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:58:03,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:58:04,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 03:58:04,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 03:58:04,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 03:58:04,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:58:06,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 03:58:06,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:06,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 03:58:07,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:09,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 03:58:12,540 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.24 vs. limit=15.0 2023-10-02 03:58:14,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:58:14,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 03:58:16,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:16,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:58:19,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:58:19,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:58:19,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:58:19,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:58:19,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:58:19,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 03:58:22,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 03:58:22,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:58:25,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:58:27,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 03:58:27,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:58:33,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:58:33,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 03:58:38,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:58:40,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=12.0 2023-10-02 03:58:46,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:58:46,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:49,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:58:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 03:58:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 03:58:57,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:57,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=742880.0, ans=0.125 2023-10-02 03:58:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:58:58,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:59:00,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:01,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:59:03,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 03:59:08,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:59:09,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:59:10,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:59:10,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 03:59:10,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742880.0, ans=0.1 2023-10-02 03:59:11,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:59:11,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:59:11,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:59:13,704 INFO [train.py:1046] (1/4) Epoch 21, batch 5200, loss[loss=0.1776, simple_loss=0.2526, pruned_loss=0.05132, over 23217.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2512, pruned_loss=0.04891, over 4702783.69 frames. ], batch size: 105, lr: 4.84e-03, grad_scale: 32.0 2023-10-02 03:59:13,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:59:16,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:59:18,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:59:20,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:23,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 03:59:24,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 03:59:26,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:27,223 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.62 vs. limit=6.0 2023-10-02 03:59:28,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:29,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:59:29,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:31,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 03:59:34,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=743013.3333333334, ans=0.5 2023-10-02 03:59:35,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:59:35,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:38,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 03:59:40,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 03:59:41,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:59:42,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 03:59:42,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 03:59:44,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 03:59:46,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:46,096 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 03:59:46,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:47,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:59:47,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:59:48,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 03:59:48,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 03:59:51,337 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.817e+02 2.050e+02 2.419e+02 3.713e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 03:59:51,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:53,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 03:59:53,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 03:59:54,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 04:00:00,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 04:00:00,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:00:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:00:07,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:08,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 04:00:08,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:00:09,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-02 04:00:09,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:09,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:00:12,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:00:14,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:00:18,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:00:20,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:20,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:24,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 04:00:25,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:00:25,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:00:27,644 INFO [train.py:1046] (1/4) Epoch 21, batch 5250, loss[loss=0.1725, simple_loss=0.2384, pruned_loss=0.05337, over 23796.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2504, pruned_loss=0.04897, over 4698294.86 frames. ], batch size: 212, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 04:00:27,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:28,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=743280.0, ans=0.0 2023-10-02 04:00:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:00:29,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:00:32,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:00:34,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:35,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:00:36,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:00:41,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:42,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 04:00:44,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:00:47,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:00:48,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 04:00:48,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:50,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:53,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=743346.6666666666, ans=0.125 2023-10-02 04:00:53,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.15 vs. limit=22.5 2023-10-02 04:00:53,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.90 vs. 
limit=10.0 2023-10-02 04:01:09,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=743480.0, ans=0.125 2023-10-02 04:01:30,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=743546.6666666666, ans=0.2 2023-10-02 04:01:32,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=743546.6666666666, ans=0.125 2023-10-02 04:01:33,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=743546.6666666666, ans=0.1 2023-10-02 04:01:37,132 INFO [train.py:1046] (1/4) Epoch 21, batch 5300, loss[loss=0.1832, simple_loss=0.271, pruned_loss=0.04763, over 24431.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2485, pruned_loss=0.04901, over 4692094.24 frames. ], batch size: 69, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 04:01:42,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=743613.3333333334, ans=0.125 2023-10-02 04:01:51,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:01:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 04:01:51,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 04:01:51,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:51,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 04:01:51,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:51,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:51,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:51,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:01:51,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:51,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:01:52,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:01:52,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 04:01:52,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 04:01:52,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 04:01:52,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:01:52,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. 
Duration: 0.83 2023-10-02 04:01:52,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 04:01:52,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:53,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:53,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:53,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:01:53,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:01:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:01:53,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:53,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:53,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:53,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:53,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 04:01:53,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:53,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:01:54,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 04:01:54,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:01:54,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:54,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 04:01:54,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 04:01:55,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:01:55,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:01:55,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 04:01:55,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 04:01:55,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. 
Duration: 0.563625 2023-10-02 04:01:55,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:01:55,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:01:55,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 04:01:55,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 04:01:56,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:01:56,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:56,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 04:01:56,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 04:01:56,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 04:01:56,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:02:03,125 INFO [train.py:1046] (1/4) Epoch 22, batch 0, loss[loss=0.19, simple_loss=0.2714, pruned_loss=0.05427, over 23277.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2714, pruned_loss=0.05427, over 23277.00 frames. 
], batch size: 93, lr: 4.73e-03, grad_scale: 32.0 2023-10-02 04:02:03,125 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 04:02:16,027 INFO [train.py:1078] (1/4) Epoch 22, validation: loss=0.3002, simple_loss=0.2661, pruned_loss=0.1671, over 1125622.00 frames. 2023-10-02 04:02:16,028 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20364MB 2023-10-02 04:02:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 04:02:19,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:02:20,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:02:20,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=743693.3333333334, ans=0.125 2023-10-02 04:02:25,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:25,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:02:26,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:26,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 04:02:28,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-10-02 04:02:31,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:31,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:34,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:34,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:35,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:02:35,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:02:38,931 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 2.142e+02 2.497e+02 3.157e+02 5.918e+02, threshold=4.995e+02, percent-clipped=12.0 2023-10-02 04:02:39,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 04:02:40,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:02:48,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:02:48,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:02:48,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=743826.6666666666, ans=0.1 2023-10-02 04:02:49,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 04:02:55,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:02:55,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:02:58,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:01,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:03:03,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=743893.3333333334, ans=0.0 2023-10-02 04:03:05,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=743893.3333333334, ans=0.125 2023-10-02 04:03:06,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:10,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 04:03:14,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 04:03:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 04:03:15,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:15,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:03:16,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:03:18,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 04:03:19,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:22,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=743960.0, ans=0.1 2023-10-02 04:03:27,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:03:29,124 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 04:03:30,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:03:31,747 INFO [train.py:1046] (1/4) Epoch 22, batch 50, loss[loss=0.1494, simple_loss=0.2381, pruned_loss=0.0303, over 24447.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2538, pruned_loss=0.04792, over 1065391.94 frames. 
], batch size: 66, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:03:35,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:03:36,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:03:36,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 04:03:36,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:03:36,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:03:39,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:03:39,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:03:42,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:03:44,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=744026.6666666666, ans=0.125 2023-10-02 04:03:46,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 04:03:46,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:03:52,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:03:54,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 04:03:55,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 04:03:57,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:04:00,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:00,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:04:00,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:04:01,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:04:03,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 04:04:03,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:04:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:04:09,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=744160.0, ans=0.07 2023-10-02 04:04:12,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:12,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:04:12,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 04:04:14,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:04:16,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:04:16,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 04:04:16,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:04:17,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 04:04:25,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:04:25,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:04:26,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 04:04:30,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:30,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:04:31,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 04:04:31,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 04:04:33,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:33,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:04:34,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:04:36,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:04:37,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 04:04:37,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 04:04:38,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 04:04:41,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:04:41,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:04:41,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 04:04:41,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 04:04:43,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:04:43,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:43,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.63 vs. limit=12.0 2023-10-02 04:04:44,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:04:45,988 INFO [train.py:1046] (1/4) Epoch 22, batch 100, loss[loss=0.1855, simple_loss=0.2523, pruned_loss=0.05939, over 23796.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2515, pruned_loss=0.04789, over 1888679.60 frames. ], batch size: 164, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:04:46,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:04:49,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:04:49,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=744360.0, ans=0.09899494936611666 2023-10-02 04:04:52,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:04:54,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:04:55,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 04:04:55,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:59,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:04:59,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:04:59,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:59,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:59,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:05:01,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 04:05:03,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 04:05:04,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:04,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:04,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:05:04,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:04,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:06,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:08,460 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.848e+02 2.160e+02 2.576e+02 4.696e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-02 04:05:08,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 04:05:08,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:09,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:11,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 04:05:13,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:05:17,229 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 04:05:17,256 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 04:05:17,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=744493.3333333334, ans=0.125 2023-10-02 04:05:18,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:05:18,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:05:22,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:05:23,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:23,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:27,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=744493.3333333334, ans=10.0 2023-10-02 04:05:29,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:31,013 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-10-02 04:05:33,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:05:36,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=744560.0, ans=0.125 2023-10-02 04:05:37,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:05:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:05:39,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:43,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:46,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:05:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:05:47,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744626.6666666666, ans=0.1 2023-10-02 04:05:50,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:50,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:05:50,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=744626.6666666666, ans=0.5 2023-10-02 04:05:53,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:53,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:05:53,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:54,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 04:05:54,943 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 04:05:54,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:56,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:05:56,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:05:56,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:05:56,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 04:05:56,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 04:05:56,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:05:58,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:05:58,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=744626.6666666666, ans=0.0 2023-10-02 04:05:59,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:59,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:00,984 INFO [train.py:1046] (1/4) Epoch 22, batch 150, loss[loss=0.1806, simple_loss=0.2695, pruned_loss=0.04586, over 24290.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2516, pruned_loss=0.04828, over 2513964.24 frames. ], batch size: 74, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:06:01,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:06:02,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:06:05,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:08,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:06:08,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:06:08,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:11,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:06:11,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:11,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=744693.3333333334, ans=0.125 2023-10-02 04:06:14,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:06:15,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:20,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 04:06:20,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 04:06:20,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 04:06:23,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:06:23,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:06:24,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 04:06:26,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:06:26,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:06:26,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:26,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:27,749 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 04:06:30,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:06:35,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:38,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:06:39,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 04:06:43,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:06:44,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:06:44,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=744893.3333333334, ans=0.0 2023-10-02 04:06:45,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:06:46,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:06:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:06:50,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:06:50,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:50,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 04:06:54,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:55,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:06:55,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:06:55,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:06:59,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:07:00,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 04:07:01,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:07:02,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=744960.0, ans=0.0 2023-10-02 04:07:03,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:07:05,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:07,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:07:08,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 04:07:08,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:07:08,498 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 04:07:12,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:07:15,455 INFO [train.py:1046] (1/4) Epoch 22, batch 200, loss[loss=0.1603, simple_loss=0.2354, pruned_loss=0.04257, over 24441.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2539, pruned_loss=0.04961, over 3002064.26 frames. 
], batch size: 58, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:07:15,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:07:15,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:07:17,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=745026.6666666666, ans=0.125 2023-10-02 04:07:20,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 04:07:20,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:21,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:23,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 04:07:23,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=745026.6666666666, ans=0.0 2023-10-02 04:07:24,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:07:25,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:27,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:07:31,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:07:31,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:07:31,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:40,074 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.856e+02 2.062e+02 2.415e+02 5.556e+02, threshold=4.124e+02, percent-clipped=1.0 2023-10-02 04:07:50,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:07:51,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:07:52,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:07:53,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:07:53,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=745160.0, ans=0.125 2023-10-02 04:07:53,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=745160.0, ans=0.1 2023-10-02 04:07:54,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 04:07:54,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:07:54,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:56,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:07:57,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:57,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:07:59,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 04:07:59,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:07:59,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:05,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:08:05,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=745226.6666666666, ans=0.125 2023-10-02 04:08:10,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:08:16,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:08:16,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:08:25,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:28,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 04:08:28,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:28,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:08:29,823 INFO [train.py:1046] (1/4) Epoch 22, batch 250, loss[loss=0.1703, simple_loss=0.2583, pruned_loss=0.04111, over 24675.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2535, pruned_loss=0.04981, over 3388663.49 frames. ], batch size: 73, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:08:29,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:08:29,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:08:31,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 04:08:31,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:08:31,445 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-02 04:08:33,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:36,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:08:36,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:38,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:39,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:08:40,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:42,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:08:45,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:08:54,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=745426.6666666666, ans=0.09899494936611666 2023-10-02 04:08:57,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:09:00,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:09:00,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:09:06,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:09:06,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:09:08,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:09:08,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:09:10,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:09:10,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:09:10,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:09:15,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:09:18,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 04:09:18,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:09:18,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=745560.0, ans=0.125 2023-10-02 04:09:19,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:09:19,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:09:19,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:09:19,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:09:20,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:09:21,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:09:22,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:24,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:09:25,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:09:27,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 04:09:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:34,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:09:39,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:09:41,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:09:42,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=745626.6666666666, ans=0.2 2023-10-02 04:09:45,226 INFO [train.py:1046] (1/4) Epoch 22, batch 300, loss[loss=0.1715, simple_loss=0.2561, pruned_loss=0.04342, over 24670.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2511, pruned_loss=0.04927, over 3668220.51 frames. ], batch size: 65, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:09:45,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 04:09:45,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:09:45,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:09:48,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 04:09:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 04:09:50,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:09:50,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 04:09:54,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:55,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:09:59,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:10:00,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 04:10:01,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:10:03,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:10:03,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 04:10:03,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:08,535 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.896e+02 2.196e+02 2.526e+02 3.714e+02, threshold=4.393e+02, percent-clipped=0.0 2023-10-02 04:10:08,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 04:10:11,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:10:11,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 04:10:12,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=745760.0, ans=0.125 2023-10-02 04:10:12,416 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.55 vs. limit=15.0 2023-10-02 04:10:15,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 04:10:15,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:15,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=745826.6666666666, ans=0.1 2023-10-02 04:10:15,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=745826.6666666666, ans=0.2 2023-10-02 04:10:18,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:20,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:20,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. 
Duration: 0.66 2023-10-02 04:10:20,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:10:22,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:10:24,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:10:24,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:10:27,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:10:27,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 04:10:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:10:31,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:33,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 04:10:33,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:10:37,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:10:40,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:10:40,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 04:10:45,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:45,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:10:47,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:48,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:10:50,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 04:10:50,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:10:50,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:10:51,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 04:10:52,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:52,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:10:54,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:10:55,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=745960.0, ans=0.125 2023-10-02 04:10:56,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:10:56,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:00,120 INFO [train.py:1046] (1/4) Epoch 22, batch 350, loss[loss=0.1823, simple_loss=0.2593, pruned_loss=0.05259, over 23323.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2488, pruned_loss=0.04881, over 3901340.79 frames. ], batch size: 93, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:11:01,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:01,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 04:11:04,605 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:11:05,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:08,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:11:13,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:13,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:11:16,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 04:11:18,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:18,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 04:11:20,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=746093.3333333334, ans=0.1 2023-10-02 04:11:21,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:21,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 04:11:22,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:11:24,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.03 vs. limit=10.0 2023-10-02 04:11:24,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=746093.3333333334, ans=15.0 2023-10-02 04:11:25,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-10-02 04:11:25,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=746093.3333333334, ans=0.0 2023-10-02 04:11:28,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:11:30,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:11:31,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:11:32,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:11:32,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:11:32,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:11:32,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:32,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:11:33,190 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:11:35,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:11:35,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:11:42,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:11:42,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:11:42,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:11:42,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:48,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 04:11:48,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:11:52,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:53,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 04:11:55,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:11:55,903 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. 
Duration: 0.976 2023-10-02 04:11:57,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 04:11:57,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=746226.6666666666, ans=0.1 2023-10-02 04:11:58,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:01,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:12:01,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 04:12:03,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:05,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:12:08,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:09,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:09,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:12:11,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 04:12:13,738 INFO [train.py:1046] (1/4) Epoch 22, batch 400, loss[loss=0.1701, simple_loss=0.252, pruned_loss=0.04412, over 23649.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2486, pruned_loss=0.04861, over 4084043.11 frames. ], batch size: 85, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:12:13,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:12:15,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:12:17,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 04:12:17,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:18,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:18,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:12:19,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=746360.0, ans=15.0 2023-10-02 04:12:20,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:22,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:24,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:12:25,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 04:12:28,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 04:12:28,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:30,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 04:12:31,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:34,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:12:34,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:12:34,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 04:12:35,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:12:35,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:12:36,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=746426.6666666666, ans=0.125 2023-10-02 04:12:37,214 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.706e+02 1.907e+02 2.140e+02 3.847e+02, threshold=3.815e+02, percent-clipped=0.0 2023-10-02 04:12:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:12:37,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:37,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=746426.6666666666, ans=0.1 2023-10-02 04:12:38,742 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 04:12:40,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 04:12:43,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=746493.3333333334, ans=0.0 2023-10-02 04:12:44,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:46,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:46,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 04:12:49,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-10-02 04:12:50,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=746493.3333333334, ans=0.125 2023-10-02 04:12:52,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:12:53,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:12:56,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=746560.0, ans=0.0 2023-10-02 04:12:59,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 04:13:02,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:13:05,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 04:13:06,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:13:08,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:13:08,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 04:13:10,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:13:13,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 04:13:15,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:13:18,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:18,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 04:13:20,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:13:24,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 04:13:25,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:13:25,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:13:27,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 04:13:27,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:13:28,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:13:28,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=746693.3333333334, ans=0.0 2023-10-02 04:13:30,145 INFO [train.py:1046] (1/4) Epoch 22, batch 450, loss[loss=0.1496, simple_loss=0.2353, pruned_loss=0.032, over 24327.00 frames. 
], tot_loss[loss=0.1733, simple_loss=0.2491, pruned_loss=0.04872, over 4218287.68 frames. ], batch size: 61, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:13:30,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:13:31,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 04:13:33,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:13:33,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:13:34,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:13:34,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 04:13:34,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:13:36,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:13:37,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=746693.3333333334, ans=0.125 2023-10-02 04:13:38,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:13:45,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:13:47,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:13:49,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 04:13:49,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 04:13:52,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:13:55,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:56,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:02,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:14:03,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:14:04,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 04:14:06,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 04:14:08,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 04:14:08,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:14:09,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:10,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:14:11,102 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 04:14:11,110 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 04:14:12,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:14:13,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:14:13,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 04:14:18,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:14:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:14:19,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:14:19,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 04:14:21,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 04:14:23,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:14:23,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:14:23,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=746893.3333333334, ans=0.1 2023-10-02 04:14:24,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.60 vs. limit=15.0 2023-10-02 04:14:25,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 04:14:29,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:14:30,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 04:14:30,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 04:14:31,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:14:36,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:14:39,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:14:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 04:14:40,770 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 04:14:43,475 INFO [train.py:1046] (1/4) Epoch 22, batch 500, loss[loss=0.1724, simple_loss=0.2427, pruned_loss=0.05106, over 23652.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04853, over 4330250.30 frames. ], batch size: 149, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:14:43,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:44,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:14:44,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:45,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 04:14:46,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 04:14:46,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:48,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=747026.6666666666, ans=0.2 2023-10-02 04:14:51,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:14:54,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 04:14:57,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:14:58,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:14:58,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:15:00,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:02,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=747093.3333333334, ans=0.0 2023-10-02 04:15:03,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=747093.3333333334, ans=0.125 2023-10-02 04:15:09,057 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.808e+02 1.987e+02 2.210e+02 3.214e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 04:15:09,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:10,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:15:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:15:10,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:15:11,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 04:15:11,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:15:16,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:15:17,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:15:17,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:15:17,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 04:15:19,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.37 vs. limit=10.0 2023-10-02 04:15:22,837 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:15:23,925 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 04:15:25,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 04:15:25,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=747160.0, ans=0.125 2023-10-02 04:15:26,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:30,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:15:31,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=747226.6666666666, ans=0.0 2023-10-02 04:15:32,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 04:15:34,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=747226.6666666666, ans=0.125 2023-10-02 04:15:35,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:15:37,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:40,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:15:44,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:15:50,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:51,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 04:15:51,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:51,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:53,829 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.86 vs. limit=15.0 2023-10-02 04:15:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 04:15:55,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=747293.3333333334, ans=0.09899494936611666 2023-10-02 04:15:57,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:15:58,491 INFO [train.py:1046] (1/4) Epoch 22, batch 550, loss[loss=0.2337, simple_loss=0.2966, pruned_loss=0.08542, over 19569.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2513, pruned_loss=0.04927, over 4415011.01 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:15:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:16:01,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. 
Duration: 0.95 2023-10-02 04:16:03,849 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.83 vs. limit=15.0 2023-10-02 04:16:04,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 04:16:04,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 04:16:04,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:16:05,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:05,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:06,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:06,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:16:08,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:16:10,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:16:12,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. 
Duration: 0.81 2023-10-02 04:16:12,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:16:17,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:17,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:19,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:16:20,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:25,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 04:16:25,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 04:16:27,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:16:33,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:16:33,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:16:34,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:16:38,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:16:38,822 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 04:16:40,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:41,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:16:43,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:16:44,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:16:44,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:16:46,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:46,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 04:16:47,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 04:16:49,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:16:49,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:16:49,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:16:49,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:51,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:16:53,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:16:56,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:16:56,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:58,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 04:16:59,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:17:00,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:02,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:17:02,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:05,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:17:05,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 04:17:10,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 04:17:13,296 INFO [train.py:1046] (1/4) Epoch 22, batch 600, loss[loss=0.2426, simple_loss=0.3024, pruned_loss=0.09147, over 19651.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2525, pruned_loss=0.05014, over 4473856.58 frames. ], batch size: 389, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:17:14,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 04:17:15,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:17:15,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:17:16,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:18,029 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.04 vs. limit=15.0 2023-10-02 04:17:20,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=747693.3333333334, ans=0.125 2023-10-02 04:17:21,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:17:24,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:17:26,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-10-02 04:17:27,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:17:31,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:17:32,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:35,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 04:17:35,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:17:38,705 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.855e+02 2.187e+02 2.560e+02 3.889e+02, threshold=4.374e+02, percent-clipped=0.0 2023-10-02 04:17:42,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 04:17:44,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:17:44,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:46,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:17:50,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:17:50,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:17:51,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:17:58,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.49 vs. limit=12.0 2023-10-02 04:18:01,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:18:01,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:18:01,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:18:08,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 04:18:13,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:18:15,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:18:17,566 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-10-02 04:18:18,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. 
Duration: 0.89 2023-10-02 04:18:18,455 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:18:19,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:18:22,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 04:18:22,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:18:22,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:18:24,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=747960.0, ans=0.0 2023-10-02 04:18:28,669 INFO [train.py:1046] (1/4) Epoch 22, batch 650, loss[loss=0.1562, simple_loss=0.2325, pruned_loss=0.03991, over 24355.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.04967, over 4530656.53 frames. ], batch size: 56, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:18:28,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 04:18:30,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:18:31,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=748026.6666666666, ans=0.0 2023-10-02 04:18:32,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 04:18:34,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:18:35,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:18:37,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 04:18:39,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:18:39,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=748026.6666666666, ans=0.1 2023-10-02 04:18:45,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:18:45,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:18:45,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=748093.3333333334, ans=0.125 2023-10-02 04:18:49,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:18:52,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 04:18:53,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:18:55,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:18:58,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:18:58,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:19:01,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:01,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:01,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:19:02,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=15.0 2023-10-02 04:19:04,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:04,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:19:07,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:19:07,513 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 04:19:07,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:07,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:19:11,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:12,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:19:13,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:13,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:19:13,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 04:19:16,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:19:16,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:19:18,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:19:18,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:19:19,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:19:20,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 04:19:22,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. 
Duration: 0.72 2023-10-02 04:19:22,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:22,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:19:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:19:23,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:19:24,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:19:25,834 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.95 vs. limit=15.0 2023-10-02 04:19:31,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:31,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:19:33,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:34,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:34,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 04:19:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:42,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:19:42,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:19:42,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:19:42,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:19:43,632 INFO [train.py:1046] (1/4) Epoch 22, batch 700, loss[loss=0.176, simple_loss=0.2419, pruned_loss=0.05507, over 23880.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.25, pruned_loss=0.04879, over 4572333.49 frames. ], batch size: 195, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:19:47,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 04:19:47,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 04:19:49,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 04:19:50,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:51,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 04:19:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 04:19:58,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:20:02,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:20:03,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:20:06,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:20:06,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:20:08,873 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.977e+02 2.354e+02 2.667e+02 4.737e+02, threshold=4.709e+02, percent-clipped=1.0 2023-10-02 04:20:09,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:20:13,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 04:20:13,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:20:13,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 04:20:17,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-02 04:20:21,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:20:21,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:20:24,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:20:26,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:20:28,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 04:20:31,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:20:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:20:33,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 04:20:35,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:20:35,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=748560.0, ans=0.0 2023-10-02 04:20:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:20:39,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:20:44,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:20:45,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 04:20:48,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 04:20:48,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 04:20:51,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:20:54,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:20:55,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:20:55,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:20:55,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 04:20:58,676 INFO [train.py:1046] (1/4) Epoch 22, batch 750, loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03826, over 24602.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2487, pruned_loss=0.0486, over 4603812.18 frames. ], batch size: 60, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:21:00,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. 
Duration: 0.55 2023-10-02 04:21:00,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 04:21:00,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 04:21:02,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 04:21:02,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 04:21:02,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:21:03,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 04:21:04,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:21:04,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:21:05,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=748693.3333333334, ans=0.0 2023-10-02 04:21:06,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:07,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:21:09,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 04:21:09,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:21:13,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:21:15,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:21:16,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:21:19,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:19,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:21:20,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 04:21:22,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:21:22,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:21:24,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:21:25,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:21:27,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. 
Duration: 0.65 2023-10-02 04:21:27,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:21:29,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 04:21:29,186 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 04:21:30,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 04:21:30,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:21:31,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:21:33,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:21:35,454 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:21:36,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=748826.6666666666, ans=0.125 2023-10-02 04:21:39,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:21:39,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:21:39,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 04:21:39,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:42,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:21:42,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 04:21:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:21:43,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 04:21:45,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:21:49,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:21:51,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 04:21:51,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:21:55,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:21:56,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:21:58,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:21:58,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=748960.0, ans=0.1 2023-10-02 04:22:00,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=748960.0, ans=0.125 2023-10-02 04:22:01,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:22:06,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 04:22:06,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:22:06,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:07,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:09,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:10,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:11,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:22:13,162 INFO [train.py:1046] (1/4) Epoch 22, batch 800, loss[loss=0.1692, simple_loss=0.2448, pruned_loss=0.04681, over 23485.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2498, pruned_loss=0.04879, over 4625609.12 frames. 
], batch size: 134, lr: 4.71e-03, grad_scale: 32.0 2023-10-02 04:22:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:18,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:22:20,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:22,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:22,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:24,207 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:22:25,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:28,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:29,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:22:31,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 04:22:32,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:22:32,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:32,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:22:33,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.70 vs. limit=22.5 2023-10-02 04:22:34,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:22:34,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 04:22:35,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:35,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 04:22:38,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:40,034 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.751e+02 1.954e+02 2.249e+02 3.221e+02, threshold=3.908e+02, percent-clipped=0.0 2023-10-02 04:22:41,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:44,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 04:22:45,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:22:46,480 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.48 vs. limit=15.0 2023-10-02 04:22:48,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:48,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:49,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:22:51,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:22:51,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 04:22:55,176 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 04:22:55,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 04:22:55,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:22:56,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:57,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:22:58,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:23:02,673 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 04:23:02,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 04:23:04,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:23:05,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:23:08,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:23:09,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=749226.6666666666, ans=0.125 2023-10-02 04:23:11,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:23:13,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 04:23:14,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:23:15,056 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.54 vs. limit=22.5 2023-10-02 04:23:15,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-10-02 04:23:23,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:23:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:23:26,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 04:23:26,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:23:28,148 INFO [train.py:1046] (1/4) Epoch 22, batch 850, loss[loss=0.1424, simple_loss=0.2169, pruned_loss=0.03397, over 24443.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.251, pruned_loss=0.0496, over 4640250.69 frames. ], batch size: 58, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:23:28,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:23:28,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=749360.0, ans=0.0 2023-10-02 04:23:29,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 04:23:29,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:31,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:23:33,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 04:23:34,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:23:35,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:23:35,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 04:23:37,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 04:23:37,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 04:23:39,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:23:39,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:23:41,469 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.91 vs. limit=15.0 2023-10-02 04:23:41,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:23:41,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:23:41,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:23:46,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:23:46,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:23:46,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 04:23:48,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 04:23:53,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:53,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 04:23:55,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 04:23:57,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 04:24:00,382 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 04:24:00,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:24:00,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:24:00,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:24:04,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 04:24:04,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:06,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 04:24:06,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:24:07,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:24:07,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:24:08,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:24:10,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:24:11,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:24:11,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 04:24:14,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:24:14,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:24:14,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=749560.0, ans=0.1 2023-10-02 04:24:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:24:16,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:24:18,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:24:21,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:22,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:24:26,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:24:26,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:24:28,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:24:36,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=749626.6666666666, ans=10.0 2023-10-02 04:24:37,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 04:24:39,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:24:39,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 04:24:39,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:24:39,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:24:42,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 04:24:43,315 INFO [train.py:1046] (1/4) Epoch 22, batch 900, loss[loss=0.159, simple_loss=0.2437, pruned_loss=0.03719, over 24512.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2516, pruned_loss=0.0493, over 4660309.90 frames. ], batch size: 63, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:24:48,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:24:52,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:24:52,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 04:24:55,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:24:55,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. 
Duration: 0.95 2023-10-02 04:24:57,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 04:24:58,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:24:58,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:24:58,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:24:59,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:25:10,082 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.859e+02 2.170e+02 2.493e+02 3.460e+02, threshold=4.340e+02, percent-clipped=0.0 2023-10-02 04:25:10,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:10,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:25:10,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:25:10,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.93 vs. limit=15.0 2023-10-02 04:25:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:25:15,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 04:25:17,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:25:20,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:25:21,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:25:21,585 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 04:25:22,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 04:25:23,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=749826.6666666666, ans=0.2 2023-10-02 04:25:30,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:25:30,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:25:30,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:25:38,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:38,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 04:25:39,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 04:25:40,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.23 vs. limit=15.0 2023-10-02 04:25:40,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:25:41,544 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-10-02 04:25:42,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 04:25:45,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:25:45,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:46,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:25:46,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:25:52,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 04:25:53,389 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 04:25:53,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 04:25:53,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 04:25:56,135 INFO [train.py:1046] (1/4) Epoch 22, batch 950, loss[loss=0.1795, simple_loss=0.2606, pruned_loss=0.04923, over 23695.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2521, pruned_loss=0.04951, over 4682098.73 frames. ], batch size: 85, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:25:56,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:26:00,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 04:26:05,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:08,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:08,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:10,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:26:12,957 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 04:26:14,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:15,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 04:26:16,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=750093.3333333334, ans=0.125 2023-10-02 04:26:17,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:17,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:26:17,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 04:26:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:26:18,208 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.24 vs. limit=15.0 2023-10-02 04:26:19,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:20,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=750093.3333333334, ans=0.0 2023-10-02 04:26:21,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 04:26:21,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:26:24,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 04:26:24,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:26:24,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:26:24,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 04:26:27,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 04:26:28,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:26:28,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=750160.0, ans=0.125 2023-10-02 04:26:30,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:26:35,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:26:35,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:40,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 04:26:43,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 04:26:43,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 04:26:43,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:26:44,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:44,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:26:48,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 04:26:48,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:26:51,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:26:52,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:52,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 04:26:52,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:52,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:26:52,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-02 04:26:53,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=750226.6666666666, ans=0.125 2023-10-02 04:26:56,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:26:59,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:27:04,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:27:05,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 04:27:05,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 04:27:10,307 INFO [train.py:1046] (1/4) Epoch 22, batch 1000, loss[loss=0.1516, simple_loss=0.2285, pruned_loss=0.03732, over 24596.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2513, pruned_loss=0.04897, over 4698112.52 frames. ], batch size: 60, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:27:13,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:27:15,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=750360.0, ans=0.2 2023-10-02 04:27:16,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 04:27:16,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:27:18,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=750360.0, ans=0.0 2023-10-02 04:27:20,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:27:22,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 04:27:22,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 04:27:26,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:26,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:27:28,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:30,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 04:27:33,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 04:27:35,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=750426.6666666666, ans=0.125 2023-10-02 04:27:36,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 04:27:36,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:27:37,361 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.63 vs. limit=22.5 2023-10-02 04:27:38,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.816e+02 2.103e+02 2.458e+02 3.993e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-02 04:27:40,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 04:27:40,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 04:27:41,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 04:27:43,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:43,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:51,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:52,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=750493.3333333334, ans=0.125 2023-10-02 04:27:53,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:27:53,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:27:53,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:53,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 04:27:54,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:27:55,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:27:55,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 04:27:58,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=750560.0, ans=0.125 2023-10-02 04:28:00,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 04:28:01,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 04:28:02,368 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=12.0 2023-10-02 04:28:02,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 04:28:04,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 04:28:09,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:09,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:28:09,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:11,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:28:13,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 04:28:13,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:28:13,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 04:28:14,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 04:28:14,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:28:14,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:28:19,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:28:20,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 04:28:21,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=750626.6666666666, ans=0.1 2023-10-02 04:28:23,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:28:24,506 INFO [train.py:1046] (1/4) Epoch 22, batch 1050, loss[loss=0.1541, simple_loss=0.2326, pruned_loss=0.0378, over 24618.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2509, pruned_loss=0.04875, over 4720647.73 frames. ], batch size: 60, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:28:25,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:28:28,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:28:28,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=750693.3333333334, ans=0.125 2023-10-02 04:28:30,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:28:31,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:31,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:28:36,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:28:37,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 04:28:39,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:28:41,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:28:41,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:28:41,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:28:43,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 04:28:43,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:28:43,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=750760.0, ans=0.0 2023-10-02 04:28:44,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 04:28:47,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:28:47,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 04:28:47,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:28:48,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.20 vs. 
limit=22.5 2023-10-02 04:28:50,846 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:28:54,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:56,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:28:56,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:28:57,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 04:28:59,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 04:28:59,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:29:01,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 04:29:02,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 04:29:04,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:06,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:29:08,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 04:29:09,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:29:09,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:29:15,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:29:18,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 04:29:19,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 04:29:19,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 04:29:19,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:29:20,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:29:21,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=750893.3333333334, ans=0.0 2023-10-02 04:29:22,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 04:29:25,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:29:26,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 04:29:26,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:29:26,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:29:26,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:26,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=750960.0, ans=0.2 2023-10-02 04:29:29,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=750960.0, ans=0.125 2023-10-02 04:29:30,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:30,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 04:29:33,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:29:33,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 04:29:33,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 04:29:35,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:29:38,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:29:39,814 INFO [train.py:1046] (1/4) Epoch 22, batch 1100, loss[loss=0.1825, simple_loss=0.2702, pruned_loss=0.04743, over 24648.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2497, pruned_loss=0.04881, over 4711929.69 frames. ], batch size: 73, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:29:45,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:29:47,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=751026.6666666666, ans=0.0 2023-10-02 04:29:50,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:29:52,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:29:52,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:29:52,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 04:29:54,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:29:55,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:29:57,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:29:57,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751093.3333333334, ans=0.1 2023-10-02 04:29:59,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:29:59,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 04:30:02,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:30:02,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=751093.3333333334, ans=0.0 2023-10-02 04:30:03,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:03,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:30:06,498 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.819e+02 2.068e+02 2.421e+02 4.208e+02, threshold=4.136e+02, percent-clipped=1.0 2023-10-02 04:30:06,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:30:07,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 04:30:07,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=751160.0, ans=0.125 2023-10-02 04:30:07,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=751160.0, ans=0.05 2023-10-02 04:30:12,981 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.23 vs. limit=15.0 2023-10-02 04:30:13,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:30:15,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=751160.0, ans=0.0 2023-10-02 04:30:17,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 04:30:17,089 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 04:30:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:19,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:21,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:30:21,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:30:22,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. 
Duration: 0.74 2023-10-02 04:30:24,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:30:24,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:30:24,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:30:25,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:25,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 04:30:29,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:30:29,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 04:30:32,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:30:35,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:30:38,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 04:30:38,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:30:40,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:30:41,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:43,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:30:43,641 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-10-02 04:30:44,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 04:30:46,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:30:46,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:30:46,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 04:30:46,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=751293.3333333334, ans=0.125 2023-10-02 04:30:47,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:30:47,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 04:30:49,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:30:49,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:30:49,775 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.28 vs. limit=10.0 2023-10-02 04:30:50,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:30:53,562 INFO [train.py:1046] (1/4) Epoch 22, batch 1150, loss[loss=0.1838, simple_loss=0.2512, pruned_loss=0.05826, over 23455.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2499, pruned_loss=0.04864, over 4717205.91 frames. ], batch size: 285, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:30:53,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:30:56,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:30:59,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:59,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:30:59,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 04:31:00,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:31:03,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. 
Duration: 0.69 2023-10-02 04:31:06,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:31:06,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:31:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 04:31:14,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:31:18,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:31:19,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:19,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 04:31:20,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:31:20,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:31:24,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 04:31:25,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:31:26,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:31:35,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:41,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:42,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 04:31:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:43,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:46,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=751560.0, ans=10.0 2023-10-02 04:31:47,841 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 04:31:49,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:55,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 04:31:55,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=751626.6666666666, ans=0.1 2023-10-02 04:32:00,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:02,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 04:32:02,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:32:02,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:32:02,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=751626.6666666666, ans=0.125 2023-10-02 04:32:05,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:06,281 INFO [train.py:1046] (1/4) Epoch 22, batch 1200, loss[loss=0.1716, simple_loss=0.2426, pruned_loss=0.05033, over 24432.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2512, pruned_loss=0.0491, over 4720289.22 frames. ], batch size: 58, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:32:09,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:32:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:32:11,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:11,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:12,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:32:13,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.44 vs. 
limit=15.0 2023-10-02 04:32:16,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:32:17,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:32:17,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:17,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:32:20,438 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 04:32:23,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 04:32:24,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=751760.0, ans=15.0 2023-10-02 04:32:25,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:32:26,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:32:27,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=751760.0, ans=0.0 2023-10-02 04:32:28,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:31,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 04:32:31,023 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 04:32:32,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:32,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=751760.0, ans=0.0 2023-10-02 04:32:34,978 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.785e+02 1.972e+02 2.130e+02 2.698e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-02 04:32:39,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:32:39,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:32:39,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 04:32:40,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:32:44,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 04:32:45,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751826.6666666666, ans=0.1 2023-10-02 04:32:46,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=751826.6666666666, ans=0.125 2023-10-02 04:32:48,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. 
Duration: 0.88 2023-10-02 04:32:48,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:49,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:32:51,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:32:51,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:32:51,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751893.3333333334, ans=0.1 2023-10-02 04:32:53,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:53,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:32:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:32:54,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 04:32:56,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:32:56,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 04:32:56,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:32:56,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751893.3333333334, ans=0.1 2023-10-02 04:32:57,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=751893.3333333334, ans=0.95 2023-10-02 04:32:58,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:58,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:33:03,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:33:04,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:33:07,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 04:33:12,018 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 04:33:15,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:33:16,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:33:17,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:33:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:33:20,647 INFO [train.py:1046] (1/4) Epoch 22, batch 1250, loss[loss=0.162, simple_loss=0.2426, pruned_loss=0.04069, over 24667.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.252, pruned_loss=0.04966, over 4712406.26 frames. ], batch size: 65, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:33:22,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 04:33:26,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:33:26,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:28,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 04:33:28,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:33:29,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:33:32,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:33:33,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:33,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 04:33:33,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:33:35,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.31 vs. limit=6.0 2023-10-02 04:33:37,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:33:42,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 04:33:42,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:33:42,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:33:44,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:33:46,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:33:49,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:33:49,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 04:33:50,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=752160.0, ans=0.025 2023-10-02 04:33:53,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=752160.0, ans=0.125 2023-10-02 04:33:54,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 04:33:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:33:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:33:56,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 04:33:58,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:58,398 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 04:33:58,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:33:58,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:01,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:34:03,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.74 vs. 
limit=6.0 2023-10-02 04:34:03,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:34:05,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:34:06,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 04:34:06,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 04:34:08,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 04:34:11,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:34:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 04:34:11,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:16,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 04:34:16,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:34:17,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 04:34:17,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 04:34:19,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:34:19,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:34:20,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:34:21,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 04:34:24,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:34:24,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:34:26,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:34:27,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:34:32,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:34:33,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 04:34:34,839 INFO [train.py:1046] (1/4) Epoch 22, batch 1300, loss[loss=0.1869, simple_loss=0.2544, pruned_loss=0.05965, over 23748.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2526, pruned_loss=0.05004, over 4705835.39 frames. 
], batch size: 212, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:34:36,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:34:36,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:34:37,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:34:39,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:40,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:34:42,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 04:34:50,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:34:50,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:34:51,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 04:34:55,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:34:58,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 04:34:59,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:35:00,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:35:01,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:03,257 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.912e+02 2.109e+02 2.278e+02 3.691e+02, threshold=4.217e+02, percent-clipped=0.0 2023-10-02 04:35:03,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:35:03,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:35:03,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 04:35:08,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:35:09,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:35:10,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=752493.3333333334, ans=0.1 2023-10-02 04:35:11,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. 
Duration: 0.62 2023-10-02 04:35:14,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:35:15,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:35:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:35:18,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 04:35:18,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:35:20,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 04:35:21,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:35:27,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:35:27,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:35:29,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 04:35:30,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-10-02 04:35:30,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=752560.0, ans=0.1 2023-10-02 04:35:31,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 04:35:32,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.27 vs. limit=15.0 2023-10-02 04:35:33,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=752626.6666666666, ans=0.0 2023-10-02 04:35:37,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:35:40,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 04:35:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:47,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 04:35:49,086 INFO [train.py:1046] (1/4) Epoch 22, batch 1350, loss[loss=0.1584, simple_loss=0.2365, pruned_loss=0.04018, over 24465.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2513, pruned_loss=0.04941, over 4698951.41 frames. ], batch size: 63, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:35:52,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:35:55,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:35:56,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:57,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:35:58,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:35:59,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:36:03,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:36:04,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 04:36:05,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:36:06,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:36:09,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 04:36:10,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=752760.0, ans=0.0 2023-10-02 04:36:11,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:36:12,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:36:12,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 04:36:13,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 04:36:15,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 04:36:15,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=752760.0, ans=0.0 2023-10-02 04:36:18,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:18,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 04:36:27,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:32,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=752893.3333333334, ans=0.125 2023-10-02 04:36:36,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:36,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:36:36,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 04:36:39,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:36:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 04:36:41,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:36:41,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:36:45,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:36:47,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 04:36:47,732 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.49 vs. limit=15.0 2023-10-02 04:36:48,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:36:54,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 04:36:55,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 04:37:01,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 04:37:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:37:02,559 INFO [train.py:1046] (1/4) Epoch 22, batch 1400, loss[loss=0.1517, simple_loss=0.232, pruned_loss=0.03569, over 24299.00 frames. 
], tot_loss[loss=0.1731, simple_loss=0.249, pruned_loss=0.04863, over 4697835.18 frames. ], batch size: 61, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:37:04,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:37:04,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:37:10,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 04:37:11,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 04:37:20,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:37:21,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:37:24,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:37:24,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:37:28,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=753093.3333333334, ans=0.125 2023-10-02 04:37:29,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:37:29,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-02 04:37:30,877 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.430e+02 1.782e+02 2.050e+02 2.266e+02 3.252e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 04:37:31,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=753160.0, ans=0.125 2023-10-02 04:37:38,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=753160.0, ans=0.125 2023-10-02 04:37:39,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:39,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:44,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 04:37:45,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:37:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:37:47,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:37:49,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:37:49,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 04:37:49,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:37:49,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:37:52,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 04:37:52,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:37:56,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:59,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:38:07,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 04:38:07,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:38:09,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:38:09,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=753293.3333333334, ans=0.125 2023-10-02 04:38:11,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 04:38:13,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:38:15,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:38:17,295 INFO [train.py:1046] (1/4) Epoch 22, batch 1450, loss[loss=0.1604, simple_loss=0.2446, pruned_loss=0.03812, over 24429.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2488, pruned_loss=0.04834, over 4706512.18 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:38:18,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:38:21,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:38:21,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:21,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 04:38:27,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:27,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:38:28,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:38:28,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=753360.0, ans=0.0 2023-10-02 04:38:29,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-10-02 04:38:29,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:38:31,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 04:38:32,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:32,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:32,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 04:38:33,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:38:35,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:38:35,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 04:38:35,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:36,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:38:39,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:42,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 04:38:45,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:38:45,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:38:48,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:48,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:50,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:51,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:38:51,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:51,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:38:57,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 04:38:58,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:39:00,266 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:39:01,428 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. 
Duration: 0.896 2023-10-02 04:39:02,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:39:04,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:39:05,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:07,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=753560.0, ans=0.125 2023-10-02 04:39:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 04:39:09,017 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.44 vs. limit=15.0 2023-10-02 04:39:10,647 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.24 vs. limit=15.0 2023-10-02 04:39:11,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:12,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 04:39:14,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 04:39:15,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 04:39:17,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=753626.6666666666, ans=0.2 2023-10-02 04:39:18,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:39:18,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:39:20,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 04:39:23,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 04:39:25,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 04:39:25,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:26,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:39:32,470 INFO [train.py:1046] (1/4) Epoch 22, batch 1500, loss[loss=0.231, simple_loss=0.2843, pruned_loss=0.08881, over 19904.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2495, pruned_loss=0.04831, over 4715209.80 frames. ], batch size: 388, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:39:36,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 04:39:36,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 04:39:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:39:38,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:38,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:39:39,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:39:40,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 04:39:42,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:39:42,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:39:42,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:39:43,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:39:44,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:39:46,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:39:47,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=753760.0, ans=0.0 2023-10-02 04:39:51,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:39:51,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 04:39:52,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:39:52,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:39:53,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:56,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 04:40:00,496 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.944e+02 2.195e+02 2.609e+02 5.119e+02, threshold=4.390e+02, percent-clipped=1.0 2023-10-02 04:40:00,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 04:40:01,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:40:02,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-10-02 04:40:03,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=753826.6666666666, ans=0.125 2023-10-02 04:40:04,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:40:07,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:40:08,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=753826.6666666666, ans=22.5 2023-10-02 04:40:08,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:40:08,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:40:10,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 04:40:10,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:40:11,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:40:11,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 04:40:11,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:40:16,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:40:16,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 04:40:18,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=753893.3333333334, ans=0.125 2023-10-02 04:40:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:40:23,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:40:27,407 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 04:40:28,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:28,803 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 04:40:30,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=753960.0, ans=0.05 2023-10-02 04:40:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:40:31,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:40:32,903 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. 
Duration: 0.896 2023-10-02 04:40:33,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=753960.0, ans=0.0 2023-10-02 04:40:33,591 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.79 vs. limit=15.0 2023-10-02 04:40:34,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:40:37,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 04:40:37,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:41,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:40:41,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:42,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:40:42,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:42,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:40:45,335 INFO [train.py:1046] (1/4) Epoch 22, batch 1550, loss[loss=0.1781, simple_loss=0.25, pruned_loss=0.05309, over 23869.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2497, pruned_loss=0.04841, over 4733895.64 frames. 
], batch size: 179, lr: 4.69e-03, grad_scale: 8.0 2023-10-02 04:40:45,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 04:40:47,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 04:40:47,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:40:48,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 04:40:48,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 04:40:50,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:40:50,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=754026.6666666666, ans=0.0 2023-10-02 04:40:51,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:51,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:40:52,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:40:52,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:40:54,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:58,395 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 04:40:58,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:40:58,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:40:59,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:41:02,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:41:02,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 04:41:02,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:41:03,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 04:41:05,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 04:41:05,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 04:41:05,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:41:06,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:11,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:41:11,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=754093.3333333334, ans=0.0 2023-10-02 04:41:12,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 04:41:12,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 04:41:15,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=754160.0, ans=0.0 2023-10-02 04:41:16,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=754160.0, ans=0.125 2023-10-02 04:41:20,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:24,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:41:26,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:41:26,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:41:27,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. 
Duration: 0.41 2023-10-02 04:41:34,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:41:34,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:37,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:41:38,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:41:39,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:39,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 04:41:41,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:41:42,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:41:42,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:43,283 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.11 vs. limit=15.0 2023-10-02 04:41:44,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 04:41:44,017 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-10-02 04:41:46,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:41:51,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 04:41:57,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:41:58,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:59,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 04:42:00,689 INFO [train.py:1046] (1/4) Epoch 22, batch 1600, loss[loss=0.1684, simple_loss=0.2594, pruned_loss=0.03871, over 24592.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2511, pruned_loss=0.04887, over 4736917.44 frames. ], batch size: 71, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:42:01,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:42:02,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:42:02,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:42:02,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:42:04,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 04:42:08,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:08,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 04:42:09,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 04:42:11,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=754360.0, ans=0.0 2023-10-02 04:42:12,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 04:42:13,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:42:15,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 04:42:15,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:42:18,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:42:22,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:42:24,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 04:42:28,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:42:29,799 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.849e+02 2.015e+02 2.329e+02 4.994e+02, threshold=4.030e+02, percent-clipped=1.0 2023-10-02 04:42:29,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 04:42:29,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:31,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 04:42:32,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=754493.3333333334, ans=0.2 2023-10-02 04:42:38,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 04:42:40,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=754493.3333333334, ans=0.0 2023-10-02 04:42:41,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.33 vs. limit=6.0 2023-10-02 04:42:43,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:42:45,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 04:42:46,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:42:46,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 04:42:46,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:42:47,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 04:42:52,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=754560.0, ans=0.5 2023-10-02 04:42:53,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 04:42:54,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:42:54,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:55,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:56,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:42:59,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:42:59,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:43:00,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:43:08,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:43:08,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:43:11,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 04:43:11,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:43:12,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=754626.6666666666, ans=0.125 2023-10-02 04:43:14,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 04:43:14,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=754693.3333333334, ans=0.125 2023-10-02 04:43:14,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=754693.3333333334, ans=0.0 2023-10-02 04:43:15,453 INFO [train.py:1046] (1/4) Epoch 22, batch 1650, loss[loss=0.1721, simple_loss=0.2357, pruned_loss=0.05424, over 23711.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2515, pruned_loss=0.04946, over 4717405.54 frames. ], batch size: 232, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:43:17,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=754693.3333333334, ans=0.125 2023-10-02 04:43:18,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:43:18,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:43:19,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:43:19,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 04:43:19,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 04:43:19,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 04:43:19,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 04:43:23,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:43:24,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:43:25,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:43:25,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:43:27,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:43:31,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. 
Duration: 0.996375 2023-10-02 04:43:33,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:43:33,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:43:33,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:43:33,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:43:34,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 04:43:34,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 04:43:39,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=754760.0, ans=0.2 2023-10-02 04:43:41,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:43:41,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=754760.0, ans=0.0 2023-10-02 04:43:43,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:43:48,780 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.18 vs. limit=15.0 2023-10-02 04:43:50,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. 
Duration: 0.93 2023-10-02 04:43:50,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:43:52,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 04:43:55,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:43:57,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:43:57,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:43:59,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:43:59,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:43:59,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:02,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:03,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:03,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:44:06,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 04:44:07,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:44:09,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:44:11,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:44:13,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 04:44:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:44:14,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 04:44:16,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 04:44:17,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 04:44:17,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:44:17,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:44:17,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:44:18,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 04:44:18,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 04:44:20,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=754960.0, ans=0.125 2023-10-02 04:44:21,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:44:23,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:44:23,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:44:25,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 04:44:28,644 INFO [train.py:1046] (1/4) Epoch 22, batch 1700, loss[loss=0.1475, simple_loss=0.2244, pruned_loss=0.03534, over 24408.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.25, pruned_loss=0.04878, over 4717528.87 frames. ], batch size: 58, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:44:30,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:44:30,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:44:31,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 04:44:33,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 04:44:33,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:44:33,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:35,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:44:36,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:44:36,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 04:44:40,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:44:43,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=755093.3333333334, ans=0.125 2023-10-02 04:44:44,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:47,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:44:52,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:44:52,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 04:44:52,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:44:54,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:44:57,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 04:44:58,519 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.824e+02 2.037e+02 2.362e+02 3.685e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 04:44:58,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:44:58,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:00,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:45:02,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:45:04,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 04:45:05,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 04:45:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:08,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. 
Duration: 0.68 2023-10-02 04:45:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:45:17,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:20,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:45:20,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:45:20,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 04:45:20,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:45:23,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:23,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 04:45:23,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:45:23,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:45:24,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:45:24,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:45:27,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:45:27,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:45:29,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:30,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:45:30,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:33,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:45:35,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 04:45:39,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:40,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:45:43,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 04:45:44,595 INFO [train.py:1046] (1/4) Epoch 22, batch 1750, loss[loss=0.1694, simple_loss=0.251, pruned_loss=0.04392, over 24666.00 frames. 
], tot_loss[loss=0.1732, simple_loss=0.2494, pruned_loss=0.04856, over 4717310.83 frames. ], batch size: 65, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:45:46,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=755360.0, ans=0.0 2023-10-02 04:45:48,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:49,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=755360.0, ans=0.1 2023-10-02 04:45:51,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:45:51,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:45:53,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 04:45:53,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:55,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:45:55,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:00,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. 
Duration: 0.43 2023-10-02 04:46:00,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=755426.6666666666, ans=0.125 2023-10-02 04:46:02,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:05,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 04:46:05,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:46:07,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:46:10,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:46:10,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 04:46:13,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:46:13,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 04:46:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:46:23,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:46:23,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:46:26,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:26,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:46:27,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:46:29,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:32,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:46:32,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:46:34,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 04:46:36,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:46:39,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 04:46:40,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:46:42,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:43,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 04:46:47,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:46:48,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:46:48,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:49,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.14 vs. limit=22.5 2023-10-02 04:46:50,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:46:50,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=755626.6666666666, ans=0.125 2023-10-02 04:46:54,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:55,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:46:57,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:46:57,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 04:46:58,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.85 vs. 
limit=6.0 2023-10-02 04:46:58,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:46:58,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:46:58,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:46:58,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:47:00,001 INFO [train.py:1046] (1/4) Epoch 22, batch 1800, loss[loss=0.1806, simple_loss=0.2711, pruned_loss=0.04509, over 24574.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2495, pruned_loss=0.04847, over 4725683.77 frames. ], batch size: 71, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:47:00,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:47:00,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:47:01,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=755693.3333333334, ans=0.0 2023-10-02 04:47:03,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:47:03,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:47:05,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 04:47:06,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:47:11,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 04:47:13,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:47:15,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:47:17,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:19,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:19,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:47:20,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:47:20,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 04:47:22,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:24,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:47:26,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=755760.0, ans=0.2 2023-10-02 04:47:27,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 04:47:29,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=755826.6666666666, ans=0.0 2023-10-02 04:47:30,294 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.851e+02 2.122e+02 2.393e+02 3.759e+02, threshold=4.245e+02, percent-clipped=0.0 2023-10-02 04:47:30,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 04:47:30,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 04:47:31,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:47:33,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:33,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:47:35,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:47:40,184 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 04:47:41,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 04:47:43,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:46,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 04:47:47,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 04:47:47,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:47:48,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:47:50,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:47:53,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=755893.3333333334, ans=0.1 2023-10-02 04:47:54,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 04:47:59,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:47:59,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 04:48:01,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:48:01,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:48:03,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:48:03,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 04:48:04,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:48:04,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:48:06,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 04:48:06,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:08,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=755960.0, ans=0.125 2023-10-02 04:48:10,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:48:11,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:48:11,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:48:12,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:48:12,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 04:48:13,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.82 vs. limit=15.0 2023-10-02 04:48:14,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:48:14,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:48:16,103 INFO [train.py:1046] (1/4) Epoch 22, batch 1850, loss[loss=0.1943, simple_loss=0.2718, pruned_loss=0.05841, over 23663.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.25, pruned_loss=0.04873, over 4725579.36 frames. ], batch size: 85, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:48:18,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:48:18,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:48:25,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:48:25,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 04:48:28,079 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=15.0 2023-10-02 04:48:28,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 04:48:31,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. 
Duration: 0.93 2023-10-02 04:48:36,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:48:36,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 04:48:37,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 04:48:45,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:48:47,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 04:48:49,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.37 vs. limit=6.0 2023-10-02 04:48:50,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:48:50,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:48:54,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 04:48:55,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:55,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:48:57,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 04:48:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:49:02,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:49:06,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:49:06,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:06,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:49:06,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:07,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:49:08,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:49:13,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 04:49:14,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:49:18,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:49:19,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 04:49:19,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 04:49:19,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 04:49:23,135 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 04:49:23,209 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 04:49:24,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:49:24,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:49:24,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:49:24,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:25,980 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 04:49:26,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:49:26,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:27,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 04:49:28,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:49:29,253 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:49:30,111 INFO [train.py:1046] (1/4) Epoch 22, batch 1900, loss[loss=0.1912, simple_loss=0.2645, pruned_loss=0.05895, over 23865.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2503, pruned_loss=0.04845, over 4734785.70 frames. ], batch size: 212, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:49:30,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:49:30,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 04:49:31,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:31,653 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 04:49:31,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:49:31,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=756360.0, ans=0.125 2023-10-02 04:49:32,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:49:33,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=756360.0, ans=0.2 2023-10-02 04:49:38,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:40,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:49:41,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 04:49:43,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 04:49:45,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:49:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:49:47,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 04:49:47,175 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 04:49:50,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 04:49:53,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:49:56,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=756426.6666666666, ans=0.0 2023-10-02 04:49:57,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 04:49:57,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 04:50:00,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.830e+02 1.988e+02 2.362e+02 3.579e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-02 04:50:03,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=756493.3333333334, ans=0.125 2023-10-02 04:50:07,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 04:50:08,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=756493.3333333334, ans=0.125 2023-10-02 04:50:10,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 04:50:10,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:10,752 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 04:50:10,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-10-02 04:50:10,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=756493.3333333334, ans=0.5 2023-10-02 04:50:12,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 04:50:12,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 04:50:12,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:50:14,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 04:50:18,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:50:20,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:50:20,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 04:50:23,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:50:25,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=756560.0, ans=0.125 2023-10-02 04:50:26,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 04:50:27,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 04:50:35,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:50:35,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:50:35,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:50:35,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:50:36,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:50:38,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 04:50:39,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:50:40,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:50:40,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:50:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:50:43,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:50:44,915 INFO [train.py:1046] (1/4) Epoch 22, batch 1950, loss[loss=0.1775, simple_loss=0.2676, pruned_loss=0.04374, over 24467.00 frames. 
], tot_loss[loss=0.1749, simple_loss=0.2513, pruned_loss=0.04921, over 4729725.80 frames. ], batch size: 69, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:50:45,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:50:46,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:50:49,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:50:51,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=756693.3333333334, ans=0.04949747468305833 2023-10-02 04:50:52,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:50:52,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:52,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:50:57,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 04:50:57,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:50:57,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:58,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:51:01,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:51:01,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:01,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:04,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:51:07,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:51:07,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:51:07,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:51:08,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:08,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756760.0, ans=0.1 2023-10-02 04:51:11,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:14,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:51:14,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:51:14,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:51:14,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 04:51:15,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:51:15,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:51:17,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:21,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:21,596 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.30 vs. limit=22.5 2023-10-02 04:51:23,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:51:28,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:51:32,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:51:32,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:51:32,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-10-02 04:51:34,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:51:38,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:51:38,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:51:39,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:51:43,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=756960.0, ans=0.125 2023-10-02 04:51:43,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.93 vs. limit=6.0 2023-10-02 04:51:47,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:47,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:49,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:50,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=756960.0, ans=0.0 2023-10-02 04:51:51,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:51:54,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:51:54,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:55,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 04:51:55,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:51:57,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:59,250 INFO [train.py:1046] (1/4) Epoch 22, batch 2000, loss[loss=0.1621, simple_loss=0.2431, pruned_loss=0.04049, over 24311.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2519, pruned_loss=0.04944, over 4723314.44 frames. ], batch size: 61, lr: 4.68e-03, grad_scale: 32.0 2023-10-02 04:51:59,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 04:52:00,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:52:04,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:52:05,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:52:05,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 04:52:08,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:52:08,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=757026.6666666666, ans=0.125 2023-10-02 04:52:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:11,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 04:52:13,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:52:14,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:52:16,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 04:52:18,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:52:18,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:52:21,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:52:23,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 04:52:23,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:52:25,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:26,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:26,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 04:52:26,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=757093.3333333334, ans=0.125 2023-10-02 04:52:27,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:52:29,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 04:52:29,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:52:30,915 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.905e+02 2.121e+02 2.548e+02 4.469e+02, threshold=4.243e+02, percent-clipped=4.0 2023-10-02 04:52:32,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:52:34,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:52:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:34,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 04:52:35,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:52:37,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 04:52:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 04:52:39,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:52:39,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:52:44,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:45,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:52:45,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:52:45,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:52:47,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:52:48,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:50,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 04:52:50,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:50,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:53,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:52:53,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 04:52:59,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:52:59,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:04,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:04,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:53:06,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:08,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:53:08,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:10,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 04:53:10,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:53:10,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=757293.3333333334, ans=0.0 2023-10-02 04:53:14,692 INFO [train.py:1046] (1/4) Epoch 22, batch 2050, loss[loss=0.1954, simple_loss=0.2765, pruned_loss=0.05719, over 24384.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2518, pruned_loss=0.04954, over 4713808.93 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:53:14,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:16,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:18,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:53:18,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:23,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.96 vs. limit=10.0 2023-10-02 04:53:23,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757360.0, ans=0.1 2023-10-02 04:53:24,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:53:26,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 04:53:26,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:27,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:53:29,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 04:53:29,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:53:30,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:53:31,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:53:40,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:53:40,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:42,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 04:53:43,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:44,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=757493.3333333334, ans=0.125 2023-10-02 04:53:45,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. 
Duration: 0.82 2023-10-02 04:53:45,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:53:48,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:53:51,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:53:52,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:53:52,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:53:55,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:53:56,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:53:56,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:54:00,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:00,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=757560.0, ans=0.125 2023-10-02 04:54:01,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:54:03,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 04:54:03,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=757560.0, ans=0.0 2023-10-02 04:54:04,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:54:08,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:54:14,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:54:16,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 04:54:19,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=757626.6666666666, ans=0.1 2023-10-02 04:54:20,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:54:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:54:22,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=757626.6666666666, ans=0.035 2023-10-02 04:54:25,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:54:25,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 04:54:29,490 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. 
Duration: 0.952 2023-10-02 04:54:29,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:54:29,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:30,801 INFO [train.py:1046] (1/4) Epoch 22, batch 2100, loss[loss=0.1496, simple_loss=0.2348, pruned_loss=0.0322, over 24509.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2501, pruned_loss=0.0493, over 4703137.70 frames. ], batch size: 63, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:54:30,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:54:30,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:54:30,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 04:54:32,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 04:54:35,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:54:36,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=757693.3333333334, ans=0.125 2023-10-02 04:54:37,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:54:37,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 04:54:41,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:54:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:54:41,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 04:54:43,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:54:44,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 04:54:44,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 04:54:47,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:54:47,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:54:47,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 04:54:47,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:54:49,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=757760.0, ans=0.0 2023-10-02 04:54:53,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. 
Duration: 0.68 2023-10-02 04:54:53,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:54:55,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:54:55,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:56,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=757760.0, ans=0.125 2023-10-02 04:55:01,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:55:01,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 04:55:02,524 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.868e+02 2.085e+02 2.437e+02 3.685e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 04:55:02,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:02,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:55:03,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 04:55:04,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:04,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. 
Duration: 0.84 2023-10-02 04:55:05,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 04:55:05,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 04:55:06,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:55:09,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:55:11,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:55:12,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:55:14,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:16,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=757893.3333333334, ans=0.2 2023-10-02 04:55:17,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:17,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 04:55:17,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:17,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:55:17,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:18,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 04:55:20,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 04:55:20,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 04:55:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:55:28,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:55:28,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 04:55:34,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:35,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:55:36,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:55:36,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:55:36,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-10-02 04:55:37,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:55:38,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:38,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:55:38,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=757960.0, ans=0.1 2023-10-02 04:55:40,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=757960.0, ans=0.0 2023-10-02 04:55:41,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:55:41,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:42,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 04:55:44,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 04:55:44,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:55:45,749 INFO [train.py:1046] (1/4) Epoch 22, batch 2150, loss[loss=0.1446, simple_loss=0.2188, pruned_loss=0.03524, over 24306.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2482, pruned_loss=0.04871, over 4696187.71 frames. 
], batch size: 56, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:55:45,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:45,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:55:45,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:55:47,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:55:52,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:55:53,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:55:54,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:56,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:55:56,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:55:58,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:56:01,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 04:56:01,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:56:01,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:56:04,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:05,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 04:56:08,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:11,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:56:11,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=758093.3333333334, ans=0.0 2023-10-02 04:56:13,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:13,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:13,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:13,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:56:14,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:56:14,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:56:16,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:56:16,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 04:56:18,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:56:18,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=758160.0, ans=0.1 2023-10-02 04:56:19,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:19,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:20,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:56:20,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:56:23,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:24,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 04:56:25,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=758160.0, ans=0.125 2023-10-02 04:56:26,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:26,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 04:56:26,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:56:28,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:29,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:31,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.39 vs. limit=15.0 2023-10-02 04:56:31,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:32,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:56:32,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:34,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:56:34,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 04:56:36,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 04:56:36,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:56:38,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 04:56:38,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:38,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:56:39,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 04:56:39,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:56:39,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 04:56:41,036 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 04:56:41,036 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 04:56:41,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 04:56:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 04:56:43,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:56:43,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:56:44,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:45,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:56:47,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:47,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:53,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:56:53,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 04:56:59,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:57:00,787 INFO [train.py:1046] (1/4) Epoch 22, batch 2200, loss[loss=0.1643, simple_loss=0.2372, pruned_loss=0.04565, over 24298.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2487, pruned_loss=0.04862, over 4703413.14 frames. ], batch size: 56, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:57:02,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:57:03,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:57:04,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:05,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:57:08,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:57:09,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:57:09,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 04:57:09,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=758360.0, ans=0.0 2023-10-02 04:57:14,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 04:57:17,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:57:23,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 04:57:24,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:24,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 04:57:26,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:57:29,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:57:29,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 04:57:32,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:57:34,194 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.806e+02 1.968e+02 2.209e+02 3.586e+02, threshold=3.937e+02, percent-clipped=0.0 2023-10-02 04:57:34,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=758493.3333333334, ans=0.0 2023-10-02 04:57:35,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:36,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:57:38,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758493.3333333334, ans=0.1 2023-10-02 04:57:39,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:57:41,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 04:57:42,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:57:44,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:46,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 04:57:47,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:48,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 04:57:52,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:52,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:57:52,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:54,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:57:54,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:57:54,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:54,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 04:57:55,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=758560.0, ans=0.125 2023-10-02 04:57:56,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:57:56,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:57:56,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=758560.0, ans=0.0 2023-10-02 04:57:59,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 04:58:02,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:58:03,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:58:05,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:58:06,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=758626.6666666666, ans=0.2 2023-10-02 04:58:07,307 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 04:58:08,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 04:58:08,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=758626.6666666666, ans=0.125 2023-10-02 04:58:10,308 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 04:58:10,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:58:11,645 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 04:58:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:15,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 04:58:15,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:16,409 INFO [train.py:1046] (1/4) Epoch 22, batch 2250, loss[loss=0.1873, simple_loss=0.2564, pruned_loss=0.05911, over 23788.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2491, pruned_loss=0.0485, over 4707573.67 frames. ], batch size: 212, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:58:16,524 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 04:58:19,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:58:21,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 04:58:26,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:58:27,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.27 vs. limit=6.0 2023-10-02 04:58:28,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:58:30,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:31,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:58:32,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:58:33,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.85 vs. limit=6.0 2023-10-02 04:58:33,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 04:58:33,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:58:33,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:58:36,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. 
Duration: 0.71 2023-10-02 04:58:37,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:58:37,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:39,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:58:42,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:58:44,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 04:58:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:58:47,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 04:58:49,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:53,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:58:56,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:58:57,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:58:59,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 04:58:59,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:59:02,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:59:03,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:59:08,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:59:10,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:59:14,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:59:14,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:59:14,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=758960.0, ans=0.125 2023-10-02 04:59:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:59:17,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=758960.0, ans=0.1 2023-10-02 04:59:21,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 04:59:22,648 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.43 vs. limit=15.0 2023-10-02 04:59:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:59:23,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 04:59:23,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:24,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:59:24,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=758960.0, ans=0.0 2023-10-02 04:59:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 04:59:30,854 INFO [train.py:1046] (1/4) Epoch 22, batch 2300, loss[loss=0.1654, simple_loss=0.2462, pruned_loss=0.04231, over 24461.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2494, pruned_loss=0.0489, over 4710158.30 frames. ], batch size: 63, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:59:30,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:59:30,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:37,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 04:59:37,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:59:39,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.90 vs. limit=15.0 2023-10-02 04:59:39,900 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 04:59:41,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:59:48,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:59:49,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:59:49,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:59:50,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:59:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 04:59:50,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:59:51,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=759093.3333333334, ans=0.1 2023-10-02 04:59:53,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 04:59:53,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:59:56,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:59:58,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:00:02,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:00:03,916 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.999e+02 2.256e+02 2.584e+02 4.812e+02, threshold=4.513e+02, percent-clipped=1.0 2023-10-02 05:00:06,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.77 vs. limit=22.5 2023-10-02 05:00:08,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:00:08,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:00:10,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=759160.0, ans=0.1 2023-10-02 05:00:11,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:00:15,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:00:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:00:19,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:00:21,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:00:21,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 05:00:25,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:00:25,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:00:26,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:00:26,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:00:28,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:00:28,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 05:00:28,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 05:00:28,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=759226.6666666666, ans=0.125 2023-10-02 05:00:29,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 05:00:29,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:00:29,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:00:31,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 05:00:35,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:00:36,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=759293.3333333334, ans=0.1 2023-10-02 05:00:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:00:39,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=759293.3333333334, ans=0.0 2023-10-02 05:00:44,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:00:44,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:00:44,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 05:00:45,923 INFO [train.py:1046] (1/4) Epoch 22, batch 2350, loss[loss=0.1726, simple_loss=0.2545, pruned_loss=0.04531, over 23289.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.25, pruned_loss=0.04858, over 4721828.72 frames. ], batch size: 105, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 05:00:47,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:00:47,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:00:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:00:48,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 05:00:52,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=759360.0, ans=0.125 2023-10-02 05:00:52,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=759360.0, ans=0.125 2023-10-02 05:00:54,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:00:54,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 05:00:55,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=759360.0, ans=0.0 2023-10-02 05:01:00,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. 
Duration: 0.98 2023-10-02 05:01:03,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:01:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:06,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:06,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:01:06,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:01:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 05:01:08,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=759426.6666666666, ans=0.0 2023-10-02 05:01:08,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=759426.6666666666, ans=0.0 2023-10-02 05:01:08,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=759426.6666666666, ans=0.1 2023-10-02 05:01:11,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:01:15,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. 
Duration: 0.58 2023-10-02 05:01:17,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:01:20,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:01:20,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:01:23,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:01:24,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 05:01:24,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:01:26,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=759493.3333333334, ans=0.125 2023-10-02 05:01:28,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:01:28,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:01:28,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:01:31,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:01:32,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. 
Duration: 0.74 2023-10-02 05:01:33,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:01:36,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:38,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:01:39,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 05:01:39,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:01:42,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 05:01:42,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:01:47,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 05:01:52,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 05:01:53,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:01:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:01:53,712 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. 
Duration: 0.924 2023-10-02 05:01:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 05:01:56,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 05:01:59,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:02:01,014 INFO [train.py:1046] (1/4) Epoch 22, batch 2400, loss[loss=0.1729, simple_loss=0.2575, pruned_loss=0.04418, over 24039.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2504, pruned_loss=0.04872, over 4721490.72 frames. ], batch size: 80, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 05:02:04,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:02:06,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:02:07,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:02:08,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 05:02:08,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 05:02:14,756 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:02:16,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 05:02:16,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:02:17,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 05:02:19,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:02:20,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:20,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 05:02:28,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:29,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 05:02:32,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:02:34,043 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.808e+02 2.044e+02 2.322e+02 3.355e+02, threshold=4.088e+02, percent-clipped=0.0 2023-10-02 05:02:36,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 05:02:39,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:02:41,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:02:44,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:02:46,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 05:02:47,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:02:52,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:02:55,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:02:58,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:00,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:03:00,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:03:00,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:03:00,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:03:00,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=759960.0, ans=0.2 2023-10-02 05:03:01,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:03:01,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:03:05,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:03:06,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:03:06,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 05:03:07,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.11 vs. limit=15.0 2023-10-02 05:03:08,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 05:03:09,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:03:09,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:03:11,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 05:03:11,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 05:03:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 05:03:12,978 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. 
Duration: 0.966 2023-10-02 05:03:13,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 05:03:14,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:03:15,812 INFO [train.py:1046] (1/4) Epoch 22, batch 2450, loss[loss=0.1834, simple_loss=0.2557, pruned_loss=0.05559, over 23764.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2493, pruned_loss=0.04863, over 4720993.23 frames. ], batch size: 212, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:03:15,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:15,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:03:17,816 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 05:03:17,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:17,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:03:22,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:03:22,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:03:25,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:03:25,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:03:27,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 05:03:29,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=760026.6666666666, ans=0.0 2023-10-02 05:03:31,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:03:32,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:34,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:03:34,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:03:35,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:03:35,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 05:03:37,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=760093.3333333334, ans=0.2 2023-10-02 05:03:39,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:03:41,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=760093.3333333334, ans=0.125 2023-10-02 05:03:43,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:03:43,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:03:46,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:03:46,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:03:48,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:03:48,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:50,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 05:03:51,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:04:00,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:01,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:04:02,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:04:03,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:04:03,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:04,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:04:04,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 05:04:07,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:04:08,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:04:11,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:04:12,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:17,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:04:17,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 05:04:17,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:04:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 05:04:18,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 05:04:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:04:20,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:04:24,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:04:28,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:28,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:04:31,462 INFO [train.py:1046] (1/4) Epoch 22, batch 2500, loss[loss=0.1778, simple_loss=0.254, pruned_loss=0.05079, over 23617.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2487, pruned_loss=0.04829, over 4730624.71 frames. ], batch size: 135, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:04:31,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 05:04:33,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:04:38,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:04:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 05:04:48,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:49,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:04:49,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 05:04:54,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=760426.6666666666, ans=0.2 2023-10-02 05:04:54,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=760426.6666666666, ans=0.125 2023-10-02 05:04:56,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:04:58,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:04:58,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:04:58,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:04:59,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 05:05:00,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:05:01,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:05:01,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 05:05:01,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:03,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 05:05:03,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:03,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=760493.3333333334, ans=0.125 2023-10-02 05:05:04,481 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.863e+02 2.107e+02 2.380e+02 3.578e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-02 05:05:07,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:05:07,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:05:10,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:05:10,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. 
Duration: 0.86 2023-10-02 05:05:10,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:05:11,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:15,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:21,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:05:26,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:05:28,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 05:05:30,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:05:30,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:05:31,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:05:31,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:05:33,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. 
Duration: 0.896 2023-10-02 05:05:33,350 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 05:05:33,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 05:05:35,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:37,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 05:05:38,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 05:05:38,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:05:40,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 05:05:42,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 05:05:46,207 INFO [train.py:1046] (1/4) Epoch 22, batch 2550, loss[loss=0.1552, simple_loss=0.2317, pruned_loss=0.03931, over 24378.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2493, pruned_loss=0.04878, over 4722885.00 frames. ], batch size: 56, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:05:46,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:05:47,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:05:47,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:05:50,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:05:50,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 05:05:52,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:05:55,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 05:05:56,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:05:58,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:00,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=760760.0, ans=0.125 2023-10-02 05:06:01,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:06:01,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 05:06:01,939 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:06:03,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 05:06:03,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:06:03,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:06:06,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:06:06,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 05:06:06,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=760760.0, ans=0.125 2023-10-02 05:06:07,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:06:07,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:07,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 05:06:13,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=760760.0, ans=0.125 2023-10-02 05:06:17,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:06:20,179 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.37 vs. limit=15.0 2023-10-02 05:06:23,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:06:23,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:23,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:06:23,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:06:30,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:06:32,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:06:33,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:06:33,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:06:34,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 05:06:34,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:06:37,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:06:37,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:06:42,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:06:42,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 05:06:42,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:06:44,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:45,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:06:47,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:06:49,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:06:54,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:06:57,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:01,063 INFO [train.py:1046] (1/4) Epoch 22, batch 2600, loss[loss=0.1567, simple_loss=0.2401, pruned_loss=0.03663, over 24454.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2492, pruned_loss=0.0483, over 4728273.18 frames. ], batch size: 66, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:07:01,131 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. 
Duration: 0.896 2023-10-02 05:07:03,237 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 05:07:03,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:07:03,279 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 05:07:03,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 05:07:04,587 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 05:07:06,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:07:06,065 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 05:07:07,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 05:07:08,981 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 05:07:11,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:07:14,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 05:07:15,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 05:07:17,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 05:07:17,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 05:07:20,928 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 05:07:20,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 05:07:27,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:07:27,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:29,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:07:29,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 05:07:29,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=761160.0, ans=0.025 2023-10-02 05:07:30,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:07:33,967 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.850e+02 2.100e+02 2.387e+02 3.462e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-02 05:07:38,653 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 05:07:44,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:07:45,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:07:45,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 05:07:47,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:07:47,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:07:47,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 05:07:48,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:07:50,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:07:51,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:07:56,347 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 05:07:56,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:07:57,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:08:03,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:08:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:08:03,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 05:08:04,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:08:06,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:08:08,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:08:13,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 05:08:15,237 INFO [train.py:1046] (1/4) Epoch 22, batch 2650, loss[loss=0.1821, simple_loss=0.2601, pruned_loss=0.05201, over 23383.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2495, pruned_loss=0.04813, over 4736515.31 frames. ], batch size: 105, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:08:15,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:17,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:08:19,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 05:08:19,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:08:20,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.35 vs. limit=15.0 2023-10-02 05:08:21,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:08:23,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 05:08:23,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:08:24,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:26,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:08:27,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:08:30,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:08:30,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 05:08:30,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:08:32,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:08:35,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. 
Duration: 0.88 2023-10-02 05:08:36,334 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 05:08:38,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:08:40,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=761426.6666666666, ans=0.125 2023-10-02 05:08:42,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 05:08:44,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:08:44,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 05:08:48,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:08:48,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:08:48,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:08:49,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:08:51,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=761493.3333333334, ans=0.0 2023-10-02 05:08:52,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. 
Duration: 0.44 2023-10-02 05:08:52,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 05:08:56,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:09:01,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 05:09:01,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:09:01,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:02,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:09:03,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:09:03,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:09:04,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:09:05,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:09:05,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:09:07,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 05:09:08,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:09:11,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:11,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:09:11,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=761560.0, ans=0.125 2023-10-02 05:09:12,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:13,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:09:13,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:09:14,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=761626.6666666666, ans=0.125 2023-10-02 05:09:16,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:17,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:09:17,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 05:09:19,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 05:09:19,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=761626.6666666666, ans=0.125 2023-10-02 05:09:20,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:09:22,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=761626.6666666666, ans=0.125 2023-10-02 05:09:23,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:24,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:26,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:28,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:09:28,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:29,693 INFO [train.py:1046] (1/4) Epoch 22, batch 2700, loss[loss=0.1832, simple_loss=0.2685, pruned_loss=0.04899, over 24340.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2505, pruned_loss=0.04887, over 4729459.48 frames. ], batch size: 77, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:09:31,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:09:31,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 05:09:33,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:09:35,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 05:09:38,043 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.51 vs. limit=22.5 2023-10-02 05:09:38,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:09:38,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:38,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:38,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:09:39,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:39,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:09:39,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 05:09:39,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. 
Duration: 0.63 2023-10-02 05:09:39,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:09:41,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:09:43,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:09:44,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:47,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:09:47,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 05:09:49,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:09:53,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:09:53,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:09:59,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:09:59,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:09:59,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 05:10:01,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:10:02,473 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.897e+02 2.080e+02 2.344e+02 3.157e+02, threshold=4.159e+02, percent-clipped=0.0 2023-10-02 05:10:03,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:04,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=761826.6666666666, ans=0.125 2023-10-02 05:10:05,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=761826.6666666666, ans=0.0 2023-10-02 05:10:06,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:10:06,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:10:06,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:10:12,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:10:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 05:10:20,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:10:20,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=761893.3333333334, ans=0.2 2023-10-02 05:10:22,691 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.88 vs. limit=12.0 2023-10-02 05:10:23,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:10:23,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:24,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:26,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:28,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:10:28,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:29,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 05:10:31,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=761960.0, ans=0.0 2023-10-02 05:10:34,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:10:35,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:10:35,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:10:38,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 05:10:38,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:38,827 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.86 vs. limit=15.0 2023-10-02 05:10:40,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:10:40,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 05:10:41,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 05:10:42,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:44,758 INFO [train.py:1046] (1/4) Epoch 22, batch 2750, loss[loss=0.1842, simple_loss=0.2682, pruned_loss=0.05008, over 24330.00 frames. 
], tot_loss[loss=0.1737, simple_loss=0.2498, pruned_loss=0.04882, over 4732658.69 frames. ], batch size: 74, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:10:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:10:46,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:47,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:48,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:10:48,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:53,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:10:53,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:10:53,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:10:54,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:54,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 05:10:54,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 05:10:54,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:11:00,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 05:11:02,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:11:02,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:03,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:11:03,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:11:05,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:11:06,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:11:07,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:07,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:13,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:11:13,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 05:11:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:11:15,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:17,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:11:17,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=762160.0, ans=0.0 2023-10-02 05:11:17,446 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:11:18,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=762160.0, ans=0.125 2023-10-02 05:11:22,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:25,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:11:25,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:11:30,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:11:30,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 05:11:31,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.44 vs. limit=12.0 2023-10-02 05:11:36,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:11:36,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:11:36,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 05:11:42,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:11:43,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 05:11:49,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:11:51,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:11:51,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 05:11:52,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:11:52,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:11:54,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. 
Duration: 0.77 2023-10-02 05:11:54,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:11:57,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 05:11:58,732 INFO [train.py:1046] (1/4) Epoch 22, batch 2800, loss[loss=0.1605, simple_loss=0.2494, pruned_loss=0.03579, over 24626.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2489, pruned_loss=0.04871, over 4718466.00 frames. ], batch size: 73, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:11:58,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:11:58,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:00,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 05:12:00,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:00,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:02,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 05:12:03,546 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. 
Duration: 0.759 2023-10-02 05:12:06,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:09,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:12:09,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:12:12,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:12:13,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 05:12:15,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 05:12:16,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 05:12:19,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:12:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:12:19,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:12:23,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:12:24,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:12:24,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:12:25,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:12:31,712 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.904e+02 2.151e+02 2.380e+02 3.525e+02, threshold=4.302e+02, percent-clipped=0.0 2023-10-02 05:12:34,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:12:36,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:37,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:39,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:12:39,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:12:43,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:12:43,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 05:12:44,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:12:45,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=762560.0, ans=0.125 2023-10-02 05:12:46,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:12:46,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:12:50,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:53,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=762560.0, ans=0.125 2023-10-02 05:12:55,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:12:57,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:12:57,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:12:58,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 05:12:58,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:13:00,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:13:00,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 05:13:01,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:02,278 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.11 vs. limit=10.0 2023-10-02 05:13:03,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:13:03,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:04,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 05:13:05,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:05,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:13:08,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 05:13:08,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 05:13:10,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=762626.6666666666, ans=0.125 2023-10-02 05:13:12,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.22 vs. limit=15.0 2023-10-02 05:13:13,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.03 vs. limit=15.0 2023-10-02 05:13:14,072 INFO [train.py:1046] (1/4) Epoch 22, batch 2850, loss[loss=0.1839, simple_loss=0.2532, pruned_loss=0.05723, over 23832.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2485, pruned_loss=0.04836, over 4721127.74 frames. ], batch size: 164, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:13:14,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:13:14,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:13:15,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:13:18,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:13:21,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:13:22,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:13:22,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:13:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:26,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:13:27,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:13:27,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=762760.0, ans=0.125 2023-10-02 05:13:28,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 05:13:33,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 05:13:33,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:13:34,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 05:13:36,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:36,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=762760.0, ans=0.125 2023-10-02 05:13:37,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. 
Duration: 0.86 2023-10-02 05:13:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 05:13:39,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=762760.0, ans=0.125 2023-10-02 05:13:42,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:44,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=762826.6666666666, ans=0.1 2023-10-02 05:13:44,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=762826.6666666666, ans=0.125 2023-10-02 05:13:45,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=762826.6666666666, ans=0.95 2023-10-02 05:13:52,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:53,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:13:54,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:13:55,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:13:55,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 05:13:55,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:13:55,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=762826.6666666666, ans=0.0 2023-10-02 05:13:57,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:13:57,377 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:13:58,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 05:14:00,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:14:00,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:14:00,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:14:01,202 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.33 vs. limit=15.0 2023-10-02 05:14:01,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:03,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=762893.3333333334, ans=0.125 2023-10-02 05:14:04,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:14:05,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:07,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:09,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:14:11,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:14:11,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:13,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:15,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.50 vs. limit=22.5 2023-10-02 05:14:15,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:14:18,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:14:20,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 05:14:21,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 05:14:23,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 05:14:23,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:23,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 05:14:24,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:14:25,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:25,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:14:26,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:14:26,928 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 05:14:26,977 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 05:14:26,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:14:27,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:28,341 INFO [train.py:1046] (1/4) Epoch 22, batch 2900, loss[loss=0.1951, simple_loss=0.2704, pruned_loss=0.05987, over 23375.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2488, pruned_loss=0.04807, over 4736181.24 frames. 
], batch size: 93, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:14:30,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=763026.6666666666, ans=0.0 2023-10-02 05:14:31,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:14:32,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:14:32,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:14:34,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 05:14:34,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=763026.6666666666, ans=0.2 2023-10-02 05:14:38,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:38,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 05:14:39,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 05:14:41,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:14:41,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 05:14:43,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:43,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:14:43,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=763093.3333333334, ans=0.125 2023-10-02 05:14:47,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:14:47,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:49,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=763093.3333333334, ans=0.125 2023-10-02 05:14:50,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:14:50,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=763093.3333333334, ans=0.0 2023-10-02 05:14:51,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 05:14:51,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:14:53,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:57,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. 
Duration: 0.99 2023-10-02 05:14:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 05:14:59,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:59,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 05:15:00,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:15:02,607 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.846e+02 2.042e+02 2.296e+02 2.937e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 05:15:03,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:15:04,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:15:06,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:15:07,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:15:10,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:15:13,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:14,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. 
Duration: 0.71 2023-10-02 05:15:14,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 05:15:14,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:15:18,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:15:20,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 05:15:21,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:15:26,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:15:35,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:15:35,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:15:37,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 05:15:41,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:41,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 05:15:41,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:15:41,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:15:42,581 INFO [train.py:1046] (1/4) Epoch 22, batch 2950, loss[loss=0.1942, simple_loss=0.2619, pruned_loss=0.06324, over 23807.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2496, pruned_loss=0.04825, over 4747713.99 frames. ], batch size: 195, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:15:46,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:15:48,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 05:15:50,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:15:50,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:51,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:15:54,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:15:54,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 05:15:55,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 05:15:55,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 05:15:55,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:16:02,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:16:05,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:16:07,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:16:08,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:16:11,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:16:11,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:16:14,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:16:14,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:16:14,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:16:17,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 05:16:21,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. 
Duration: 0.95275 2023-10-02 05:16:22,015 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 05:16:22,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:16:23,495 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 05:16:26,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 05:16:26,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:16:26,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:16:26,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 05:16:26,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:16:29,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 05:16:29,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:16:29,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:16:32,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:16:34,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:16:34,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:34,379 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 05:16:35,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:16:35,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 05:16:36,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=763560.0, ans=0.0 2023-10-02 05:16:36,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=763560.0, ans=22.5 2023-10-02 05:16:39,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=763560.0, ans=0.025 2023-10-02 05:16:43,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:44,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:16:44,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 05:16:44,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:16:45,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 05:16:50,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:16:52,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:16:53,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:16:53,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:53,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:16:54,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:16:57,526 INFO [train.py:1046] (1/4) Epoch 22, batch 3000, loss[loss=0.1688, simple_loss=0.2592, pruned_loss=0.03924, over 24638.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2498, pruned_loss=0.0482, over 4752025.77 frames. ], batch size: 73, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:16:57,526 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 05:17:15,408 INFO [train.py:1078] (1/4) Epoch 22, validation: loss=0.3452, simple_loss=0.2763, pruned_loss=0.2071, over 1125622.00 frames. 2023-10-02 05:17:15,409 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20387MB 2023-10-02 05:17:15,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:17:15,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:17:15,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:17:15,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:17:16,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:17:16,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:16,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 05:17:18,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:20,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=763693.3333333334, ans=0.02 2023-10-02 05:17:21,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:17:21,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:17:24,072 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 05:17:25,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. 
Duration: 0.98 2023-10-02 05:17:27,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:17:28,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:17:28,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 05:17:30,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:17:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:17:45,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:17:49,822 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.840e+02 2.064e+02 2.389e+02 3.388e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-02 05:17:52,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 05:17:52,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:17:55,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:17:56,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:17:56,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:17:59,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:17:59,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 05:18:00,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 05:18:01,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:18:03,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:18:04,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:18:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:18:06,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:06,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:18:09,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:18:10,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:18:10,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 05:18:13,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:18:15,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 05:18:17,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:18:17,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:17,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:18:21,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:22,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:22,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 05:18:22,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 05:18:24,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:18:24,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 05:18:25,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 05:18:27,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 05:18:29,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:18:29,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:18:29,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 05:18:29,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 05:18:29,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:18:30,962 INFO [train.py:1046] (1/4) Epoch 22, batch 3050, loss[loss=0.1764, simple_loss=0.2645, pruned_loss=0.04415, over 24425.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2508, pruned_loss=0.04875, over 4740076.15 frames. ], batch size: 69, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:18:31,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:18:32,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:32,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:18:32,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 05:18:33,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:18:35,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 05:18:40,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:18:41,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:18:41,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:18:44,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:47,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 05:18:48,523 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.00 vs. limit=10.0 2023-10-02 05:18:53,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 05:18:53,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 05:18:54,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:18:57,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 05:19:00,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:00,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:19:00,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:00,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=764160.0, ans=0.1 2023-10-02 05:19:03,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:19:05,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:19:05,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:05,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:19:05,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:08,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 05:19:08,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=764160.0, ans=0.025 2023-10-02 05:19:09,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:13,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:13,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 05:19:14,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:14,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:19:18,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:19:18,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:19:19,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:19:19,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:21,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.38 vs. 
limit=22.5 2023-10-02 05:19:22,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=764226.6666666666, ans=0.2 2023-10-02 05:19:25,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:26,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:30,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:30,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:19:30,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:32,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:19:33,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:19:33,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:19:35,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 05:19:38,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:19:38,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:19:38,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 05:19:39,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:45,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:46,784 INFO [train.py:1046] (1/4) Epoch 22, batch 3100, loss[loss=0.1549, simple_loss=0.2285, pruned_loss=0.04066, over 24451.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2507, pruned_loss=0.04852, over 4731819.89 frames. ], batch size: 58, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:19:48,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:19:49,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:19:52,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 05:19:52,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=764360.0, ans=0.0 2023-10-02 05:19:55,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 05:19:55,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 05:19:58,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 05:20:00,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:20:00,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:02,903 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.31 vs. limit=22.5 2023-10-02 05:20:03,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 05:20:07,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:10,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=764426.6666666666, ans=0.0 2023-10-02 05:20:11,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 05:20:18,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:20:19,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:19,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 05:20:20,665 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.816e+02 1.968e+02 2.176e+02 3.408e+02, threshold=3.937e+02, percent-clipped=0.0 2023-10-02 05:20:20,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:20:20,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 05:20:20,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:20:20,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 05:20:20,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:20:22,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:23,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 05:20:25,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:20:29,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:20:29,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 05:20:30,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-10-02 05:20:30,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:30,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:34,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:20:34,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:35,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:20:35,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:20:35,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:20:36,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=12.0 2023-10-02 05:20:38,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:20:38,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:20:38,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:20:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:20:42,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:20:44,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 05:20:47,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:20:48,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 05:20:49,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:20:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:49,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 05:20:53,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=764626.6666666666, ans=0.0 2023-10-02 05:20:55,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=764626.6666666666, ans=0.1 2023-10-02 05:20:59,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 05:21:00,692 INFO [train.py:1046] (1/4) Epoch 22, batch 3150, loss[loss=0.1656, simple_loss=0.2301, pruned_loss=0.05055, over 23706.00 frames. 
], tot_loss[loss=0.1729, simple_loss=0.2493, pruned_loss=0.04826, over 4712878.11 frames. ], batch size: 232, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:21:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:02,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:04,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:21:04,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:21:04,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 05:21:05,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:05,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:21:06,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 05:21:09,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:11,372 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 05:21:14,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. 
Duration: 0.53 2023-10-02 05:21:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:21:15,530 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 05:21:17,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 05:21:19,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 05:21:20,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 05:21:20,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 05:21:20,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:20,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:21:22,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:23,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 05:21:24,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:25,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:21:26,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:21:27,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:21:30,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=764826.6666666666, ans=0.125 2023-10-02 05:21:31,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 05:21:31,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:21:35,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:21:36,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:21:36,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 05:21:36,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=764826.6666666666, ans=0.125 2023-10-02 05:21:38,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-10-02 05:21:39,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. 
Duration: 0.59 2023-10-02 05:21:41,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:21:41,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:21:41,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:21:42,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:42,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:21:45,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:21:45,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:21:45,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 05:21:46,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:21:46,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:21:47,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:21:48,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:21:48,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 05:21:50,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:21:52,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 05:21:52,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:21:52,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 05:21:54,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 05:21:55,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:21:55,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:21:56,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 05:21:58,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 05:21:58,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:21:58,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=764960.0, ans=0.125 2023-10-02 05:21:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:22:01,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:02,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:22:07,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:22:08,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:08,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=764960.0, ans=0.0 2023-10-02 05:22:10,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 05:22:14,643 INFO [train.py:1046] (1/4) Epoch 22, batch 3200, loss[loss=0.16, simple_loss=0.2499, pruned_loss=0.03507, over 24647.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2488, pruned_loss=0.04797, over 4719251.37 frames. ], batch size: 73, lr: 4.66e-03, grad_scale: 32.0 2023-10-02 05:22:16,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:22:16,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 05:22:21,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:21,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:22:22,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 05:22:22,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=765026.6666666666, ans=0.0 2023-10-02 05:22:25,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:22:28,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:22:29,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=22.5 2023-10-02 05:22:31,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:40,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:22:49,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 05:22:50,473 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.957e+02 2.150e+02 2.421e+02 3.635e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-02 05:22:50,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:22:53,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 05:22:54,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:22:58,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:22:58,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:22:59,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:23:03,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 05:23:05,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 05:23:06,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 05:23:06,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=765226.6666666666, ans=0.0 2023-10-02 05:23:09,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 05:23:10,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=765226.6666666666, ans=0.125 2023-10-02 05:23:13,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 05:23:18,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:18,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:23:19,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:20,038 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 05:23:20,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:23:20,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=765293.3333333334, ans=0.1 2023-10-02 05:23:25,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:23:26,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 05:23:26,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 05:23:28,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 05:23:29,516 INFO [train.py:1046] (1/4) Epoch 22, batch 3250, loss[loss=0.1636, simple_loss=0.2388, pruned_loss=0.04417, over 23473.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2487, pruned_loss=0.0477, over 4720377.60 frames. 
], batch size: 134, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:23:29,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 05:23:31,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:23:33,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:23:33,865 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 05:23:33,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:23:33,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:35,367 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 05:23:39,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:23:41,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:23:48,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:23:48,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 05:23:49,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:23:51,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:51,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:23:53,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:23:53,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:23:56,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:23:56,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:23:56,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:24:00,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:03,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 05:24:05,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:24:05,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:24:06,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:24:06,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:24:06,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:24:11,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 05:24:11,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:24:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:24:12,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:24:12,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=765560.0, ans=0.125 2023-10-02 05:24:12,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=765560.0, ans=0.1 2023-10-02 05:24:14,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:24:19,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:24:27,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:24:27,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:27,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 05:24:27,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:24:27,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:24:29,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:31,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 05:24:31,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. 
Duration: 0.94 2023-10-02 05:24:31,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:24:33,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:33,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:24:33,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=765626.6666666666, ans=0.125 2023-10-02 05:24:34,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 05:24:34,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:24:39,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:24:39,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:24:40,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 05:24:40,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:24:43,106 INFO [train.py:1046] (1/4) Epoch 22, batch 3300, loss[loss=0.1733, simple_loss=0.2437, pruned_loss=0.05144, over 23588.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2493, pruned_loss=0.04832, over 4716751.98 frames. 
], batch size: 256, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:24:43,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:24:43,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 05:24:45,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:24:47,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 05:24:48,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 05:24:50,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 05:24:51,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:54,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:24:54,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:24:56,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:57,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:24:57,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 05:24:58,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765760.0, ans=0.1 2023-10-02 05:25:01,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:01,480 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-10-02 05:25:02,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:25:02,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=765760.0, ans=0.125 2023-10-02 05:25:03,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765760.0, ans=0.1 2023-10-02 05:25:06,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 05:25:06,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:25:06,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:09,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:09,968 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.19 vs. 
limit=22.5 2023-10-02 05:25:10,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 05:25:12,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:25:13,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:25:13,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:25:13,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:13,463 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 05:25:18,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:25:18,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:25:19,560 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.834e+02 2.092e+02 2.306e+02 3.229e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 05:25:19,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:19,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 05:25:21,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. 
Duration: 0.336375 2023-10-02 05:25:21,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:23,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:25:24,468 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 05:25:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 05:25:26,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:25:26,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=765826.6666666666, ans=0.07 2023-10-02 05:25:27,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 05:25:31,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:25:31,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=765893.3333333334, ans=0.125 2023-10-02 05:25:34,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:25:34,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:25:38,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 05:25:38,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:38,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:25:39,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:25:42,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:25:42,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:43,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:25:46,459 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 05:25:46,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 05:25:47,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:25:48,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:25:48,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:25:49,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:25:49,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:25:51,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:25:53,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:53,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:25:54,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:54,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:25:57,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 05:25:57,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:25:59,257 INFO [train.py:1046] (1/4) Epoch 22, batch 3350, loss[loss=0.1682, simple_loss=0.2523, pruned_loss=0.04209, over 24441.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2502, pruned_loss=0.04847, over 4729301.64 frames. ], batch size: 69, lr: 4.66e-03, grad_scale: 8.0 2023-10-02 05:25:59,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:25:59,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=766026.6666666666, ans=0.0 2023-10-02 05:26:02,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:26:02,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:26:03,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:06,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:26:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:08,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:26:09,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:11,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:26:11,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=766026.6666666666, ans=0.125 2023-10-02 05:26:12,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:26:15,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:26:16,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:16,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:26:18,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 05:26:20,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 05:26:20,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 05:26:24,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 05:26:25,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:26:26,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:26:27,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:27,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. 
Duration: 0.81 2023-10-02 05:26:27,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:27,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:26:30,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:30,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:30,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:32,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:26:32,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=766160.0, ans=0.125 2023-10-02 05:26:34,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=766160.0, ans=0.125 2023-10-02 05:26:37,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:40,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:40,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:26:44,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:26:45,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:46,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:47,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:47,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.55 vs. limit=15.0 2023-10-02 05:26:48,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:49,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 05:26:49,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:26:49,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 05:26:51,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:26:51,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 05:26:52,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:26:53,229 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:26:55,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:56,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=766226.6666666666, ans=0.2 2023-10-02 05:27:00,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:27:02,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 05:27:02,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:27:02,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:27:04,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:27:04,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=766293.3333333334, ans=0.1 2023-10-02 05:27:07,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:27:10,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 05:27:10,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 05:27:10,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:27:10,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=766293.3333333334, ans=0.125 2023-10-02 05:27:11,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:27:13,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 05:27:14,635 INFO [train.py:1046] (1/4) Epoch 22, batch 3400, loss[loss=0.1503, simple_loss=0.2264, pruned_loss=0.03711, over 24471.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2502, pruned_loss=0.04823, over 4731651.40 frames. ], batch size: 58, lr: 4.66e-03, grad_scale: 8.0 2023-10-02 05:27:14,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:27:14,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 05:27:16,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:27:18,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:27:18,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:27:19,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 05:27:19,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 05:27:25,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 05:27:25,417 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 05:27:25,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:27:28,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:27:28,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:27:29,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:27:31,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:27:37,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:27:38,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 05:27:40,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=766426.6666666666, ans=0.2 2023-10-02 05:27:43,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 05:27:45,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:27:46,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:27:47,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:27:54,016 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.880e+02 2.089e+02 2.300e+02 3.158e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 05:27:54,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:27:55,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 05:28:01,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:28:02,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:28:02,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 05:28:03,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.13 vs. limit=6.0 2023-10-02 05:28:03,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:28:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:05,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:28:05,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:28:07,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:28:12,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:28:12,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:28:16,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:28:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 05:28:19,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.90 vs. limit=15.0 2023-10-02 05:28:22,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 05:28:22,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=766626.6666666666, ans=0.125 2023-10-02 05:28:27,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 05:28:30,081 INFO [train.py:1046] (1/4) Epoch 22, batch 3450, loss[loss=0.1611, simple_loss=0.2451, pruned_loss=0.03853, over 24647.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2504, pruned_loss=0.04883, over 4702862.22 frames. ], batch size: 65, lr: 4.65e-03, grad_scale: 4.0 2023-10-02 05:28:30,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 05:28:30,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:28:33,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:28:33,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 05:28:33,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:28:36,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:28:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:28:45,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:28:46,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:28:46,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:48,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:54,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 05:28:59,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 05:28:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:28:59,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:29:02,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:07,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 05:29:08,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:29:11,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:29:11,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:29:13,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:29:15,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:29:16,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 05:29:16,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:29:18,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:29:20,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:29:23,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 05:29:29,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:29:34,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:29:35,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:39,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:42,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:29:43,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:29:43,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:29:44,734 INFO [train.py:1046] (1/4) Epoch 22, batch 3500, loss[loss=0.1536, simple_loss=0.2408, pruned_loss=0.0332, over 24511.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2491, pruned_loss=0.04818, over 4710342.39 frames. ], batch size: 63, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:29:44,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:29:47,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=767026.6666666666, ans=0.1 2023-10-02 05:29:49,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:52,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:29:53,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 05:29:55,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:29:57,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. 
Duration: 0.588875 2023-10-02 05:29:57,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=767026.6666666666, ans=0.0 2023-10-02 05:29:59,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:59,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 05:30:04,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:30:04,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:30:06,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:30:06,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:06,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=767093.3333333334, ans=0.1 2023-10-02 05:30:07,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:30:07,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:09,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:30:09,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. 
Duration: 0.86 2023-10-02 05:30:12,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:13,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:30:15,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:30:18,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:18,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=767160.0, ans=0.0 2023-10-02 05:30:19,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 05:30:19,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:30:20,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:30:23,754 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.866e+02 2.009e+02 2.229e+02 3.626e+02, threshold=4.019e+02, percent-clipped=0.0 2023-10-02 05:30:23,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:30:25,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:26,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 05:30:26,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:30:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 05:30:28,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 05:30:29,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 05:30:31,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:30:32,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:32,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:32,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:30:35,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=767226.6666666666, ans=0.125 2023-10-02 05:30:36,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:30:37,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:30:41,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:30:43,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 05:30:43,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 05:30:43,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:30:46,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:30:47,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:30:49,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:50,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 05:30:50,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:30:50,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=767293.3333333334, ans=0.125 2023-10-02 05:30:52,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:53,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 05:30:54,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. 
Duration: 0.45 2023-10-02 05:30:56,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:56,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=767293.3333333334, ans=0.2 2023-10-02 05:30:56,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.72 vs. limit=12.0 2023-10-02 05:30:57,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:30:57,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:30:57,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:30:59,168 INFO [train.py:1046] (1/4) Epoch 22, batch 3550, loss[loss=0.163, simple_loss=0.2517, pruned_loss=0.03712, over 24676.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2482, pruned_loss=0.04814, over 4712907.13 frames. ], batch size: 73, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:31:01,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:31:06,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=767360.0, ans=0.07 2023-10-02 05:31:09,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:10,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-02 05:31:10,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=767360.0, ans=0.0 2023-10-02 05:31:14,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:31:14,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:31:17,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:17,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:31:18,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:31:21,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:31:22,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:31:22,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:24,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:31:24,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:31:30,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 05:31:31,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:31:31,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:31:31,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:33,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:31:33,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 05:31:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:34,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:35,550 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.43 vs. limit=10.0 2023-10-02 05:31:36,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:31:42,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:31:42,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:31:44,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:31:46,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 05:31:47,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:31:48,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 05:31:48,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:31:49,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:31:49,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:31:54,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 05:31:55,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:31:57,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=767626.6666666666, ans=0.0 2023-10-02 05:31:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:32:00,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. 
Duration: 0.89 2023-10-02 05:32:02,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:06,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:32:07,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 05:32:13,856 INFO [train.py:1046] (1/4) Epoch 22, batch 3600, loss[loss=0.1551, simple_loss=0.2382, pruned_loss=0.03602, over 24488.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.248, pruned_loss=0.04808, over 4703570.23 frames. ], batch size: 63, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:32:15,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 05:32:15,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:32:15,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:32:16,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:18,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:19,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:32:22,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:32:23,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:25,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:32:25,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:32:25,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:26,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 05:32:28,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:32:29,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:31,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:32:33,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=767760.0, ans=0.125 2023-10-02 05:32:34,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:32:35,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:32:35,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:32:35,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 05:32:36,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=767760.0, ans=0.0 2023-10-02 05:32:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:32:40,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:41,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:32:43,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:32:45,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=767826.6666666666, ans=0.07 2023-10-02 05:32:46,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:32:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:32:49,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-10-02 05:32:49,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=767826.6666666666, ans=0.0 2023-10-02 05:32:52,055 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.859e+02 2.035e+02 2.316e+02 3.119e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-02 05:32:54,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.47 vs. limit=22.5 2023-10-02 05:32:54,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:32:56,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:32:56,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 05:33:00,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:33:07,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:08,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:12,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:33:12,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 05:33:12,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 05:33:13,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 05:33:15,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 05:33:18,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:33:18,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:33:18,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 05:33:19,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:33:19,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:33:19,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:33:20,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 05:33:23,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 05:33:25,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:33:25,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 05:33:28,014 INFO [train.py:1046] (1/4) Epoch 22, batch 3650, loss[loss=0.1626, simple_loss=0.2393, pruned_loss=0.04294, over 23320.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2485, pruned_loss=0.04791, over 4712626.12 frames. ], batch size: 119, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:33:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 05:33:31,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:33:36,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 05:33:38,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 05:33:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:33:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:33:44,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:33:48,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:33:48,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:33:49,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 05:33:50,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:33:50,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:33:51,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 05:33:52,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:33:52,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:33:52,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:33:52,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=768093.3333333334, ans=0.1 2023-10-02 05:33:55,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:33:57,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 05:33:59,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. 
Duration: 0.87 2023-10-02 05:33:59,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=768160.0, ans=0.2 2023-10-02 05:34:00,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:34:01,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 05:34:03,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:34:03,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:34:09,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:34:10,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:34:10,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:34:12,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:34:14,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:34:16,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:34:19,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:34:20,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:20,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:34:23,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:34:24,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:34:24,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:34:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 05:34:32,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:34:32,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:34:35,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:34:35,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:37,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:34:38,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 05:34:40,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 05:34:40,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:42,048 INFO [train.py:1046] (1/4) Epoch 22, batch 3700, loss[loss=0.178, simple_loss=0.2655, pruned_loss=0.04526, over 24316.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2488, pruned_loss=0.04804, over 4709720.14 frames. ], batch size: 74, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:34:42,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:34:44,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:34:46,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:34:49,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:49,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 05:34:49,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:49,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:34:51,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 05:34:51,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=768360.0, ans=0.2 2023-10-02 05:34:53,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=768360.0, ans=0.0 2023-10-02 05:34:53,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=768360.0, ans=0.09899494936611666 2023-10-02 05:34:56,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:34:57,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:34:57,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:34:59,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:35:00,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:35:00,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:35:00,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=768426.6666666666, ans=0.0 2023-10-02 05:35:00,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=768426.6666666666, ans=0.125 2023-10-02 05:35:03,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:35:04,625 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 05:35:10,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=768493.3333333334, ans=0.125 2023-10-02 05:35:11,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:35:13,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:35:14,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:35:14,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 05:35:14,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:35:18,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:18,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 05:35:19,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=768493.3333333334, ans=0.09899494936611666 2023-10-02 05:35:20,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:35:21,392 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.818e+02 2.091e+02 2.478e+02 3.925e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 05:35:21,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:35:23,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=768493.3333333334, ans=0.0 2023-10-02 05:35:24,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:24,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:35:27,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:35:31,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:35:31,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 05:35:33,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:35:33,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 05:35:36,383 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=12.0 2023-10-02 05:35:37,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:35:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:35:39,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:35:39,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 05:35:43,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:35:43,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:35:43,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:35:44,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:35:48,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:35:49,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 05:35:49,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 05:35:50,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:35:50,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 05:35:50,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:35:51,203 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:35:52,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:35:55,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:56,805 INFO [train.py:1046] (1/4) Epoch 22, batch 3750, loss[loss=0.1651, simple_loss=0.255, pruned_loss=0.03761, over 24632.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2505, pruned_loss=0.04901, over 4708668.96 frames. ], batch size: 68, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:35:56,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:35:58,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:01,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 05:36:01,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.20 vs. limit=15.0 2023-10-02 05:36:02,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 05:36:03,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 05:36:05,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 05:36:05,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:36:06,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:36:06,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:36:09,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:36:13,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:36:16,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:36:18,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:36:20,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:36:25,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:36:25,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 05:36:26,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 05:36:28,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:36:28,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:36:30,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 05:36:33,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 05:36:35,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:36:36,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:36:36,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2.whitening_limit, batch_count=768826.6666666666, ans=15.0 2023-10-02 05:36:39,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:36:43,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:45,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:36:46,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.78 vs. 
limit=12.0 2023-10-02 05:36:48,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 05:36:48,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=768893.3333333334, ans=0.125 2023-10-02 05:36:50,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:51,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=768893.3333333334, ans=0.125 2023-10-02 05:36:53,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:36:53,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:36:58,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:37:02,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:37:03,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:37:05,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:37:05,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=768960.0, ans=0.2 2023-10-02 05:37:06,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 05:37:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:37:10,675 INFO [train.py:1046] (1/4) Epoch 22, batch 3800, loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.04981, over 23456.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2501, pruned_loss=0.04881, over 4725473.40 frames. ], batch size: 134, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:37:16,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:37:18,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=769026.6666666666, ans=0.2 2023-10-02 05:37:19,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:21,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:37:22,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 05:37:24,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:37:24,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:37:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:37:29,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-02 05:37:29,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:29,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:37:30,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:37:30,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:37:31,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:33,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 05:37:36,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 05:37:36,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:37:38,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:37:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:37:41,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:37:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 05:37:43,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:43,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=769160.0, ans=0.1 2023-10-02 05:37:46,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:46,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:51,711 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.862e+02 2.088e+02 2.377e+02 3.435e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-02 05:37:53,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:37:53,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 05:37:53,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:38:01,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:38:05,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:38:06,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 05:38:08,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. 
Duration: 0.84 2023-10-02 05:38:08,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:10,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:38:12,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:12,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 05:38:14,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=769293.3333333334, ans=0.2 2023-10-02 05:38:17,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 05:38:17,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 05:38:17,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:17,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=769293.3333333334, ans=0.125 2023-10-02 05:38:20,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:38:25,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=769360.0, ans=0.0 2023-10-02 05:38:27,285 INFO [train.py:1046] (1/4) Epoch 22, batch 3850, loss[loss=0.1805, simple_loss=0.261, pruned_loss=0.05004, over 23183.00 frames. 
], tot_loss[loss=0.1731, simple_loss=0.2487, pruned_loss=0.04876, over 4701574.56 frames. ], batch size: 93, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:38:27,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:38:27,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:38:31,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:38:31,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 05:38:33,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:38:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:35,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:38:37,962 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.48 vs. limit=15.0 2023-10-02 05:38:38,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:40,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:38:41,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-02 05:38:47,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:38:50,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:51,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=769426.6666666666, ans=0.125 2023-10-02 05:38:52,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:38:53,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:38:54,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:38:55,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:38:55,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:55,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:38:57,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:38:58,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:38:58,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:00,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:39:00,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 05:39:00,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 05:39:01,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:39:01,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:04,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:04,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:05,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 05:39:07,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 05:39:09,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:12,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. 
Duration: 0.95 2023-10-02 05:39:13,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:39:19,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:20,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:22,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=769560.0, ans=0.0 2023-10-02 05:39:23,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:25,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 05:39:27,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 05:39:29,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:31,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:32,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:39:32,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:39:33,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:39:33,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:33,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:39:33,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 05:39:35,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:39:36,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 05:39:36,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:36,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:39,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:39:40,609 INFO [train.py:1046] (1/4) Epoch 22, batch 3900, loss[loss=0.194, simple_loss=0.2751, pruned_loss=0.05647, over 24091.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2471, pruned_loss=0.04859, over 4694655.33 frames. ], batch size: 80, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:39:40,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:42,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 05:39:42,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:42,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:42,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:39:43,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 05:39:44,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:48,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:39:49,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:39:49,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:39:51,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:39:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:39:52,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=769693.3333333334, ans=0.0 2023-10-02 05:39:53,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:39:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:39:57,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 05:39:57,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:39:59,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 05:39:59,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:40:00,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 05:40:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 05:40:04,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:40:05,722 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.94 vs. limit=12.0 2023-10-02 05:40:06,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:40:06,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:40:07,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 05:40:09,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.78 vs. limit=6.0 2023-10-02 05:40:13,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:40:14,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:40:19,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:40:19,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:40:20,554 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.829e+02 1.934e+02 2.273e+02 3.864e+02, threshold=3.868e+02, percent-clipped=0.0 2023-10-02 05:40:20,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:40:20,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=769826.6666666666, ans=0.125 2023-10-02 05:40:25,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-10-02 05:40:26,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:40:26,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 05:40:26,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=769893.3333333334, ans=0.1 2023-10-02 05:40:33,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:40:34,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:40:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:40:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:47,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 05:40:47,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 05:40:48,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:48,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 05:40:50,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:40:51,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 05:40:54,155 INFO [train.py:1046] (1/4) Epoch 22, batch 3950, loss[loss=0.1774, simple_loss=0.2674, pruned_loss=0.04365, over 24312.00 frames. 
], tot_loss[loss=0.172, simple_loss=0.2476, pruned_loss=0.04824, over 4709592.60 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:40:56,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:40:58,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 05:40:58,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:41:01,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:41:04,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:41:08,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.30 vs. limit=15.0 2023-10-02 05:41:10,431 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 05:41:10,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:41:10,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 05:41:11,783 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 05:41:11,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:41:14,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:41:14,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:41:14,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:41:17,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 05:41:18,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.51 vs. limit=10.0 2023-10-02 05:41:19,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:41:21,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:41:21,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:41:21,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:41:22,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:41:25,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.97 vs. 
limit=6.0 2023-10-02 05:41:31,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:41:33,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:41:38,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 05:41:40,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=770226.6666666666, ans=0.125 2023-10-02 05:41:43,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 05:41:43,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 05:41:43,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:41:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:41:52,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:41:53,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:41:54,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:41:54,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 05:41:54,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 05:41:58,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:42:00,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:42:04,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 05:42:09,563 INFO [train.py:1046] (1/4) Epoch 22, batch 4000, loss[loss=0.176, simple_loss=0.264, pruned_loss=0.04398, over 24429.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2486, pruned_loss=0.0483, over 4725568.71 frames. ], batch size: 69, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:42:14,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:14,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=770360.0, ans=0.125 2023-10-02 05:42:20,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:26,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:42:26,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:42:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:42:27,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 05:42:27,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:42:28,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 05:42:28,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:42:28,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 05:42:32,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:42:34,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:42:34,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:42:34,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:42:34,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:42:34,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:42:36,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 05:42:38,221 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 05:42:38,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:42:39,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:42:41,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 05:42:42,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:42:42,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:42:43,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.40 vs. limit=15.0 2023-10-02 05:42:45,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=770493.3333333334, ans=0.125 2023-10-02 05:42:47,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 05:42:48,840 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.409e+02 1.823e+02 2.125e+02 2.454e+02 3.151e+02, threshold=4.250e+02, percent-clipped=0.0 2023-10-02 05:42:48,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:42:51,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.19 vs. 
limit=15.0 2023-10-02 05:42:53,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:42:54,705 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 05:42:56,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:42:58,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 05:42:58,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:42:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:42:59,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:43:00,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:43:00,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:43:02,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:43:02,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 05:43:02,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 05:43:04,044 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 05:43:08,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:43:08,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=770626.6666666666, ans=0.0 2023-10-02 05:43:10,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=770626.6666666666, ans=0.125 2023-10-02 05:43:12,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 05:43:14,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:43:14,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:43:15,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:43:15,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:43:20,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:43:21,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=770693.3333333334, ans=0.0 2023-10-02 05:43:22,904 INFO [train.py:1046] (1/4) Epoch 22, batch 4050, loss[loss=0.1813, simple_loss=0.2616, pruned_loss=0.0505, over 24013.00 frames. 
], tot_loss[loss=0.1727, simple_loss=0.2487, pruned_loss=0.04838, over 4724617.49 frames. ], batch size: 80, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:43:24,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:43:26,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 05:43:26,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:43:26,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=770693.3333333334, ans=0.125 2023-10-02 05:43:28,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:43:28,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:43:29,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:43:29,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:43:33,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:43:37,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=770760.0, ans=0.2 2023-10-02 05:43:38,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 05:43:39,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:43:41,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:43:43,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:43:46,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:43:46,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:43:48,357 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=22.5 2023-10-02 05:43:49,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 05:43:50,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 05:43:51,756 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 05:43:54,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:43:54,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=770826.6666666666, ans=0.125 2023-10-02 05:44:02,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-02 05:44:02,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=770826.6666666666, ans=0.125 2023-10-02 05:44:03,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:44:08,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:44:11,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:44:11,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:44:12,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:44:14,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=770893.3333333334, ans=0.1 2023-10-02 05:44:16,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:44:17,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 05:44:17,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:44:19,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:44:20,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 05:44:22,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=770960.0, ans=0.125 2023-10-02 05:44:24,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:44:31,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 05:44:32,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:44:32,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:44:34,726 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-10-02 05:44:35,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 05:44:35,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 05:44:35,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:38,158 INFO [train.py:1046] (1/4) Epoch 22, batch 4100, loss[loss=0.1656, simple_loss=0.2546, pruned_loss=0.03832, over 24333.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2494, pruned_loss=0.0481, over 4723529.71 frames. 
], batch size: 77, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:44:38,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:44:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:39,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:44:47,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 05:44:48,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 05:44:49,147 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:44:50,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 05:44:51,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 05:44:51,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:51,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:53,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:53,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 05:44:54,585 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 05:44:57,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:44:57,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:44:57,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:58,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:45:01,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:45:02,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=771093.3333333334, ans=0.0 2023-10-02 05:45:02,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=771093.3333333334, ans=0.125 2023-10-02 05:45:03,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:45:03,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:45:05,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 05:45:05,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 05:45:05,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:45:05,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:45:05,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:45:05,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=771093.3333333334, ans=0.125 2023-10-02 05:45:06,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 05:45:09,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:09,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 05:45:10,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:45:12,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:45:12,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 05:45:14,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:45:16,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 05:45:17,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:45:18,691 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.910e+02 2.108e+02 2.331e+02 3.295e+02, threshold=4.216e+02, percent-clipped=0.0 2023-10-02 05:45:18,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 05:45:20,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:45:20,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=771160.0, ans=0.2 2023-10-02 05:45:21,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:45:23,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 05:45:24,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:45:24,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:45:27,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:27,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=771226.6666666666, ans=0.0 2023-10-02 05:45:34,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:45:35,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.99 vs. limit=15.0 2023-10-02 05:45:38,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:45:39,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:45:45,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=771293.3333333334, ans=0.0 2023-10-02 05:45:46,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:45:46,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:51,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:45:51,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:45:52,871 INFO [train.py:1046] (1/4) Epoch 22, batch 4150, loss[loss=0.1732, simple_loss=0.2458, pruned_loss=0.05036, over 23509.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2502, pruned_loss=0.04799, over 4735183.15 frames. ], batch size: 120, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:45:54,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:45:55,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 05:45:55,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:45:55,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:45:58,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 05:45:58,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:46:00,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 05:46:00,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 05:46:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 05:46:03,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:46:03,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.99 vs. limit=22.5 2023-10-02 05:46:05,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=771360.0, ans=0.125 2023-10-02 05:46:06,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:46:06,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:46:10,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:12,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:46:14,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:46:14,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:46:14,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:46:16,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:46:18,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.whiten.whitening_limit, batch_count=771426.6666666666, ans=12.0 2023-10-02 05:46:19,341 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:46:20,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:23,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:46:24,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 05:46:26,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. 
Duration: 0.77 2023-10-02 05:46:26,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:46:28,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 05:46:28,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:46:28,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:46:32,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:33,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 05:46:39,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:46:41,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:46:41,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 05:46:42,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:46:44,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-10-02 05:46:46,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:46:46,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.82 vs. limit=15.0 2023-10-02 05:46:47,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:46:47,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=771560.0, ans=0.2 2023-10-02 05:46:49,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:50,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 05:46:50,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:46:50,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:46:50,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771560.0, ans=0.1 2023-10-02 05:46:52,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:46:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 05:46:56,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 05:46:56,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:46:56,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:46:56,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 05:46:56,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:56,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:46:57,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:59,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:59,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 05:47:00,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:47:06,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:47:06,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 05:47:08,511 INFO [train.py:1046] (1/4) Epoch 22, batch 4200, loss[loss=0.1695, simple_loss=0.2538, pruned_loss=0.04258, over 24464.00 frames. 
], tot_loss[loss=0.172, simple_loss=0.2486, pruned_loss=0.04767, over 4735556.49 frames. ], batch size: 69, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:47:09,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:47:12,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:47:12,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:47:14,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:47:14,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:47:17,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 05:47:20,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 05:47:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:23,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:47:26,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:47:28,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. 
Duration: 0.588875 2023-10-02 05:47:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:47:28,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:30,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 05:47:30,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:47:32,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:33,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:47:33,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:47:34,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:47:37,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 05:47:37,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=771826.6666666666, ans=0.035 2023-10-02 05:47:39,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:47:40,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.15 vs. limit=15.0 2023-10-02 05:47:43,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:47:43,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:47:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:47:46,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:47:48,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=771826.6666666666, ans=0.0 2023-10-02 05:47:50,520 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.871e+02 2.019e+02 2.222e+02 3.341e+02, threshold=4.039e+02, percent-clipped=0.0 2023-10-02 05:47:50,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:47:50,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 05:47:50,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:47:52,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 05:47:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:47:57,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771893.3333333334, ans=0.1 2023-10-02 05:47:59,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:47:59,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=771893.3333333334, ans=0.1 2023-10-02 05:48:05,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:48:05,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=771893.3333333334, ans=0.2 2023-10-02 05:48:07,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 05:48:09,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:48:11,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=771960.0, ans=0.1 2023-10-02 05:48:14,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:48:14,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 05:48:17,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 05:48:20,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:48:21,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=771960.0, ans=0.025 2023-10-02 05:48:24,002 INFO [train.py:1046] (1/4) Epoch 22, batch 4250, loss[loss=0.1705, simple_loss=0.258, pruned_loss=0.04146, over 24300.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2478, pruned_loss=0.04718, over 4719496.88 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:48:24,659 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.62 vs. limit=15.0 2023-10-02 05:48:25,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:48:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:48:29,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:33,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:48:33,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. 
Duration: 0.96 2023-10-02 05:48:33,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=772026.6666666666, ans=0.2 2023-10-02 05:48:34,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:48:37,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:39,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:48:43,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:44,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:46,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:48:46,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:48:47,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:48,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:49,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=772093.3333333334, ans=0.0 2023-10-02 05:48:50,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 05:48:52,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:48:54,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:48:55,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 05:48:59,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 05:48:59,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:59,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:48:59,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:49:01,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:49:01,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:01,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:49:05,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 05:49:06,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=772160.0, ans=0.125 2023-10-02 05:49:07,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:49:10,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:49:11,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:12,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 05:49:13,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:49:14,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 05:49:16,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:49:17,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:49:20,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:20,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:49:22,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. 
Duration: 0.96 2023-10-02 05:49:22,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=772293.3333333334, ans=0.125 2023-10-02 05:49:24,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:49:24,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:49:28,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:31,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:31,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:49:32,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:49:33,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=772293.3333333334, ans=0.125 2023-10-02 05:49:34,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:49:36,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:49:36,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 05:49:36,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 05:49:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:38,946 INFO [train.py:1046] (1/4) Epoch 22, batch 4300, loss[loss=0.1736, simple_loss=0.2344, pruned_loss=0.05642, over 19418.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.247, pruned_loss=0.04697, over 4716461.99 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:49:43,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:49:44,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:49:46,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=772360.0, ans=0.0 2023-10-02 05:49:47,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:50,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=772360.0, ans=0.1 2023-10-02 05:49:56,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:56,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. 
Duration: 0.56 2023-10-02 05:49:56,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=772426.6666666666, ans=0.125 2023-10-02 05:49:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:49:59,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:49:59,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:49:59,587 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 05:50:03,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-10-02 05:50:03,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:50:05,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:50:09,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 05:50:09,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:50:09,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 05:50:12,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 05:50:12,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:50:14,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:50:14,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:50:16,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:50:18,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:50:18,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:50:19,958 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.860e+02 2.141e+02 2.356e+02 3.803e+02, threshold=4.281e+02, percent-clipped=0.0 2023-10-02 05:50:20,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 05:50:20,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 05:50:21,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:50:21,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=772560.0, ans=0.0 2023-10-02 05:50:21,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=772560.0, ans=0.125 2023-10-02 05:50:24,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:24,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:50:24,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:24,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:50:24,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 05:50:24,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 05:50:26,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 05:50:28,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:50:28,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 05:50:28,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. 
Duration: 0.94 2023-10-02 05:50:32,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:50:33,653 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 05:50:33,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:50:36,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:50:36,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:50:36,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=772626.6666666666, ans=0.125 2023-10-02 05:50:38,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=772626.6666666666, ans=0.125 2023-10-02 05:50:39,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 05:50:39,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:50:39,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:40,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:50:40,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 05:50:41,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:50:43,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:50:45,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:50:46,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:50:51,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 05:50:52,862 INFO [train.py:1046] (1/4) Epoch 22, batch 4350, loss[loss=0.1971, simple_loss=0.272, pruned_loss=0.06105, over 24633.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2481, pruned_loss=0.04754, over 4712189.45 frames. ], batch size: 68, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:50:52,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:50:58,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:00,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:51:02,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 05:51:02,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:51:02,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=772693.3333333334, ans=0.0 2023-10-02 05:51:06,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:51:09,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:51:11,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:51:11,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:51:15,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:51:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:51:18,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:51:23,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 05:51:25,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:25,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 05:51:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:34,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 05:51:35,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:51:37,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:51:42,028 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 05:51:42,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=772893.3333333334, ans=0.2 2023-10-02 05:51:43,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:51:43,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:51:43,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=772893.3333333334, ans=0.0 2023-10-02 05:51:44,869 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 05:51:46,868 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 05:51:46,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:51:46,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:46,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:51:48,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:51:49,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:51:49,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:51:52,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 05:51:52,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:52,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:51:54,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:54,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 05:51:57,061 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 05:51:57,068 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-10-02 05:51:57,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 05:52:00,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:52:00,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:52:00,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:01,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:52:03,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 05:52:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 05:52:06,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:08,851 INFO [train.py:1046] (1/4) Epoch 22, batch 4400, loss[loss=0.1863, simple_loss=0.2708, pruned_loss=0.05094, over 24139.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2496, pruned_loss=0.04794, over 4714208.97 frames. ], batch size: 80, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:52:08,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:52:08,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 05:52:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:52:14,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 05:52:14,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 05:52:16,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 05:52:16,381 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 05:52:16,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=773026.6666666666, ans=0.09899494936611666 2023-10-02 05:52:17,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:52:17,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:52:21,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 05:52:24,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:24,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=773093.3333333334, ans=0.2 2023-10-02 05:52:25,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:52:25,337 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 05:52:26,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 05:52:26,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 05:52:27,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=773093.3333333334, ans=0.125 2023-10-02 05:52:29,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 05:52:30,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=773093.3333333334, ans=0.125 2023-10-02 05:52:31,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 05:52:31,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 05:52:31,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:33,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:52:33,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 05:52:36,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:52:36,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 05:52:36,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 05:52:37,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:38,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=773160.0, ans=0.125 2023-10-02 05:52:39,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:52:39,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:40,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:42,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:42,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 05:52:42,199 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 05:52:44,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:52:48,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=773160.0, ans=0.07 2023-10-02 05:52:50,694 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.791e+02 2.025e+02 2.337e+02 3.385e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 05:52:50,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=773160.0, ans=0.125 2023-10-02 05:52:52,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:52:55,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 05:52:59,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:53:01,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:53:03,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:53:05,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 05:53:05,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:53:05,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:53:05,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 05:53:06,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:53:10,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 05:53:13,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 05:53:14,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 05:53:14,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:53:14,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 05:53:16,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:53:16,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.83 vs. limit=22.5 2023-10-02 05:53:21,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:53:24,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 05:53:25,749 INFO [train.py:1046] (1/4) Epoch 22, batch 4450, loss[loss=0.181, simple_loss=0.2509, pruned_loss=0.05555, over 23285.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2505, pruned_loss=0.04864, over 4714051.90 frames. 
], batch size: 119, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:53:29,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:53:30,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:32,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:53:37,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:53:37,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:53:40,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:43,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:53:44,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:53:44,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:53:47,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 05:53:47,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 05:53:47,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:47,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:53:47,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:53:50,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:53:50,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=773426.6666666666, ans=0.1 2023-10-02 05:53:57,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:53:57,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:53:59,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:54:00,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:54:01,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:54:05,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 05:54:07,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 05:54:07,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 05:54:07,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:54:09,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:54:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 05:54:15,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:54:18,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:54:18,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 05:54:18,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:18,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:54:20,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:54:20,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 05:54:21,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:54:24,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:54:24,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 05:54:26,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:54:27,311 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.99 vs. limit=15.0 2023-10-02 05:54:27,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:54:29,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:54:30,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:30,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:54:33,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:54:36,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 05:54:38,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 05:54:41,145 INFO [train.py:1046] (1/4) Epoch 22, batch 4500, loss[loss=0.1791, simple_loss=0.2639, pruned_loss=0.04712, over 24043.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2507, pruned_loss=0.04828, over 4720860.42 frames. ], batch size: 86, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:54:42,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:54:42,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 05:54:42,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 05:54:42,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=773693.3333333334, ans=0.0 2023-10-02 05:54:45,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:54:50,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:51,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:54:51,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:54:53,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:54:53,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.85 vs. 
limit=15.0 2023-10-02 05:54:54,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:54:54,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:55:05,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=773760.0, ans=0.125 2023-10-02 05:55:08,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:55:10,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:55:12,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:55:12,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:55:14,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:55:18,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:55:22,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.844e+02 2.070e+02 2.423e+02 3.586e+02, threshold=4.140e+02, percent-clipped=0.0 2023-10-02 05:55:24,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 05:55:27,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:55:29,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=773893.3333333334, ans=0.125 2023-10-02 05:55:30,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:55:30,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 05:55:31,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:32,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:55:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:55:34,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:55:35,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:55:35,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 05:55:35,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:55:35,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:55:35,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=773893.3333333334, ans=0.125 2023-10-02 05:55:40,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:55:40,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:55:43,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:43,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=773960.0, ans=0.125 2023-10-02 05:55:47,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:55:47,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:55:48,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 05:55:50,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 05:55:50,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 05:55:53,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 05:55:55,885 INFO [train.py:1046] (1/4) Epoch 22, batch 4550, loss[loss=0.1571, simple_loss=0.2372, pruned_loss=0.03853, over 24457.00 frames. 
], tot_loss[loss=0.1736, simple_loss=0.25, pruned_loss=0.04855, over 4712434.23 frames. ], batch size: 63, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:55:56,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 05:55:57,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:56:00,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:56:01,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:56:05,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:08,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=774026.6666666666, ans=0.05 2023-10-02 05:56:11,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:56:13,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:56:15,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:15,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:56:15,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:56:17,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:17,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:56:20,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:56:23,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 05:56:23,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 05:56:23,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:56:24,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 05:56:29,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 05:56:29,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:56:32,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 05:56:34,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:56:35,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 05:56:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:35,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:56:38,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 05:56:41,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:56:44,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:44,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:56:45,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:47,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 05:56:47,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 05:56:47,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:56:48,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 05:56:48,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. 
Duration: 0.54 2023-10-02 05:56:50,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:52,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:52,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:56:53,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:53,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:56:55,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:56:56,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 05:56:56,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:56:56,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 05:56:58,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 05:56:58,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:56:58,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. 
Duration: 0.96 2023-10-02 05:57:02,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:57:02,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:57:04,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:57:05,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:57:05,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:57:07,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:57:09,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:57:11,873 INFO [train.py:1046] (1/4) Epoch 22, batch 4600, loss[loss=0.1669, simple_loss=0.2283, pruned_loss=0.05275, over 22792.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2479, pruned_loss=0.04825, over 4692958.59 frames. ], batch size: 322, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:57:11,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:57:13,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=774360.0, ans=0.125 2023-10-02 05:57:14,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:57:14,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:57:14,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:16,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 05:57:18,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:57:23,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:57:24,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:29,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:36,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 05:57:37,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:41,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 05:57:44,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:57:44,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:49,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 05:57:49,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:57:50,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:57:53,314 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.872e+02 2.041e+02 2.269e+02 3.286e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-02 05:57:56,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:56,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:57:57,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:58:01,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 05:58:04,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:58:09,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 05:58:10,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:58:13,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:13,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 05:58:13,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:13,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 05:58:13,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=774626.6666666666, ans=0.125 2023-10-02 05:58:13,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=774626.6666666666, ans=0.125 2023-10-02 05:58:14,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:15,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:15,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:16,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 05:58:17,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:17,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 05:58:18,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 05:58:18,524 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.29 vs. limit=12.0 2023-10-02 05:58:19,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 05:58:19,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:20,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:58:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:25,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=774693.3333333334, ans=0.015 2023-10-02 05:58:26,302 INFO [train.py:1046] (1/4) Epoch 22, batch 4650, loss[loss=0.1692, simple_loss=0.2602, pruned_loss=0.0391, over 24664.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2485, pruned_loss=0.0481, over 4711880.14 frames. 
], batch size: 73, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:58:29,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=774693.3333333334, ans=0.0 2023-10-02 05:58:31,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:58:32,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:58:34,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:34,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:58:34,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:34,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:58:35,743 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.49 vs. limit=15.0 2023-10-02 05:58:37,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:40,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-10-02 05:58:42,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=774760.0, ans=0.125 2023-10-02 05:58:44,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:58:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 05:58:47,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:58:49,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 05:58:49,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:58:49,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 05:58:49,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 05:58:49,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:50,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:58:53,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:58:54,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:58:54,808 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 05:58:57,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:00,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 05:59:01,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:01,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:59:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 05:59:04,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=774826.6666666666, ans=0.125 2023-10-02 05:59:05,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:59:08,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:59:11,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:15,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:19,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 05:59:20,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:20,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:59:23,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 05:59:23,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 05:59:23,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 05:59:23,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 05:59:24,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:31,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:59:31,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:59:31,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=774960.0, ans=0.0 2023-10-02 05:59:33,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 05:59:33,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 05:59:33,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=774960.0, ans=0.2 2023-10-02 05:59:34,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:59:34,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:59:35,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:59:39,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:59:39,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:59:40,747 INFO [train.py:1046] (1/4) Epoch 22, batch 4700, loss[loss=0.1911, simple_loss=0.2604, pruned_loss=0.06092, over 23887.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2494, pruned_loss=0.04828, over 4712017.80 frames. ], batch size: 195, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:59:40,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:43,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:43,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:59:43,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 05:59:43,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=775026.6666666666, ans=0.0 2023-10-02 05:59:45,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 05:59:45,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:59:45,428 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:59:46,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 05:59:53,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:55,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:55,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:59:56,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:59:58,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:00:02,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 06:00:02,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-10-02 06:00:05,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:06,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:00:06,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:00:09,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=775160.0, ans=0.1 2023-10-02 06:00:12,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:17,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:00:19,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 06:00:21,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:00:22,598 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.844e+02 2.076e+02 2.631e+02 3.750e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 06:00:26,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 06:00:26,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 06:00:29,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:33,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 06:00:35,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:00:38,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:00:40,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 06:00:42,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:42,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:00:46,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:46,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:00:46,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 06:00:47,613 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 06:00:48,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:00:52,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:52,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:52,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 06:00:52,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:54,909 INFO [train.py:1046] (1/4) Epoch 22, batch 4750, loss[loss=0.19, simple_loss=0.2674, pruned_loss=0.05632, over 23332.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2501, pruned_loss=0.04814, over 4717206.84 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:00:56,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 06:00:58,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=775360.0, ans=0.125 2023-10-02 06:00:59,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:01:00,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:03,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:03,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 06:01:05,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 06:01:05,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:08,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=775426.6666666666, ans=0.1 2023-10-02 06:01:09,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 06:01:09,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:01:09,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:01:09,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=775426.6666666666, ans=0.1 2023-10-02 06:01:12,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:13,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=775426.6666666666, ans=0.2 2023-10-02 06:01:15,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=775426.6666666666, ans=0.0 2023-10-02 06:01:17,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-10-02 06:01:18,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.25 vs. limit=15.0 2023-10-02 06:01:20,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:01:21,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=775426.6666666666, ans=0.1 2023-10-02 06:01:24,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 06:01:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:26,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:01:26,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:01:26,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:29,481 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 06:01:29,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-10-02 06:01:29,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=775493.3333333334, ans=0.025 2023-10-02 06:01:31,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=775493.3333333334, ans=0.0 2023-10-02 06:01:32,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 06:01:33,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:37,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:01:38,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:01:38,646 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 06:01:38,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:01:38,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=775560.0, ans=0.125 2023-10-02 06:01:42,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:01:46,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:01:48,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-10-02 06:01:48,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 06:01:50,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:50,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:01:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:52,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 06:01:52,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 06:01:52,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=775560.0, ans=0.07 2023-10-02 06:01:55,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 06:01:57,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:01:58,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=775626.6666666666, ans=0.07 2023-10-02 06:01:59,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:01:59,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-10-02 06:01:59,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:02:00,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:02,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:02:02,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:03,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:02:06,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:02:08,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 06:02:08,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 06:02:09,632 INFO [train.py:1046] (1/4) Epoch 22, batch 4800, loss[loss=0.1739, simple_loss=0.245, pruned_loss=0.0514, over 23761.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2506, pruned_loss=0.04846, over 4722081.81 frames. ], batch size: 135, lr: 4.63e-03, grad_scale: 32.0 2023-10-02 06:02:09,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 06:02:12,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 06:02:14,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:02:15,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 06:02:20,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:20,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:21,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=775693.3333333334, ans=0.125 2023-10-02 06:02:25,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:02:27,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:02:27,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:27,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 06:02:28,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:02:28,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 06:02:29,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:02:33,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.93 vs. limit=15.0 2023-10-02 06:02:35,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:02:36,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:36,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:02:36,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=775760.0, ans=0.0 2023-10-02 06:02:38,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:38,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 06:02:38,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:40,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:02:41,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:02:42,274 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.87 vs. limit=10.0 2023-10-02 06:02:44,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:48,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:48,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:02:49,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 06:02:52,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:52,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 06:02:54,241 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.869e+02 2.092e+02 2.385e+02 4.135e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 06:02:54,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 06:02:54,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:54,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 06:02:55,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:02:55,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:02:55,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:02:57,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:02:57,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:02:58,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=775893.3333333334, ans=0.0 2023-10-02 06:03:01,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:03:01,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=775893.3333333334, ans=0.1 2023-10-02 06:03:04,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:04,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:08,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. 
Duration: 0.94 2023-10-02 06:03:08,770 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:03:08,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=775960.0, ans=0.0 2023-10-02 06:03:10,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:03:10,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:10,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:03:10,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:03:14,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:03:16,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:03:16,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:17,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:03:17,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:03:18,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 06:03:20,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=775960.0, ans=0.125 2023-10-02 06:03:22,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:22,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:22,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:03:25,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 06:03:26,635 INFO [train.py:1046] (1/4) Epoch 22, batch 4850, loss[loss=0.2321, simple_loss=0.2903, pruned_loss=0.08692, over 19593.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2508, pruned_loss=0.04868, over 4714810.91 frames. ], batch size: 388, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:03:26,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 06:03:26,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:03:26,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:03:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:03:28,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:03:31,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:03:35,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 06:03:38,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:42,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:03:44,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:03:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:48,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:50,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:03:51,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:03:51,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 06:03:56,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:03:57,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 06:03:57,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:03:59,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:03:59,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 06:03:59,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=776160.0, ans=0.125 2023-10-02 06:04:00,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:04:00,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:05,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:05,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 06:04:06,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 06:04:07,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:04:14,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:04:14,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-10-02 06:04:15,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:04:15,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:04:18,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:04:20,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 06:04:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:20,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=776226.6666666666, ans=0.0 2023-10-02 06:04:22,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 06:04:22,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:04:23,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:04:23,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 06:04:30,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:37,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 06:04:37,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:04:40,389 INFO [train.py:1046] (1/4) Epoch 22, batch 4900, loss[loss=0.1598, simple_loss=0.2482, pruned_loss=0.03573, over 24642.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2509, pruned_loss=0.04871, over 4718422.39 frames. ], batch size: 68, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:04:42,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 06:04:42,740 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.35 vs. limit=15.0 2023-10-02 06:04:43,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:04:49,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:04:51,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:04:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:04:53,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 06:04:59,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 06:05:03,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. 
Duration: 0.93 2023-10-02 06:05:03,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 06:05:03,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:05:03,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:05:04,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:05:04,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:05:04,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:05:05,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 06:05:07,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 06:05:09,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:05:09,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=776493.3333333334, ans=0.125 2023-10-02 06:05:10,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:05:11,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 06:05:13,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:05:15,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:05:16,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:16,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 06:05:16,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=776493.3333333334, ans=0.125 2023-10-02 06:05:18,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:05:18,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:05:18,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 06:05:18,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 06:05:24,335 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.396e+02 1.841e+02 1.995e+02 2.206e+02 2.989e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 06:05:24,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 06:05:26,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 06:05:26,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:05:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:05:27,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:05:27,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 06:05:29,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:05:29,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 06:05:31,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:31,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=776560.0, ans=0.1 2023-10-02 06:05:33,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.69 vs. limit=15.0 2023-10-02 06:05:34,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:05:35,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:05:38,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 06:05:38,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:05:39,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 06:05:39,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 06:05:45,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:05:46,288 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.70 vs. limit=15.0 2023-10-02 06:05:47,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:05:47,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=776626.6666666666, ans=0.125 2023-10-02 06:05:48,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 06:05:49,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:05:49,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 06:05:51,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:54,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:05:56,723 INFO [train.py:1046] (1/4) Epoch 22, batch 4950, loss[loss=0.1773, simple_loss=0.257, pruned_loss=0.04881, over 23668.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2493, pruned_loss=0.04828, over 4704292.53 frames. ], batch size: 85, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:05:56,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:05:56,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:05:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 06:05:58,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:06:01,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:06:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:06:04,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 06:06:05,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. 
Duration: 0.8 2023-10-02 06:06:05,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:06:07,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 06:06:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:07,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:06:07,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:06:07,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:10,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:06:10,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:06:12,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:06:12,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:06:14,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:14,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:06:18,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=776760.0, ans=0.125 2023-10-02 06:06:19,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:06:23,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=776760.0, ans=0.125 2023-10-02 06:06:24,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:24,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:06:27,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:27,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:27,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=776826.6666666666, ans=0.0 2023-10-02 06:06:28,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:06:30,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 06:06:30,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. 
Duration: 0.85 2023-10-02 06:06:31,631 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.20 vs. limit=15.0 2023-10-02 06:06:32,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:34,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:06:34,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:06:35,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:06:36,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:06:37,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:06:38,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=776826.6666666666, ans=0.125 2023-10-02 06:06:39,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:06:41,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 06:06:42,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=776893.3333333334, ans=0.125 2023-10-02 06:06:43,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:06:45,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:46,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 06:06:47,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:06:49,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:06:52,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:06:52,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.88 vs. limit=22.5 2023-10-02 06:06:53,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:06:53,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 06:06:53,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=776893.3333333334, ans=0.0 2023-10-02 06:06:54,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:55,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:06:56,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.84 vs. limit=12.0 2023-10-02 06:06:56,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:06:58,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:06:59,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:06:59,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:07:01,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 06:07:04,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:07,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=776960.0, ans=0.125 2023-10-02 06:07:10,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. 
Duration: 0.78 2023-10-02 06:07:10,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:07:11,432 INFO [train.py:1046] (1/4) Epoch 22, batch 5000, loss[loss=0.1591, simple_loss=0.2382, pruned_loss=0.04001, over 23315.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2494, pruned_loss=0.04784, over 4720590.76 frames. ], batch size: 105, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:07:16,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:07:16,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:07:17,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=777026.6666666666, ans=0.125 2023-10-02 06:07:19,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 06:07:19,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 06:07:19,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=777026.6666666666, ans=0.0 2023-10-02 06:07:21,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:07:24,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 06:07:24,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 06:07:24,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:07:26,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 06:07:27,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:27,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:07:27,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 06:07:27,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:29,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:07:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 06:07:31,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 06:07:32,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:07:32,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 06:07:32,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 06:07:33,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:34,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:07:34,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 06:07:34,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 06:07:36,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 06:07:37,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:37,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:37,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 06:07:37,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:07:40,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:41,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:43,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 06:07:44,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 06:07:46,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:07:47,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:07:47,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=777160.0, ans=0.125 2023-10-02 06:07:51,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=777160.0, ans=0.0 2023-10-02 06:07:52,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 06:07:53,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:07:54,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=777160.0, ans=0.125 2023-10-02 06:07:55,098 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.890e+02 2.101e+02 2.537e+02 4.736e+02, threshold=4.203e+02, percent-clipped=2.0 2023-10-02 06:07:55,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:55,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:07:59,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-10-02 06:07:59,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:59,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:08:01,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:08:02,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 06:08:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:08:05,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=777226.6666666666, ans=0.0 2023-10-02 06:08:07,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:08:07,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:10,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.95 vs. limit=15.0 2023-10-02 06:08:12,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 06:08:16,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:08:20,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.02 vs. limit=10.0 2023-10-02 06:08:21,981 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.23 vs. limit=6.0 2023-10-02 06:08:25,969 INFO [train.py:1046] (1/4) Epoch 22, batch 5050, loss[loss=0.177, simple_loss=0.2673, pruned_loss=0.04335, over 24648.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2494, pruned_loss=0.04794, over 4715627.71 frames. ], batch size: 73, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:08:26,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:08:26,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=777360.0, ans=0.0 2023-10-02 06:08:27,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:27,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:08:28,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:08:28,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:08:28,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:08:28,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:08:29,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=777360.0, ans=0.125 2023-10-02 06:08:29,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=15.0 2023-10-02 06:08:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:34,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 06:08:35,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:08:38,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:08:38,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=777360.0, ans=0.1 2023-10-02 06:08:40,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:08:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 06:08:40,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:41,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:08:43,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:08:44,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:08:45,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:08:53,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 06:08:54,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:08:54,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:08:54,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 06:08:56,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:08:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:08:57,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:57,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 06:08:57,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=777493.3333333334, ans=0.1 2023-10-02 06:08:59,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 06:08:59,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 06:09:00,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:09:03,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:04,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=777493.3333333334, ans=0.125 2023-10-02 06:09:05,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:09:06,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 06:09:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:09:11,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 06:09:13,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:09:13,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 06:09:13,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:09:14,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:09:17,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:09:18,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:09:20,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:20,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:09:20,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:09:20,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 06:09:21,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:09:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:09:26,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:09:26,416 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. 
Duration: 0.896 2023-10-02 06:09:26,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:09:27,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:09:29,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:29,120 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 06:09:33,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:33,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 06:09:33,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:36,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:09:37,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 06:09:39,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 06:09:42,317 INFO [train.py:1046] (1/4) Epoch 22, batch 5100, loss[loss=0.1776, simple_loss=0.2472, pruned_loss=0.05402, over 23484.00 frames. 
], tot_loss[loss=0.1733, simple_loss=0.2502, pruned_loss=0.04821, over 4719705.88 frames. ], batch size: 256, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:09:42,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:09:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:09:43,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:09:45,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=777693.3333333334, ans=0.125 2023-10-02 06:09:46,516 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 06:09:47,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:49,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 06:09:50,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 06:09:50,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:09:52,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 06:09:52,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=777693.3333333334, ans=0.125 2023-10-02 06:09:55,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:09:56,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 06:09:56,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 06:10:01,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:10:01,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.25 vs. limit=15.0 2023-10-02 06:10:02,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:10:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:10:08,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 06:10:08,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:10:10,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:10:10,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. 
Duration: 0.588875 2023-10-02 06:10:13,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:15,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:15,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 06:10:16,903 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 06:10:18,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:18,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 06:10:18,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 06:10:22,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:10:26,630 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.838e+02 2.102e+02 2.485e+02 3.822e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-02 06:10:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:10:33,082 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:10:34,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. 
Duration: 0.5 2023-10-02 06:10:34,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 06:10:34,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 06:10:37,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 06:10:37,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:38,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 06:10:42,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 06:10:43,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 06:10:45,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:10:47,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 06:10:49,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:10:51,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 06:10:55,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 06:10:55,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:10:55,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:10:57,408 INFO [train.py:1046] (1/4) Epoch 22, batch 5150, loss[loss=0.1935, simple_loss=0.2605, pruned_loss=0.06329, over 23663.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2514, pruned_loss=0.0486, over 4721068.44 frames. ], batch size: 232, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:10:57,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:10:57,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:10:57,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:10:58,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 06:10:58,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 06:11:00,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 06:11:00,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:11:00,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. 
Duration: 0.97 2023-10-02 06:11:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:02,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 06:11:04,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:05,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:11,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:11:12,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 06:11:13,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:13,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:11:15,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:11:15,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:11:15,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:11:15,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 06:11:16,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:11:16,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 06:11:18,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:11:18,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=778093.3333333334, ans=0.125 2023-10-02 06:11:19,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:11:21,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:11:24,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 06:11:25,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:11:31,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:11:33,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 06:11:36,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:11:42,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:11:42,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=778226.6666666666, ans=0.0 2023-10-02 06:11:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:46,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:11:48,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:11:51,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 06:11:54,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:55,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:11:55,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:11:58,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:11:59,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:12:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 06:12:04,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:12:05,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.02 vs. limit=15.0 2023-10-02 06:12:06,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:12:08,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:12:08,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:12:10,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:12:10,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:12:10,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:12:10,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:12:12,061 INFO [train.py:1046] (1/4) Epoch 22, batch 5200, loss[loss=0.1813, simple_loss=0.2732, pruned_loss=0.04471, over 24277.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2529, pruned_loss=0.04951, over 4710566.25 frames. ], batch size: 74, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:12:15,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:12:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 06:12:18,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:20,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=778360.0, ans=0.125 2023-10-02 06:12:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 06:12:23,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:12:24,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:27,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:27,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:12:27,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:29,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 06:12:32,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:12:32,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:12:36,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-10-02 06:12:37,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:12:38,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:12:40,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 06:12:40,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 06:12:43,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 06:12:43,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:12:43,541 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 06:12:43,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:45,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:12:46,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:12:46,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 06:12:48,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:12:51,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:51,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=778493.3333333334, ans=0.1 2023-10-02 06:12:52,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 06:12:54,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 06:12:54,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 06:12:57,090 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.869e+02 2.074e+02 2.412e+02 3.434e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 06:12:58,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 06:12:58,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:13:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:13:02,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:04,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 06:13:06,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:13:06,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:13:06,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:07,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:13:09,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:13:10,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:13:13,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=778626.6666666666, ans=0.125 2023-10-02 06:13:14,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:13:16,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:16,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:21,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=778626.6666666666, ans=0.0 2023-10-02 06:13:23,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:13:23,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 06:13:24,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:13:24,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:13:26,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:26,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:13:27,227 INFO [train.py:1046] (1/4) Epoch 22, batch 5250, loss[loss=0.1842, simple_loss=0.266, pruned_loss=0.05119, over 24069.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2527, pruned_loss=0.04915, over 4717877.38 frames. ], batch size: 80, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:13:27,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:13:28,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=778693.3333333334, ans=0.125 2023-10-02 06:13:30,607 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.02 vs. limit=15.0 2023-10-02 06:13:31,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:13:34,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 06:13:35,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:13:35,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:13:41,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:43,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:13:45,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:13:46,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:13:49,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 06:13:49,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:51,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:57,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=778826.6666666666, ans=0.0 2023-10-02 06:14:04,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.25 vs. 
limit=15.0 2023-10-02 06:14:28,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=778960.0, ans=0.1 2023-10-02 06:14:36,718 INFO [train.py:1046] (1/4) Epoch 22, batch 5300, loss[loss=0.1736, simple_loss=0.2473, pruned_loss=0.04994, over 23371.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.251, pruned_loss=0.04877, over 4696789.00 frames. ], batch size: 119, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:14:42,760 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.95 vs. limit=22.5 2023-10-02 06:14:51,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:14:51,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 06:14:51,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 06:14:51,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:51,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:51,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:52,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:52,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:14:52,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:14:52,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:52,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:14:52,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:14:52,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 06:14:52,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 06:14:52,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 06:14:52,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:14:52,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 06:14:52,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 06:14:52,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:53,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:14:53,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:53,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:14:53,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:14:54,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:14:54,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:54,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:54,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:54,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:54,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:14:54,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:54,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:14:54,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-10-02 06:14:54,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:14:55,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:55,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 06:14:55,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 06:14:55,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:14:55,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:14:55,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 06:14:55,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 06:14:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:14:55,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:14:56,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:14:56,489 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-10-02 06:14:56,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 06:14:56,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:14:56,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:56,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 06:14:56,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 06:14:56,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 06:14:56,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:14:59,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.97 vs. limit=15.0 2023-10-02 06:15:03,814 INFO [train.py:1046] (1/4) Epoch 23, batch 0, loss[loss=0.1695, simple_loss=0.2483, pruned_loss=0.0454, over 23720.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2483, pruned_loss=0.0454, over 23720.00 frames. ], batch size: 232, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:15:03,815 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 06:15:16,812 INFO [train.py:1078] (1/4) Epoch 23, validation: loss=0.2993, simple_loss=0.2685, pruned_loss=0.165, over 1125622.00 frames. 
2023-10-02 06:15:16,813 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20387MB 2023-10-02 06:15:19,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 06:15:20,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:15:20,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=779106.6666666666, ans=0.125 2023-10-02 06:15:21,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:15:22,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=779106.6666666666, ans=0.0 2023-10-02 06:15:26,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:26,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:15:27,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:28,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 06:15:29,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 06:15:32,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:15:33,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:36,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:36,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:37,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:15:37,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:15:39,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 06:15:41,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:15:43,685 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.858e+02 2.101e+02 2.344e+02 3.915e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-02 06:15:48,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:15:48,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:50,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 06:15:54,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 06:15:54,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:15:56,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:15:59,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=779240.0, ans=0.0 2023-10-02 06:16:00,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:16:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:10,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 06:16:13,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 06:16:14,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:16:14,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:15,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:16:17,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:16:17,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. 
Duration: 0.85 2023-10-02 06:16:20,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:22,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:23,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:16:27,642 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 06:16:29,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:16:31,781 INFO [train.py:1046] (1/4) Epoch 23, batch 50, loss[loss=0.1848, simple_loss=0.2696, pruned_loss=0.05005, over 24402.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2516, pruned_loss=0.05028, over 1066069.50 frames. ], batch size: 77, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:16:31,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:16:35,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:16:35,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 06:16:35,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:16:35,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:16:35,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=779440.0, ans=0.0 2023-10-02 06:16:37,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:16:39,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:16:40,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:16:42,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 06:16:43,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:45,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=779506.6666666666, ans=0.125 2023-10-02 06:16:50,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:16:53,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 06:16:54,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 06:16:54,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=779506.6666666666, ans=0.1 2023-10-02 06:16:57,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 06:16:58,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:16:58,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:17:00,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:17:00,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:17:01,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:17:01,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:17:03,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=779573.3333333334, ans=0.0 2023-10-02 06:17:03,236 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:17:07,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:17:07,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:09,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:17:09,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. 
Duration: 0.96 2023-10-02 06:17:10,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:17:12,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:17:12,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 06:17:12,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:17:15,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 06:17:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:17:22,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:17:22,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:24,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:17:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:17:26,338 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.01 vs. limit=22.5 2023-10-02 06:17:26,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. 
Duration: 0.71 2023-10-02 06:17:28,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 06:17:29,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:29,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:17:30,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:17:32,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:17:32,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 06:17:32,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 06:17:33,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 06:17:35,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:17:35,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:17:36,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 06:17:36,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. 
Duration: 0.74 2023-10-02 06:17:37,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:17:38,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:40,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:17:40,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:17:43,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:17:45,363 INFO [train.py:1046] (1/4) Epoch 23, batch 100, loss[loss=0.1691, simple_loss=0.2544, pruned_loss=0.04192, over 24474.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2532, pruned_loss=0.04956, over 1883256.10 frames. ], batch size: 69, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:17:47,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:17:49,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=779773.3333333334, ans=0.125 2023-10-02 06:17:50,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:17:51,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 06:17:51,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:17:56,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:17:56,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:17:56,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:17:57,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:17:59,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 06:17:59,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=779840.0, ans=0.125 2023-10-02 06:18:00,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:18:00,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=779840.0, ans=0.1 2023-10-02 06:18:02,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:02,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:02,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 06:18:04,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 06:18:06,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:07,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:08,379 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.33 vs. limit=10.0 2023-10-02 06:18:08,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:18:10,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:18:12,035 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.840e+02 2.037e+02 2.251e+02 3.061e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 06:18:13,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 06:18:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 06:18:16,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:16,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:18:20,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 06:18:21,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:22,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:28,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:28,684 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 06:18:30,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 06:18:32,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:18:34,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:18:36,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:38,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:41,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:18:42,161 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.44 vs. limit=15.0 2023-10-02 06:18:43,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 06:18:45,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:45,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:48,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:48,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:18:48,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:50,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 06:18:50,377 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 06:18:51,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:51,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:18:52,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:18:52,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:52,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 06:18:52,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:18:54,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:18:54,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:18:54,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:54,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=780040.0, ans=0.0 2023-10-02 06:18:55,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:55,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:18:57,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:18:59,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:00,317 INFO [train.py:1046] (1/4) Epoch 23, batch 150, loss[loss=0.1825, simple_loss=0.2646, pruned_loss=0.05021, over 24354.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2528, pruned_loss=0.04893, over 2528179.74 frames. ], batch size: 77, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:19:00,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 06:19:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:01,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:04,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:19:04,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:07,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:19:08,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:09,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=780106.6666666666, ans=10.0 2023-10-02 06:19:13,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 06:19:15,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 06:19:15,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 06:19:18,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:19:18,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 06:19:18,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:19:19,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:19:21,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:19:21,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:21,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:23,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 06:19:24,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:19:30,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:34,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:19:34,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 06:19:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:19:37,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:19:37,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:19:40,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:19:42,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:19:42,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:19:43,668 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.70 vs. limit=10.0 2023-10-02 06:19:44,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:44,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 06:19:49,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.02 vs. limit=10.0 2023-10-02 06:19:52,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:52,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:19:52,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=780306.6666666666, ans=0.0 2023-10-02 06:19:52,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=780306.6666666666, ans=0.125 2023-10-02 06:19:53,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:19:53,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:19:55,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:58,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 06:20:01,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:20:04,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:20:07,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:08,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:20:08,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 06:20:08,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 06:20:08,979 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 06:20:11,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:20:14,534 INFO [train.py:1046] (1/4) Epoch 23, batch 200, loss[loss=0.1695, simple_loss=0.2598, pruned_loss=0.03956, over 24566.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.253, pruned_loss=0.04841, over 3019579.73 frames. ], batch size: 71, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:20:15,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:20:15,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:20:17,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 06:20:17,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:17,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:20,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 06:20:22,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:20:23,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:20:24,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:20:25,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=780440.0, ans=0.1 2023-10-02 06:20:26,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-10-02 06:20:30,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:20:30,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:20:30,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:40,946 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.789e+02 2.004e+02 2.358e+02 3.840e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-02 06:20:49,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:20:49,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:20:50,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:20:52,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:20:52,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 06:20:52,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:20:52,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=780573.3333333334, ans=0.2 2023-10-02 06:20:53,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:20:55,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:20:57,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:57,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:20:58,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 06:21:00,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:21:01,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:05,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:21:12,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:21:17,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:19,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:21:20,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=780706.6666666666, ans=0.0 2023-10-02 06:21:24,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:26,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 06:21:27,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.10 vs. limit=15.0 2023-10-02 06:21:27,521 INFO [train.py:1046] (1/4) Epoch 23, batch 250, loss[loss=0.1704, simple_loss=0.2492, pruned_loss=0.04582, over 24328.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2521, pruned_loss=0.04851, over 3391538.92 frames. ], batch size: 61, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:21:27,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:27,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:21:27,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:21:29,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 06:21:31,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 06:21:33,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:21:33,434 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 06:21:34,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:37,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:21:38,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:38,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:39,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=780773.3333333334, ans=0.0 2023-10-02 06:21:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:21:41,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:43,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:21:46,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 06:21:49,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=780840.0, ans=0.0 2023-10-02 06:21:53,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:21:56,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:21:56,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:22:01,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:22:03,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:22:03,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:22:03,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:22:05,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:22:05,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:22:05,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:22:05,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.52 vs. limit=15.0 2023-10-02 06:22:07,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:22:09,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 06:22:09,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:22:12,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:22:12,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:22:12,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:22:12,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=780973.3333333334, ans=0.025 2023-10-02 06:22:12,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=780973.3333333334, ans=0.125 2023-10-02 06:22:13,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:22:15,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 06:22:15,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:22:15,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=780973.3333333334, ans=0.125 2023-10-02 06:22:17,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:22:19,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:22,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:22:23,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=780973.3333333334, ans=0.0 2023-10-02 06:22:24,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:28,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:22:33,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:34,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:22:38,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 06:22:40,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.40 vs. limit=15.0 2023-10-02 06:22:40,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:22:40,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:22:42,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 06:22:42,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:22:43,419 INFO [train.py:1046] (1/4) Epoch 23, batch 300, loss[loss=0.1584, simple_loss=0.2329, pruned_loss=0.04196, over 24326.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.25, pruned_loss=0.04786, over 3682178.94 frames. ], batch size: 56, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:22:43,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:22:43,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 06:22:48,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:50,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:22:51,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:22:53,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 06:22:55,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:56,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:22:56,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 06:22:56,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:23:00,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:23:05,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:23:05,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 06:23:06,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 06:23:08,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:23:11,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.930e+02 2.119e+02 2.410e+02 3.837e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-02 06:23:12,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=781240.0, ans=0.1 2023-10-02 06:23:12,475 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.69 vs. limit=22.5 2023-10-02 06:23:13,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:13,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 06:23:13,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:23:14,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:23:17,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:23:17,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:23:21,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:23:21,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-02 06:23:22,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:23:23,756 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.23 vs. limit=22.5 2023-10-02 06:23:26,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:26,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=781306.6666666666, ans=0.5 2023-10-02 06:23:27,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 06:23:28,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:23:33,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:23:37,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:23:37,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 06:23:40,523 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.37 vs. limit=10.0 2023-10-02 06:23:41,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:23:41,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:23:43,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:43,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=781373.3333333334, ans=0.07 2023-10-02 06:23:44,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:23:44,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 06:23:44,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:23:46,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:23:46,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=781373.3333333334, ans=0.95 2023-10-02 06:23:47,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 06:23:49,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:49,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:23:50,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:23:50,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:23:51,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:23:56,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:23:56,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 06:23:56,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=781440.0, ans=0.2 2023-10-02 06:23:58,137 INFO [train.py:1046] (1/4) Epoch 23, batch 350, loss[loss=0.1979, simple_loss=0.2775, pruned_loss=0.05913, over 24420.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2487, pruned_loss=0.04751, over 3897131.00 frames. ], batch size: 77, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:23:59,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:05,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:24:08,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:10,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:13,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. 
Duration: 0.97 2023-10-02 06:24:14,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=781506.6666666666, ans=0.125 2023-10-02 06:24:16,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:24:16,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 06:24:19,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:19,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 06:24:20,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:24:22,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 06:24:23,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:24:24,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=781506.6666666666, ans=0.125 2023-10-02 06:24:25,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:24:26,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:24:26,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:24:28,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:28,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:24:28,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:28,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:24:29,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:24:29,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:38,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:24:38,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:24:38,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:24:38,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:45,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 06:24:45,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 06:24:49,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:49,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:24:49,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:24:49,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=781640.0, ans=0.125 2023-10-02 06:24:50,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 06:24:52,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:24:53,669 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 06:24:53,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 06:24:55,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:56,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=781706.6666666666, ans=0.2 2023-10-02 06:24:58,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:24:58,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. 
Duration: 0.45 2023-10-02 06:25:01,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:02,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:25:02,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:04,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:04,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:25:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:25:10,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:25:11,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.29 vs. limit=15.0 2023-10-02 06:25:12,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:25:12,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 06:25:12,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:25:13,646 INFO [train.py:1046] (1/4) Epoch 23, batch 400, loss[loss=0.1768, simple_loss=0.245, pruned_loss=0.05428, over 23764.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2484, pruned_loss=0.04732, over 4081584.47 frames. ], batch size: 179, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:25:14,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:15,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:25:15,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:16,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=781773.3333333334, ans=0.125 2023-10-02 06:25:17,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:18,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:20,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 06:25:21,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 06:25:21,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:22,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. 
Duration: 0.97 2023-10-02 06:25:24,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:30,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:25:30,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:25:30,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 06:25:30,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:25:30,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:31,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:25:31,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:34,631 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 06:25:35,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 06:25:38,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=781840.0, ans=0.125 2023-10-02 06:25:41,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:25:41,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=781840.0, ans=0.0 2023-10-02 06:25:42,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:42,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 06:25:43,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 06:25:45,159 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.817e+02 2.027e+02 2.516e+02 3.767e+02, threshold=4.054e+02, percent-clipped=0.0 2023-10-02 06:25:45,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:25:47,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:25:50,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=781906.6666666666, ans=0.0 2023-10-02 06:25:54,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 06:25:59,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:26:00,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 06:26:02,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:26:03,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:26:04,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 06:26:07,057 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.07 vs. limit=15.0 2023-10-02 06:26:07,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:26:10,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:26:10,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:26:11,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=781973.3333333334, ans=0.0 2023-10-02 06:26:14,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:15,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 06:26:17,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:26:17,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 06:26:20,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 06:26:20,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:26:21,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=782040.0, ans=0.04949747468305833 2023-10-02 06:26:23,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 06:26:23,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:26:23,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=782040.0, ans=0.125 2023-10-02 06:26:24,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:26:24,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:26:26,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 06:26:26,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:26:27,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:26:27,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:26:27,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. 
Duration: 0.99 2023-10-02 06:26:29,561 INFO [train.py:1046] (1/4) Epoch 23, batch 450, loss[loss=0.1558, simple_loss=0.2387, pruned_loss=0.03639, over 24448.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2495, pruned_loss=0.04733, over 4231096.78 frames. ], batch size: 63, lr: 4.51e-03, grad_scale: 8.0 2023-10-02 06:26:29,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:26:31,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:26:32,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:26:40,312 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. limit=6.0 2023-10-02 06:26:42,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:26:46,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 06:26:47,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 06:26:51,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:26:53,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:26:56,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:26:56,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=782173.3333333334, ans=0.0 2023-10-02 06:26:58,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:26:59,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:27:00,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 06:27:02,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 06:27:04,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 06:27:04,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:05,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:05,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:27:08,439 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 06:27:08,447 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-10-02 06:27:09,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:27:11,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:27:12,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 06:27:14,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=782306.6666666666, ans=0.1 2023-10-02 06:27:14,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=782306.6666666666, ans=0.0 2023-10-02 06:27:15,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:27:15,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:27:15,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:27:17,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 06:27:20,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:27:23,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:27:23,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 06:27:25,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 06:27:27,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:27:28,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 06:27:29,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 06:27:30,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:27:32,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=782373.3333333334, ans=0.0 2023-10-02 06:27:35,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:27:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:27:38,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:27:38,858 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 06:27:43,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:43,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 06:27:44,765 INFO [train.py:1046] (1/4) Epoch 23, batch 500, loss[loss=0.1777, simple_loss=0.2494, pruned_loss=0.05296, over 23387.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2507, pruned_loss=0.04805, over 4331070.15 frames. ], batch size: 93, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:27:44,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:44,840 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 06:27:46,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 06:27:46,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:50,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:27:54,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:27:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:27:58,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:27:58,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:59,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:28:09,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:09,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:28:10,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:28:10,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:10,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 06:28:11,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:28:14,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:28:14,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:28:14,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:28:15,805 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.861e+02 2.114e+02 2.319e+02 3.215e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-02 06:28:15,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:15,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-02 06:28:20,587 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 06:28:23,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:23,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:25,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:26,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:27,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:28:28,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 06:28:30,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=782640.0, ans=0.1 2023-10-02 06:28:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:28:32,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:35,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=782640.0, ans=0.0 2023-10-02 06:28:37,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:28:40,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:44,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:46,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=782706.6666666666, ans=0.0 2023-10-02 06:28:47,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 06:28:47,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:48,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:48,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=782706.6666666666, ans=0.125 2023-10-02 06:28:51,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 06:28:53,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:28:55,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:59,826 INFO [train.py:1046] (1/4) Epoch 23, batch 550, loss[loss=0.1866, simple_loss=0.2633, pruned_loss=0.05499, over 23947.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2511, pruned_loss=0.04825, over 4426373.34 frames. 
], batch size: 86, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:28:59,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 06:29:03,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.08 vs. limit=15.0 2023-10-02 06:29:03,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 06:29:04,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:05,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 06:29:05,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:29:05,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:05,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:05,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:05,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:29:06,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 06:29:10,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:29:12,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 06:29:12,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:29:16,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:16,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:18,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:29:20,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:21,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=782840.0, ans=0.2 2023-10-02 06:29:23,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 06:29:25,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 06:29:25,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=782840.0, ans=0.0 2023-10-02 06:29:26,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 06:29:31,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=782906.6666666666, ans=0.0 2023-10-02 06:29:33,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:29:34,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:29:35,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:29:37,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:37,249 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 06:29:38,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:40,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 06:29:40,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=782906.6666666666, ans=10.0 2023-10-02 06:29:41,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:29:43,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:29:43,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 06:29:43,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 06:29:44,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 06:29:46,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:29:46,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:29:46,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:29:46,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=22.5 2023-10-02 06:29:47,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:49,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=782973.3333333334, ans=0.2 2023-10-02 06:29:50,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:29:53,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:29:54,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:29:55,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:56,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 06:29:58,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:29:59,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:00,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:30:02,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:03,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=783040.0, ans=0.125 2023-10-02 06:30:04,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:30:04,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 06:30:11,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 06:30:12,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=783106.6666666666, ans=0.0 2023-10-02 06:30:13,913 INFO [train.py:1046] (1/4) Epoch 23, batch 600, loss[loss=0.1777, simple_loss=0.262, pruned_loss=0.04671, over 24410.00 frames. 
], tot_loss[loss=0.174, simple_loss=0.2508, pruned_loss=0.04858, over 4482522.79 frames. ], batch size: 77, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:30:14,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 06:30:15,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:30:15,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:30:17,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:18,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=783106.6666666666, ans=0.0 2023-10-02 06:30:20,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=783106.6666666666, ans=0.125 2023-10-02 06:30:24,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:30:25,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:30:27,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 06:30:29,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=783173.3333333334, ans=0.05 2023-10-02 06:30:30,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 06:30:31,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:30:33,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:35,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 06:30:35,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:30:36,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=783173.3333333334, ans=0.1 2023-10-02 06:30:41,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 06:30:44,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.803e+02 1.978e+02 2.195e+02 2.831e+02, threshold=3.957e+02, percent-clipped=0.0 2023-10-02 06:30:44,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:30:44,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:45,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:30:51,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:30:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:30:52,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:59,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:31:05,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:31:05,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:31:05,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:31:07,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.85 vs. limit=22.5 2023-10-02 06:31:09,906 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.04 vs. limit=15.0 2023-10-02 06:31:12,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 06:31:16,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:31:16,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:31:16,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=783373.3333333334, ans=0.125 2023-10-02 06:31:21,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 06:31:21,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:31:23,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 06:31:25,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:31:25,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:31:28,452 INFO [train.py:1046] (1/4) Epoch 23, batch 650, loss[loss=0.1724, simple_loss=0.2546, pruned_loss=0.0451, over 24482.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.25, pruned_loss=0.04851, over 4527134.33 frames. ], batch size: 63, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:31:29,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 06:31:31,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:31:33,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:31:35,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 06:31:36,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:31:36,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=783440.0, ans=0.1 2023-10-02 06:31:40,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 06:31:40,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:31:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:31:44,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:31:46,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:31:50,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.42 vs. limit=15.0 2023-10-02 06:31:51,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 06:31:52,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:31:54,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:31:57,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:31:57,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 06:31:59,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:01,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:01,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:32:03,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:04,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:32:06,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:32:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 06:32:07,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:07,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:32:10,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:11,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:32:11,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:11,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:32:12,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 06:32:13,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:32:14,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:32:15,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:32:15,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:32:17,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:32:18,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 06:32:20,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 06:32:20,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:20,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:32:20,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:32:20,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=783640.0, ans=0.04949747468305833 2023-10-02 06:32:21,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:32:23,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:32:28,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:28,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:32:30,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:32,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=783706.6666666666, ans=0.1 2023-10-02 06:32:33,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:33,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:32:33,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:32:40,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:32:40,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:32:42,107 INFO [train.py:1046] (1/4) Epoch 23, batch 700, loss[loss=0.1735, simple_loss=0.2417, pruned_loss=0.05263, over 23737.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2491, pruned_loss=0.04812, over 4574257.88 frames. ], batch size: 164, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:32:42,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:32:42,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:32:45,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=783773.3333333334, ans=0.125 2023-10-02 06:32:47,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 06:32:47,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 06:32:51,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 06:32:52,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:52,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 06:32:55,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 06:32:59,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:33:01,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:33:03,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:33:04,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:33:05,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:33:08,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:33:10,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 06:33:10,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:33:12,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. 
Duration: 0.42 2023-10-02 06:33:12,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=783906.6666666666, ans=0.2 2023-10-02 06:33:13,310 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.806e+02 2.034e+02 2.307e+02 3.674e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 06:33:13,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 06:33:17,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:33:17,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:33:18,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=783906.6666666666, ans=0.0 2023-10-02 06:33:19,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:33:23,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:33:23,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 06:33:27,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:33:28,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:33:28,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-10-02 06:33:35,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:33:35,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:33:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:33:42,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:33:42,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 06:33:46,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 06:33:46,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 06:33:51,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:33:53,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:33:53,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:33:55,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:33:55,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. 
Duration: 0.95 2023-10-02 06:33:57,197 INFO [train.py:1046] (1/4) Epoch 23, batch 750, loss[loss=0.1724, simple_loss=0.2572, pruned_loss=0.04387, over 24026.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2481, pruned_loss=0.04763, over 4607408.99 frames. ], batch size: 80, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:33:59,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 06:33:59,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 06:34:01,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 06:34:03,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 06:34:03,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 06:34:03,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:34:04,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 06:34:06,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:34:06,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:34:09,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:34:10,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:10,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:34:10,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:34:13,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:34:14,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:34:17,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:34:20,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:20,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:21,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 06:34:23,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:34:23,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:34:23,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=784173.3333333334, ans=0.0 2023-10-02 06:34:25,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:34:28,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:34:29,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.29 vs. limit=22.5 2023-10-02 06:34:29,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 06:34:29,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:34:31,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 06:34:31,218 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 06:34:31,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 06:34:31,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:34:31,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:34:34,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 06:34:40,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:34:40,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:34:40,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:34:41,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:45,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:34:46,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 06:34:46,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:34:47,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 06:34:48,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:34:50,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:34:50,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 06:34:52,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:34:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:34:57,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:34:58,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:59,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:35:03,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 06:35:03,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=784373.3333333334, ans=0.1 2023-10-02 06:35:05,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:35:05,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:07,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=784373.3333333334, ans=0.125 2023-10-02 06:35:08,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:08,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 06:35:11,283 INFO [train.py:1046] (1/4) Epoch 23, batch 800, loss[loss=0.1814, simple_loss=0.2577, pruned_loss=0.05253, over 23222.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2486, pruned_loss=0.04753, over 4647351.12 frames. ], batch size: 105, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:35:11,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:11,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:35:20,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:20,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:23,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:35:23,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:23,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=784440.0, ans=0.07 2023-10-02 06:35:24,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:24,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:26,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:35:30,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:31,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:35:33,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 06:35:33,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:35,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:35,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:35:35,504 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:35:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:35:37,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 06:35:37,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:37,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 06:35:39,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:35:41,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:42,557 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.768e+02 2.043e+02 2.413e+02 3.379e+02, threshold=4.086e+02, percent-clipped=0.0 2023-10-02 06:35:42,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:44,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:35:47,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:47,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:52,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:35:52,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:35:52,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 06:35:52,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=784573.3333333334, ans=0.0 2023-10-02 06:35:54,658 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 06:35:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. 
Duration: 0.57 2023-10-02 06:35:55,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:35:55,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:57,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:57,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:36:03,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 06:36:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 06:36:03,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:36:05,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:36:09,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:36:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:36:13,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 06:36:13,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 06:36:15,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=784706.6666666666, ans=0.0 2023-10-02 06:36:16,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 06:36:22,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:36:23,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=784706.6666666666, ans=0.2 2023-10-02 06:36:26,658 INFO [train.py:1046] (1/4) Epoch 23, batch 850, loss[loss=0.1713, simple_loss=0.241, pruned_loss=0.05079, over 23622.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2494, pruned_loss=0.04806, over 4657523.38 frames. ], batch size: 149, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:36:26,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:36:26,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 06:36:26,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:36:26,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=784773.3333333334, ans=0.1 2023-10-02 06:36:28,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:36:29,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. 
Duration: 0.94 2023-10-02 06:36:29,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:31,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:36:31,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=784773.3333333334, ans=0.1 2023-10-02 06:36:32,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:36:33,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:36:36,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:36:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 06:36:37,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 06:36:37,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 06:36:38,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:36:40,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:36:41,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:36:41,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:36:41,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:36:44,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=784840.0, ans=0.125 2023-10-02 06:36:46,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:47,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:36:47,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 06:36:51,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 06:36:54,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:55,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 06:36:59,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 06:37:01,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 06:37:02,547 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-10-02 06:37:02,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:37:02,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:37:03,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 06:37:07,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:07,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:08,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 06:37:11,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:37:12,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:37:12,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:37:12,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:37:14,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:37:15,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 06:37:15,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 06:37:19,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:37:19,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:37:19,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:37:19,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:37:21,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:37:21,524 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:37:24,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:27,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:37:29,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:37:29,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:37:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 06:37:34,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=785040.0, ans=0.09899494936611666 2023-10-02 06:37:37,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:37:38,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:37:40,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 06:37:40,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:37:40,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:37:41,628 INFO [train.py:1046] (1/4) Epoch 23, batch 900, loss[loss=0.1623, simple_loss=0.2481, pruned_loss=0.03826, over 24664.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2507, pruned_loss=0.04851, over 4671464.19 frames. ], batch size: 65, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:37:43,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 06:37:47,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.26 vs. limit=10.0 2023-10-02 06:37:48,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:37:51,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:37:51,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 06:37:53,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=785106.6666666666, ans=0.2 2023-10-02 06:37:56,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:37:57,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 06:37:58,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 06:37:58,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:37:58,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:00,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:38:00,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:38:11,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:11,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:38:11,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 06:38:11,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=785240.0, ans=0.0 2023-10-02 06:38:12,831 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.908e+02 2.046e+02 2.304e+02 2.973e+02, threshold=4.093e+02, percent-clipped=0.0 2023-10-02 06:38:15,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:19,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 06:38:21,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:38:25,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:38:25,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:38:25,561 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 06:38:28,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 06:38:31,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:38:31,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:38:31,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 06:38:39,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:39,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:38:40,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=785373.3333333334, ans=0.125 2023-10-02 06:38:41,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 06:38:41,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:42,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=785373.3333333334, ans=0.2 2023-10-02 06:38:44,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 06:38:46,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:38:46,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:46,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=785373.3333333334, ans=0.0 2023-10-02 06:38:48,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:38:48,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:38:50,895 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=12.0 2023-10-02 06:38:53,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 06:38:53,100 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 06:38:55,738 INFO [train.py:1046] (1/4) Epoch 23, batch 950, loss[loss=0.1709, simple_loss=0.2638, pruned_loss=0.03895, over 24548.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2512, pruned_loss=0.04868, over 4687305.95 frames. ], batch size: 71, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:38:55,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:38:55,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 06:38:57,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:39:02,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 06:39:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:09,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:39:09,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:09,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:39:11,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=785506.6666666666, ans=0.0 2023-10-02 06:39:13,055 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 06:39:15,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:17,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:39:17,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:17,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:39:17,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 06:39:18,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:39:20,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:21,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-10-02 06:39:22,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:39:25,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:25,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:39:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:39:28,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 06:39:30,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:39:32,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:39:33,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:39:37,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:39:37,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:43,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 06:39:44,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-02 06:39:44,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:39:44,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:39:46,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:46,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:39:50,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 06:39:50,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:39:51,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:39:53,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:53,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 06:39:54,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:54,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:39:54,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-02 06:40:00,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:40:00,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=785706.6666666666, ans=0.07 2023-10-02 06:40:02,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:40:03,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=785706.6666666666, ans=0.0 2023-10-02 06:40:06,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:40:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 06:40:07,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 06:40:10,922 INFO [train.py:1046] (1/4) Epoch 23, batch 1000, loss[loss=0.1666, simple_loss=0.2549, pruned_loss=0.03914, over 24293.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2501, pruned_loss=0.04853, over 4689110.53 frames. ], batch size: 74, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:40:12,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:40:14,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 06:40:14,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:40:21,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:40:21,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 06:40:21,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 06:40:25,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=785840.0, ans=15.0 2023-10-02 06:40:25,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:25,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:40:27,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:29,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 06:40:33,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 06:40:35,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 06:40:37,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:40:37,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-10-02 06:40:38,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 06:40:38,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 06:40:40,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:40,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.68 vs. limit=15.0 2023-10-02 06:40:41,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:42,773 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.055e+02 2.435e+02 3.236e+02, threshold=4.111e+02, percent-clipped=0.0 2023-10-02 06:40:50,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:51,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:40:51,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:53,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:53,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. 
Duration: 0.8 2023-10-02 06:40:54,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:40:54,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:40:55,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:56,028 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 06:40:58,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 06:41:00,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 06:41:01,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 06:41:03,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:41:03,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=785973.3333333334, ans=0.125 2023-10-02 06:41:08,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=785973.3333333334, ans=0.125 2023-10-02 06:41:09,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:11,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 06:41:11,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:12,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:41:13,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 06:41:13,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:41:14,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=786040.0, ans=0.125 2023-10-02 06:41:15,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 06:41:15,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 06:41:17,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:41:17,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:41:18,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:41:21,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:41:24,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:41:25,587 INFO [train.py:1046] (1/4) Epoch 23, batch 1050, loss[loss=0.186, simple_loss=0.2711, pruned_loss=0.05045, over 24390.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2484, pruned_loss=0.04781, over 4692087.02 frames. ], batch size: 77, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:41:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:41:27,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:41:28,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:41:29,566 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.94 vs. limit=22.5 2023-10-02 06:41:30,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:32,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:41:33,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:41:34,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=786106.6666666666, ans=0.1 2023-10-02 06:41:35,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:41:37,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 06:41:38,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:41:38,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:41:39,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:41:39,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 06:41:41,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:41:41,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 06:41:44,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:41:44,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 06:41:44,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=786173.3333333334, ans=0.0 2023-10-02 06:41:45,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:41:50,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:51,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 06:41:51,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:41:52,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=786173.3333333334, ans=0.0 2023-10-02 06:41:54,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 06:41:54,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 06:41:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:41:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 06:42:02,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 06:42:03,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:07,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 06:42:10,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 06:42:10,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:42:11,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 06:42:15,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:42:18,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 06:42:20,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 06:42:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 06:42:20,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:42:21,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:42:22,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 06:42:27,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:42:29,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:42:29,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:42:30,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:42:30,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:42:33,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.46 vs. limit=15.0 2023-10-02 06:42:34,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:34,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 06:42:37,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:42:37,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 06:42:37,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 06:42:39,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:42:42,282 INFO [train.py:1046] (1/4) Epoch 23, batch 1100, loss[loss=0.1741, simple_loss=0.2621, pruned_loss=0.04304, over 24449.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2484, pruned_loss=0.04757, over 4691532.53 frames. ], batch size: 69, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:42:43,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:42:48,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:42:51,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 06:42:52,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:42:52,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:42:52,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 06:42:53,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=786440.0, ans=0.125 2023-10-02 06:42:55,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:42:57,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=786506.6666666666, ans=0.05 2023-10-02 06:42:58,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:42:59,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:43:02,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:43:02,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 06:43:03,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 06:43:05,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:43:05,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:43:05,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=786506.6666666666, ans=0.0 2023-10-02 06:43:09,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:43:12,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:43:13,580 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.798e+02 1.908e+02 2.172e+02 3.443e+02, threshold=3.816e+02, percent-clipped=0.0 2023-10-02 06:43:16,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:43:19,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 06:43:19,340 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 06:43:19,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:22,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:23,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:43:24,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:43:25,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 06:43:26,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:43:26,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:43:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:43:26,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:26,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 06:43:32,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:43:32,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 06:43:34,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:43:39,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:43:39,698 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:43:42,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. 
Duration: 0.99 2023-10-02 06:43:42,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:43:44,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:46,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:43:48,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:43:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 06:43:49,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:43:49,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:43:50,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 06:43:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:43:52,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 06:43:52,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:43:52,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 06:43:54,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:43:57,338 INFO [train.py:1046] (1/4) Epoch 23, batch 1150, loss[loss=0.2062, simple_loss=0.2687, pruned_loss=0.07185, over 23558.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2484, pruned_loss=0.04765, over 4697362.00 frames. ], batch size: 256, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:43:58,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:00,518 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:44:00,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.58 vs. limit=15.0 2023-10-02 06:44:01,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:44:04,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:44:04,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:44:05,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 06:44:05,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:44:09,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. 
Duration: 0.69 2023-10-02 06:44:09,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:09,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:44:17,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 06:44:19,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:44:24,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:24,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:25,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 06:44:25,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:44:25,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:44:29,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 06:44:30,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:44:30,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=786906.6666666666, ans=0.1 2023-10-02 06:44:31,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:44:37,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=786906.6666666666, ans=0.125 2023-10-02 06:44:41,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=786973.3333333334, ans=0.125 2023-10-02 06:44:41,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=786973.3333333334, ans=0.125 2023-10-02 06:44:42,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:44,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=786973.3333333334, ans=0.125 2023-10-02 06:44:45,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=786973.3333333334, ans=0.125 2023-10-02 06:44:47,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:49,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 06:44:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 06:44:50,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:44:54,669 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 06:44:55,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:45:03,368 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 06:45:07,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:09,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:45:10,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:45:10,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:45:12,377 INFO [train.py:1046] (1/4) Epoch 23, batch 1200, loss[loss=0.1646, simple_loss=0.2548, pruned_loss=0.03726, over 24354.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.249, pruned_loss=0.04758, over 4709536.24 frames. ], batch size: 74, lr: 4.49e-03, grad_scale: 32.0 2023-10-02 06:45:12,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:45:15,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=787106.6666666666, ans=0.0 2023-10-02 06:45:17,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:45:17,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:45:17,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=787106.6666666666, ans=0.125 2023-10-02 06:45:19,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:45:19,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:19,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:45:20,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=787106.6666666666, ans=0.125 2023-10-02 06:45:21,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:45:23,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 06:45:23,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=787106.6666666666, ans=0.125 2023-10-02 06:45:24,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:45:24,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:45:27,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 06:45:30,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 06:45:33,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:45:33,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=787173.3333333334, ans=0.0 2023-10-02 06:45:36,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:45:37,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:45:39,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:45:40,378 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 06:45:42,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:45:43,665 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 1.812e+02 2.009e+02 2.421e+02 3.393e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 06:45:49,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:45:49,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:45:49,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 06:45:51,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:45:53,860 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.93 vs. limit=15.0 2023-10-02 06:45:54,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 06:45:58,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 06:45:58,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:59,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:46:01,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:46:02,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:46:02,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:46:02,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:46:04,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:46:05,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 06:46:06,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:46:07,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:46:07,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:46:08,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:46:08,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:13,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:46:16,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 06:46:17,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=787373.3333333334, ans=0.05 2023-10-02 06:46:19,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 06:46:22,547 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 06:46:24,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:46:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:46:27,469 INFO [train.py:1046] (1/4) Epoch 23, batch 1250, loss[loss=0.1742, simple_loss=0.2533, pruned_loss=0.04757, over 24339.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2494, pruned_loss=0.04789, over 4721633.54 frames. ], batch size: 61, lr: 4.49e-03, grad_scale: 32.0 2023-10-02 06:46:27,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:46:28,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:46:33,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 06:46:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:46:37,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 06:46:38,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=787440.0, ans=0.125 2023-10-02 06:46:39,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 06:46:40,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:46:42,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:46:45,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:46:45,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:46:46,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:46:46,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:46:48,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:46:53,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:46:53,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:46:53,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 06:46:54,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:46:54,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:46:57,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:57,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:47:03,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 06:47:03,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:47:06,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:47:08,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 06:47:09,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:47:09,440 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 06:47:09,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:09,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 06:47:11,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=787640.0, ans=0.125 2023-10-02 06:47:12,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:47:15,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:47:15,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:47:15,932 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:47:17,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 06:47:17,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 06:47:17,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 06:47:23,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:47:24,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 06:47:24,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:28,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-10-02 06:47:28,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:47:28,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=787706.6666666666, ans=0.0 2023-10-02 06:47:32,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 06:47:32,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:47:32,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:47:32,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 06:47:32,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:47:34,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 06:47:36,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:47:38,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:47:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 06:47:39,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=787706.6666666666, ans=0.125 2023-10-02 06:47:41,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-10-02 06:47:42,160 INFO [train.py:1046] (1/4) Epoch 23, batch 1300, loss[loss=0.1825, simple_loss=0.2472, pruned_loss=0.05891, over 23519.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2501, pruned_loss=0.04864, over 4714330.04 frames. ], batch size: 120, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:47:43,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:47:44,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:47:45,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 06:47:51,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:47:52,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:47:53,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:47:55,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:57,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 06:47:57,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 06:48:01,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:48:01,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=787840.0, ans=0.125 2023-10-02 06:48:02,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:48:03,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 06:48:06,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:48:09,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:10,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:48:12,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:48:12,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:13,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 06:48:15,083 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.873e+02 2.076e+02 2.335e+02 3.601e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 06:48:15,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:48:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 06:48:21,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:48:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:48:24,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 06:48:25,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:48:27,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:48:28,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:48:30,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 06:48:30,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:48:30,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-10-02 06:48:31,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:48:35,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:48:35,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:48:39,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 06:48:41,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 06:48:41,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 06:48:46,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:48:47,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 06:48:49,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:49,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.65 vs. limit=12.0 2023-10-02 06:48:55,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=788040.0, ans=0.125 2023-10-02 06:48:55,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.56 vs. 
limit=15.0 2023-10-02 06:48:56,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 06:48:57,843 INFO [train.py:1046] (1/4) Epoch 23, batch 1350, loss[loss=0.1705, simple_loss=0.2511, pruned_loss=0.045, over 24497.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2501, pruned_loss=0.04871, over 4703284.00 frames. ], batch size: 63, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:48:59,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:00,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:02,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=788106.6666666666, ans=0.125 2023-10-02 06:49:04,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:49:05,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:49:07,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=788106.6666666666, ans=0.0 2023-10-02 06:49:08,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 06:49:09,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=788106.6666666666, ans=0.2 2023-10-02 06:49:11,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:49:14,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 06:49:14,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:49:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:49:18,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 06:49:18,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:49:20,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:49:20,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 06:49:23,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 06:49:24,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 06:49:26,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:49:26,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 06:49:37,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:38,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=788240.0, ans=0.2 2023-10-02 06:49:46,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:46,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:49:47,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 06:49:51,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:49:52,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 06:49:52,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:49:53,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:55,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:49:58,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-10-02 06:49:59,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:50:01,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.60 vs. limit=15.0 2023-10-02 06:50:04,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 06:50:06,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 06:50:13,338 INFO [train.py:1046] (1/4) Epoch 23, batch 1400, loss[loss=0.1697, simple_loss=0.2457, pruned_loss=0.0469, over 23500.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2479, pruned_loss=0.04832, over 4694050.37 frames. ], batch size: 134, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:50:13,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 06:50:14,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:50:16,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:50:16,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=788440.0, ans=0.2 2023-10-02 06:50:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:50:22,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-10-02 06:50:23,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 06:50:30,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=788506.6666666666, ans=0.2 2023-10-02 06:50:31,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=788506.6666666666, ans=0.1 2023-10-02 06:50:34,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=788506.6666666666, ans=0.125 2023-10-02 06:50:35,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:50:37,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:50:39,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:50:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:50:44,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:50:44,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 06:50:46,775 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.899e+02 2.083e+02 2.387e+02 3.639e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 06:50:47,490 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.22 vs. 
limit=10.0 2023-10-02 06:50:53,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:50:54,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:50:57,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 06:50:58,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:50:58,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:51:00,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:51:00,923 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=12.0 2023-10-02 06:51:02,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:51:02,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:51:02,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:51:02,803 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.05 vs. limit=12.0 2023-10-02 06:51:03,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 06:51:03,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 06:51:04,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:51:09,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:13,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:51:18,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 06:51:20,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:51:21,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:51:24,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 06:51:24,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:27,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:51:28,910 INFO [train.py:1046] (1/4) Epoch 23, batch 1450, loss[loss=0.1496, simple_loss=0.2297, pruned_loss=0.0347, over 24557.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2476, pruned_loss=0.04794, over 4696098.06 frames. 
], batch size: 60, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:51:29,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:51:30,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:51:32,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:32,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 06:51:36,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:38,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:51:39,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:51:39,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 06:51:41,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:51:41,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 06:51:43,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:51:44,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:44,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 06:51:44,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:51:45,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:51:47,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 06:51:47,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:48,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:51:50,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:51,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:54,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:51:54,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:51:56,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 06:51:57,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:59,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:59,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:51:59,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:59,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=788906.6666666666, ans=0.1 2023-10-02 06:52:00,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:03,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 06:52:06,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:52:08,652 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 06:52:10,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:52:11,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:52:13,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 06:52:14,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 06:52:17,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:17,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 06:52:19,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 06:52:20,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:25,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:52:25,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:52:26,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 06:52:30,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 06:52:30,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 06:52:30,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=789040.0, ans=0.125 2023-10-02 06:52:31,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 06:52:33,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:52:41,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=789040.0, ans=0.125 2023-10-02 06:52:44,875 INFO [train.py:1046] (1/4) Epoch 23, batch 1500, loss[loss=0.1751, simple_loss=0.2517, pruned_loss=0.04925, over 23377.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2485, pruned_loss=0.04805, over 4705086.77 frames. ], batch size: 105, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:52:44,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 06:52:44,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:52:44,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:52:46,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:47,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:52:47,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:52:49,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 06:52:50,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 06:52:50,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:52:51,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:52:51,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:52:53,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:52:55,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:55,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=789106.6666666666, ans=0.125 2023-10-02 06:52:59,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:59,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 06:53:00,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:53:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:53:02,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:53:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. 
Duration: 0.87 2023-10-02 06:53:04,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=789173.3333333334, ans=0.1 2023-10-02 06:53:08,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 06:53:08,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=789173.3333333334, ans=0.125 2023-10-02 06:53:09,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.07 vs. limit=12.0 2023-10-02 06:53:09,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:53:10,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 06:53:12,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:53:15,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:53:17,161 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.895e+02 2.081e+02 2.533e+02 4.073e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-02 06:53:17,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:53:17,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:53:19,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. 
Duration: 0.72 2023-10-02 06:53:19,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:53:19,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:53:21,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 06:53:21,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:53:25,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:53:25,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 06:53:27,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=789306.6666666666, ans=0.125 2023-10-02 06:53:30,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=789306.6666666666, ans=0.2 2023-10-02 06:53:33,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:53:35,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:53:39,530 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 06:53:39,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 06:53:39,592 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 06:53:40,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:53:42,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:53:42,909 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 06:53:44,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:53:47,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 06:53:47,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=789373.3333333334, ans=0.125 2023-10-02 06:53:49,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:49,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=789373.3333333334, ans=0.125 2023-10-02 06:53:52,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:53:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:53,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 06:53:53,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:54,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:53:56,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 06:53:56,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 06:53:57,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:53:57,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 06:53:57,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 06:53:57,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=789440.0, ans=0.0 2023-10-02 06:53:59,026 INFO [train.py:1046] (1/4) Epoch 23, batch 1550, loss[loss=0.1872, simple_loss=0.2679, pruned_loss=0.05329, over 23926.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2495, pruned_loss=0.04848, over 4709551.58 frames. ], batch size: 86, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:54:01,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:54:02,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:54:02,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:54:02,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:54:02,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:03,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:07,286 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 06:54:07,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:08,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:54:08,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:54:11,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:54:11,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 06:54:12,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:54:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-02 06:54:14,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 06:54:14,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 06:54:16,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:16,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:17,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=15.0 2023-10-02 06:54:20,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:54:22,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 06:54:22,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 06:54:24,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=15.0 2023-10-02 06:54:29,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:31,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=789573.3333333334, ans=0.125 2023-10-02 06:54:33,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 06:54:33,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:54:35,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:54:35,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 06:54:42,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:54:43,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:44,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.59 vs. limit=15.0 2023-10-02 06:54:47,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:54:49,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:54:50,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:50,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 06:54:50,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:54:52,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 06:54:52,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:52,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 06:54:52,420 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 06:54:54,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:56,217 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.09 vs. limit=15.0 2023-10-02 06:55:00,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 06:55:03,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:55:05,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:06,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 06:55:08,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:55:08,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:55:08,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 06:55:09,656 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.24 vs. limit=15.0 2023-10-02 06:55:10,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:55:10,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:55:14,654 INFO [train.py:1046] (1/4) Epoch 23, batch 1600, loss[loss=0.1818, simple_loss=0.2526, pruned_loss=0.05555, over 23576.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2507, pruned_loss=0.04934, over 4692410.26 frames. ], batch size: 256, lr: 4.48e-03, grad_scale: 32.0 2023-10-02 06:55:14,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:14,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 06:55:16,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 06:55:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 06:55:20,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:55:20,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 06:55:22,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 06:55:24,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:55:29,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:55:32,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 06:55:34,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:55:34,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=789840.0, ans=0.2 2023-10-02 06:55:35,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 06:55:35,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:35,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 06:55:41,149 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.14 vs. limit=10.0 2023-10-02 06:55:42,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=789906.6666666666, ans=0.1 2023-10-02 06:55:43,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-02 06:55:46,670 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.856e+02 2.043e+02 2.277e+02 4.874e+02, threshold=4.086e+02, percent-clipped=2.0 2023-10-02 06:55:49,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=789906.6666666666, ans=0.1 2023-10-02 06:55:51,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:51,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 06:55:52,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:52,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:55:52,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:55:55,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 06:56:00,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 06:56:01,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:56:01,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 06:56:02,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:04,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:56:05,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:56:07,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:56:08,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:56:09,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=789973.3333333334, ans=0.0 2023-10-02 06:56:09,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=789973.3333333334, ans=0.125 2023-10-02 06:56:14,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:16,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:56:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 06:56:18,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 06:56:18,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 06:56:24,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:56:25,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:56:25,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:56:25,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=790040.0, ans=0.125 2023-10-02 06:56:27,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 06:56:27,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 06:56:27,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 06:56:27,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 06:56:27,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=790106.6666666666, ans=0.125 2023-10-02 06:56:28,502 INFO [train.py:1046] (1/4) Epoch 23, batch 1650, loss[loss=0.1729, simple_loss=0.239, pruned_loss=0.05346, over 23714.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2508, pruned_loss=0.0491, over 4692521.38 frames. 
], batch size: 164, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:56:31,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:31,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:56:31,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:56:31,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:56:34,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:56:35,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 06:56:37,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:56:38,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:56:38,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:56:38,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:56:40,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. 
Duration: 0.73 2023-10-02 06:56:40,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 06:56:45,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=790173.3333333334, ans=0.1 2023-10-02 06:56:46,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:56:47,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:57:00,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 06:57:02,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:03,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 06:57:05,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:06,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:57:08,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:57:08,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 06:57:09,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=790240.0, ans=0.125 2023-10-02 06:57:10,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:57:10,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:12,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:57:14,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:14,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:57:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:57:15,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:57:15,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:57:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:57:19,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 06:57:21,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 06:57:22,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 06:57:23,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 06:57:24,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 06:57:24,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:57:24,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:57:24,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:26,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:26,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 06:57:31,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:33,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:57:34,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:36,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. 
Duration: 0.96 2023-10-02 06:57:40,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:40,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:57:40,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 06:57:41,919 INFO [train.py:1046] (1/4) Epoch 23, batch 1700, loss[loss=0.16, simple_loss=0.2392, pruned_loss=0.04038, over 24660.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2489, pruned_loss=0.04835, over 4692166.80 frames. ], batch size: 65, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:57:42,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:57:42,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:57:42,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:57:45,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:57:45,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:57:46,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=790440.0, ans=0.1 2023-10-02 06:57:47,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. 
Duration: 0.92 2023-10-02 06:57:50,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:57:57,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=790506.6666666666, ans=0.125 2023-10-02 06:57:58,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:00,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:58:04,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:58:06,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:58:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:58:06,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:58:09,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 06:58:10,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:58:10,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 06:58:13,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:58:15,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 06:58:16,796 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.870e+02 2.002e+02 2.327e+02 4.481e+02, threshold=4.004e+02, percent-clipped=3.0 2023-10-02 06:58:16,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 06:58:18,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:20,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 06:58:20,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:58:24,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=790573.3333333334, ans=0.04949747468305833 2023-10-02 06:58:27,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=790640.0, ans=0.125 2023-10-02 06:58:30,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:30,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:58:30,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 06:58:33,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:58:33,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 06:58:33,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:58:34,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=790640.0, ans=0.0 2023-10-02 06:58:35,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:35,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 06:58:37,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:58:37,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:58:37,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:37,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:58:40,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:58:40,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 06:58:42,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:58:43,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:58:43,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:45,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=790706.6666666666, ans=22.5 2023-10-02 06:58:48,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:48,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=790706.6666666666, ans=0.04949747468305833 2023-10-02 06:58:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 06:58:51,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:52,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:54,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 06:58:57,216 INFO [train.py:1046] (1/4) Epoch 23, batch 1750, loss[loss=0.1677, simple_loss=0.2364, pruned_loss=0.04955, over 23593.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2473, pruned_loss=0.04749, over 4706254.95 frames. 
], batch size: 256, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:59:00,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:01,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:01,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:59:02,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 06:59:02,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:59:03,956 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.23 vs. limit=22.5 2023-10-02 06:59:05,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:59:05,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:10,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=790840.0, ans=0.125 2023-10-02 06:59:12,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. 
Duration: 0.43 2023-10-02 06:59:14,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=790840.0, ans=15.0 2023-10-02 06:59:15,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:16,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 06:59:16,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:59:17,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:59:19,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 06:59:21,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 06:59:22,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:59:24,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 06:59:28,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=790906.6666666666, ans=0.0 2023-10-02 06:59:31,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 06:59:31,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=790906.6666666666, ans=0.2 2023-10-02 06:59:32,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:59:32,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:59:35,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:35,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:59:38,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:59:38,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:59:41,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:59:43,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 06:59:45,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:59:48,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. 
Duration: 0.8 2023-10-02 06:59:48,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:59:50,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:51,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:59:55,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=22.5 2023-10-02 06:59:56,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:59:57,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:59:59,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:00:00,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:00:03,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-10-02 07:00:04,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:00:06,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 07:00:07,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:00:08,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 07:00:08,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:00:09,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:00:09,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:09,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:00:09,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:00:10,938 INFO [train.py:1046] (1/4) Epoch 23, batch 1800, loss[loss=0.1686, simple_loss=0.2454, pruned_loss=0.0459, over 24431.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2475, pruned_loss=0.047, over 4713108.79 frames. ], batch size: 58, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:00:11,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:00:13,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:00:15,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:00:15,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=791106.6666666666, ans=0.125 2023-10-02 07:00:18,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:00:19,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:00:21,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.87 vs. limit=10.0 2023-10-02 07:00:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:00:23,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:00:25,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:00:28,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:28,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=791173.3333333334, ans=0.035 2023-10-02 07:00:29,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:00:30,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=791173.3333333334, ans=0.05 2023-10-02 07:00:31,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:00:32,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:00:32,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 07:00:32,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:35,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:39,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 07:00:41,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 07:00:43,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 07:00:43,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:00:45,213 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.867e+02 2.054e+02 2.379e+02 4.412e+02, threshold=4.108e+02, percent-clipped=1.0 2023-10-02 07:00:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:00:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:00:45,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:00:52,839 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 07:00:54,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:00:55,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0 2023-10-02 07:00:57,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:58,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 07:01:00,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 07:01:00,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:01:01,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:01:01,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:01:04,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. 
Duration: 0.67 2023-10-02 07:01:06,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=791306.6666666666, ans=0.1 2023-10-02 07:01:11,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:01:11,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 07:01:11,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:01:11,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:01:13,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:01:13,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 07:01:15,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:01:15,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:01:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 07:01:20,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:01:22,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:01:22,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:01:23,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:01:24,780 INFO [train.py:1046] (1/4) Epoch 23, batch 1850, loss[loss=0.1746, simple_loss=0.2495, pruned_loss=0.04988, over 23682.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2478, pruned_loss=0.04709, over 4707833.30 frames. ], batch size: 149, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:01:24,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:01:24,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:01:27,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:01:27,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:01:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:01:32,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:01:39,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:01:39,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. 
Duration: 0.89 2023-10-02 07:01:42,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 07:01:43,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 07:01:47,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:01:47,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 07:01:47,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 07:01:52,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=791506.6666666666, ans=0.125 2023-10-02 07:01:57,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:01:59,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 07:02:00,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:02:00,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:02:06,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 07:02:06,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:02:06,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:02:07,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:02:09,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=791640.0, ans=0.125 2023-10-02 07:02:10,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:02:11,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:02:12,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=791640.0, ans=0.125 2023-10-02 07:02:16,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:02:16,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:16,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 07:02:16,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:02:18,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=791640.0, ans=0.1 2023-10-02 07:02:19,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:02:20,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:02:24,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 07:02:24,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:02:27,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:02:27,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:02:27,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 07:02:28,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 07:02:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 07:02:31,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 07:02:33,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 07:02:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:02:34,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:02:34,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:35,808 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 07:02:35,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:02:35,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:37,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:02:37,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:02:37,988 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.22 vs. limit=15.0 2023-10-02 07:02:38,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.49 vs. limit=15.0 2023-10-02 07:02:38,713 INFO [train.py:1046] (1/4) Epoch 23, batch 1900, loss[loss=0.1548, simple_loss=0.2431, pruned_loss=0.03323, over 24503.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2482, pruned_loss=0.04743, over 4707143.53 frames. 
], batch size: 66, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:02:38,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:02:38,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 07:02:40,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:41,320 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 07:02:41,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:02:42,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:48,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:50,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:02:51,659 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 07:02:51,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 07:02:53,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:02:55,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:02:55,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 07:02:55,129 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 07:02:58,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 07:03:01,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:03:05,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 07:03:05,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 07:03:13,810 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.937e+02 2.220e+02 2.601e+02 3.701e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-02 07:03:13,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 07:03:17,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 07:03:18,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:03:19,079 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:03:20,150 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 07:03:20,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-10-02 07:03:20,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 07:03:21,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 07:03:21,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:03:24,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 07:03:29,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:03:31,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:03:31,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 07:03:32,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:03:35,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 07:03:35,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:03:41,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:03:41,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 07:03:42,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:03:42,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:03:45,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:03:45,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:03:45,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=792040.0, ans=0.09899494936611666 2023-10-02 07:03:47,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:03:51,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:03:51,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:03:52,942 INFO [train.py:1046] (1/4) Epoch 23, batch 1950, loss[loss=0.1902, simple_loss=0.2592, pruned_loss=0.06062, over 23535.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2486, pruned_loss=0.0474, over 4723803.59 frames. ], batch size: 256, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:03:54,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:03:54,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:03:54,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:03:56,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:03:59,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:04:01,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:04:01,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:01,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:04:03,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 07:04:05,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 07:04:05,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:05,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:07,047 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:04:09,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 07:04:09,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:10,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:12,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:04:12,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=792173.3333333334, ans=0.125 2023-10-02 07:04:13,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:04:13,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:04:13,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:04:13,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:17,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:21,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:04:21,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:21,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 07:04:21,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 07:04:21,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:04:21,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:04:21,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=792240.0, ans=0.0 2023-10-02 07:04:22,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:26,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:29,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:04:34,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:04:37,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:04:37,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:04:37,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 07:04:37,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:04:38,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=792306.6666666666, ans=0.2 2023-10-02 07:04:40,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:41,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:04:41,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:04:42,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:04:46,077 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.97 vs. limit=15.0 2023-10-02 07:04:47,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:47,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:50,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:50,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:52,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:04:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:57,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:04:58,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:59,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 07:04:59,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:05:00,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:05:00,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 07:05:03,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:05:05,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=792373.3333333334, ans=0.125 2023-10-02 07:05:06,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:05:07,951 INFO [train.py:1046] (1/4) Epoch 23, batch 2000, loss[loss=0.1464, simple_loss=0.2207, pruned_loss=0.03608, over 21609.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2498, pruned_loss=0.04862, over 4704124.68 frames. 
], batch size: 47, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:05:08,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:05:08,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:05:08,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=792440.0, ans=0.125 2023-10-02 07:05:09,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:05:11,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.34 vs. limit=15.0 2023-10-02 07:05:12,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:12,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=792440.0, ans=0.0 2023-10-02 07:05:14,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 07:05:14,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:05:17,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 07:05:18,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=792440.0, ans=0.2 2023-10-02 07:05:19,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 07:05:21,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:05:21,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:05:22,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:05:25,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 07:05:27,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:28,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.84 vs. limit=15.0 2023-10-02 07:05:30,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 07:05:31,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 07:05:33,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 07:05:33,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:05:34,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=22.5 2023-10-02 07:05:37,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:05:39,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:05:39,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:39,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:05:40,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:05:40,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 07:05:43,151 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.945e+02 2.106e+02 2.300e+02 3.133e+02, threshold=4.213e+02, percent-clipped=0.0 2023-10-02 07:05:44,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 07:05:44,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 07:05:44,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:05:50,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:50,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:05:50,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:05:51,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:05:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:05:55,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:55,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:05:55,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:56,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:59,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:06:00,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-10-02 07:06:04,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:06:06,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:10,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:10,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:06:13,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:14,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:06:14,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:16,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:06:17,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:06:18,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:20,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:06:21,945 INFO [train.py:1046] (1/4) Epoch 23, batch 2050, loss[loss=0.1916, simple_loss=0.2737, pruned_loss=0.05472, over 24002.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2496, pruned_loss=0.0483, over 4718959.54 frames. ], batch size: 80, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:06:23,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:06:24,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:29,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:06:29,861 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:06:31,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:06:32,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:06:35,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 07:06:35,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:06:37,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:06:37,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:06:45,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:06:45,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:49,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 07:06:50,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:50,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 07:06:51,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:06:55,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:06:58,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:00,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:07:01,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:07:03,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:07:03,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=792906.6666666666, ans=0.0 2023-10-02 07:07:04,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:07:04,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:07:07,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:09,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:07:11,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:07:11,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:07:15,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:07:19,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:07:21,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 07:07:27,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 07:07:28,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:07:31,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=793040.0, ans=0.125 2023-10-02 07:07:32,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:07:33,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 07:07:36,276 INFO [train.py:1046] (1/4) Epoch 23, batch 2100, loss[loss=0.1512, simple_loss=0.2281, pruned_loss=0.03712, over 24421.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2484, pruned_loss=0.04785, over 4712518.00 frames. ], batch size: 58, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:07:38,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 07:07:38,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:07:38,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:39,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:07:42,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:07:42,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. 
Duration: 0.55 2023-10-02 07:07:42,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 07:07:43,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:07:44,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:07:45,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:07:47,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:07:49,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:07:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 07:07:49,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:07:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 07:07:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 07:07:50,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:07:50,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 07:07:51,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 07:07:52,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:07:56,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 07:07:56,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:08:02,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:08:02,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:08:06,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:08:06,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 07:08:06,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:06,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 07:08:08,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=793240.0, ans=0.035 2023-10-02 07:08:09,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. 
Duration: 0.87 2023-10-02 07:08:09,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:09,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 07:08:09,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 07:08:10,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.40 vs. limit=15.0 2023-10-02 07:08:11,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 07:08:11,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:08:12,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.417e+02 1.829e+02 2.013e+02 2.459e+02 4.112e+02, threshold=4.025e+02, percent-clipped=0.0 2023-10-02 07:08:12,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:08:15,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:08:16,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:08:18,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:08:18,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=793240.0, ans=0.0 2023-10-02 07:08:19,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:19,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 07:08:19,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:19,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:21,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:21,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 07:08:23,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 07:08:23,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 07:08:28,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:08:32,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:08:32,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-10-02 07:08:32,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=793306.6666666666, ans=0.125 2023-10-02 07:08:39,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:41,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:08:42,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:08:42,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:08:43,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 07:08:43,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:08:44,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:44,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:08:45,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:08:47,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:48,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-10-02 07:08:50,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 07:08:50,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:08:51,397 INFO [train.py:1046] (1/4) Epoch 23, batch 2150, loss[loss=0.1931, simple_loss=0.2617, pruned_loss=0.06228, over 23834.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2471, pruned_loss=0.04758, over 4693222.05 frames. ], batch size: 195, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:08:53,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:53,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:08:53,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:08:54,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:08:58,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 07:09:01,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:01,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:05,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 07:09:05,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:06,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:09:11,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:11,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:09:11,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:09:14,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=793506.6666666666, ans=0.125 2023-10-02 07:09:15,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:15,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 07:09:19,343 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=22.5 2023-10-02 07:09:20,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 07:09:21,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:21,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:22,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.84 vs. limit=15.0 2023-10-02 07:09:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:23,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:09:24,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:24,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:09:24,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:09:24,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=793573.3333333334, ans=0.125 2023-10-02 07:09:25,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 07:09:27,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 07:09:28,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:28,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:29,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:09:31,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:09:34,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:34,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:09:34,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:34,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 07:09:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:09:38,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:40,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:41,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:09:41,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=793640.0, ans=0.125 2023-10-02 07:09:42,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:09:42,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:44,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:44,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 07:09:44,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=793640.0, ans=0.0 2023-10-02 07:09:45,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 07:09:45,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:09:46,985 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 07:09:48,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:48,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:09:48,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. 
Duration: 0.64 2023-10-02 07:09:48,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:09:48,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 07:09:48,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 07:09:48,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 07:09:48,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 07:09:51,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:51,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:09:51,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:52,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:09:52,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:54,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:10:01,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:10:02,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 07:10:06,057 INFO [train.py:1046] (1/4) Epoch 23, batch 2200, loss[loss=0.1874, simple_loss=0.2698, pruned_loss=0.05246, over 24403.00 frames. ], tot_loss[loss=0.171, simple_loss=0.247, pruned_loss=0.04752, over 4696759.62 frames. ], batch size: 77, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:10:08,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:10:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:12,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:10:14,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:14,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:10:17,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:10:17,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:10:17,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. 
Duration: 0.88 2023-10-02 07:10:22,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 07:10:24,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:10:28,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 07:10:31,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:32,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:10:34,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:10:37,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:10:37,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 07:10:42,796 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.834e+02 1.960e+02 2.207e+02 4.164e+02, threshold=3.921e+02, percent-clipped=1.0 2023-10-02 07:10:42,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:10:44,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:44,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 07:10:48,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:10:49,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:10:51,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:10:52,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:54,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 07:10:55,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:10:58,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 07:10:59,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:59,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:11:00,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:11:02,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:11:02,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:11:02,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:11:02,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:11:03,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:11:03,975 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:11:05,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:11:07,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:11:10,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 07:11:10,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:11:13,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:11:14,929 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 07:11:16,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:11:16,447 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. 
Duration: 0.896 2023-10-02 07:11:17,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:11:19,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 07:11:20,462 INFO [train.py:1046] (1/4) Epoch 23, batch 2250, loss[loss=0.165, simple_loss=0.2542, pruned_loss=0.0379, over 24310.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2473, pruned_loss=0.04751, over 4706539.89 frames. ], batch size: 74, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:11:21,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:11:21,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:11:23,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:11:24,737 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 07:11:26,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:11:28,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:11:34,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:11:34,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 07:11:37,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:38,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=12.0 2023-10-02 07:11:39,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:11:40,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:11:42,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 07:11:42,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:11:42,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:11:44,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 07:11:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:11:44,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:11:44,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=794173.3333333334, ans=0.125 2023-10-02 07:11:46,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:11:51,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:11:51,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:11:52,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:11:54,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 07:11:55,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:57,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:11:57,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=794240.0, ans=0.0 2023-10-02 07:12:00,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:12:02,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:12:04,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:12:04,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:12:07,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:12:07,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=794306.6666666666, ans=0.125 2023-10-02 07:12:09,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:12:13,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.90 vs. limit=15.0 2023-10-02 07:12:13,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:12:17,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:12:22,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:12:22,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:12:22,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:12:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 07:12:29,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:12:29,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 07:12:29,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:31,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:12:33,777 INFO [train.py:1046] (1/4) Epoch 23, batch 2300, loss[loss=0.1933, simple_loss=0.2625, pruned_loss=0.06202, over 22712.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2486, pruned_loss=0.04771, over 4711742.99 frames. ], batch size: 322, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:12:33,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 07:12:36,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:12:36,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:42,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:44,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:12:46,381 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-10-02 07:12:46,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=794440.0, ans=0.125 2023-10-02 07:12:47,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:51,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.80 vs. limit=22.5 2023-10-02 07:12:52,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=794506.6666666666, ans=0.0 2023-10-02 07:12:55,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:12:55,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:12:55,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:12:56,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:56,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 07:12:56,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:12:59,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 07:13:00,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:13:03,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:13:04,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:13:05,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=794573.3333333334, ans=0.125 2023-10-02 07:13:07,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:13:08,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=794573.3333333334, ans=0.125 2023-10-02 07:13:09,518 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.881e+02 2.032e+02 2.329e+02 3.115e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-02 07:13:09,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=794573.3333333334, ans=0.125 2023-10-02 07:13:14,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:13:14,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:13:17,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 07:13:20,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:13:23,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:13:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:13:25,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:13:25,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 07:13:28,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:13:28,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:13:29,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:13:29,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:13:29,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:13:30,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 07:13:30,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 07:13:32,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 07:13:32,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:13:32,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:13:33,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 07:13:36,849 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:13:37,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:13:39,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:13:44,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=794706.6666666666, ans=0.125 2023-10-02 07:13:46,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:13:47,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:13:47,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:13:48,592 INFO [train.py:1046] (1/4) Epoch 23, batch 2350, loss[loss=0.2016, simple_loss=0.2802, pruned_loss=0.06156, over 23343.00 frames. 
], tot_loss[loss=0.1725, simple_loss=0.2491, pruned_loss=0.04791, over 4721060.15 frames. ], batch size: 93, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:13:48,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:13:48,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:13:48,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:13:50,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 07:13:50,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=794773.3333333334, ans=0.2 2023-10-02 07:13:56,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:13:56,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 07:13:56,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=794773.3333333334, ans=0.025 2023-10-02 07:14:01,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 07:14:02,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=794840.0, ans=0.04949747468305833 2023-10-02 07:14:04,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:14:08,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:08,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:14:08,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:14:10,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 07:14:11,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:14:17,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 07:14:18,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:14:21,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:14:21,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:14:22,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=794906.6666666666, ans=0.0 2023-10-02 07:14:24,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 07:14:26,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 07:14:26,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:14:27,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:14:27,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:14:27,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:14:30,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:14:31,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 07:14:33,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:14:34,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:34,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:14:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. 
Duration: 0.54 2023-10-02 07:14:37,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=794973.3333333334, ans=0.125 2023-10-02 07:14:39,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:14:43,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 07:14:43,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:14:47,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 07:14:51,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 07:14:52,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:14:52,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:14:52,824 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 07:14:52,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 07:14:54,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 07:14:57,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:15:01,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:15:02,811 INFO [train.py:1046] (1/4) Epoch 23, batch 2400, loss[loss=0.1855, simple_loss=0.2473, pruned_loss=0.06182, over 23777.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2491, pruned_loss=0.04838, over 4707684.08 frames. ], batch size: 179, lr: 4.47e-03, grad_scale: 32.0 2023-10-02 07:15:05,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:15:07,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:15:09,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 07:15:09,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 07:15:16,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:15:16,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:15:18,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 07:15:20,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:15:20,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:15:21,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 07:15:27,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:29,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 07:15:30,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=795173.3333333334, ans=0.07 2023-10-02 07:15:33,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:15:36,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 07:15:37,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:15:39,698 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.839e+02 2.014e+02 2.319e+02 3.519e+02, threshold=4.028e+02, percent-clipped=0.0 2023-10-02 07:15:40,295 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=22.5 2023-10-02 07:15:41,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:43,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:15:44,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. 
Duration: 0.73 2023-10-02 07:15:44,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:15:51,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:15:54,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:15:57,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:15:59,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:15:59,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:15:59,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:15:59,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:15:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:16:00,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:16:03,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:16:05,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 07:16:05,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 07:16:07,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 07:16:08,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:16:08,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:16:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 07:16:10,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 07:16:10,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 07:16:10,572 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 07:16:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 07:16:13,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:16:14,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:14,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:16:15,964 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 07:16:17,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:18,740 INFO [train.py:1046] (1/4) Epoch 23, batch 2450, loss[loss=0.1598, simple_loss=0.2082, pruned_loss=0.0557, over 18928.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2465, pruned_loss=0.04813, over 4675582.19 frames. ], batch size: 388, lr: 4.47e-03, grad_scale: 32.0 2023-10-02 07:16:18,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:16:22,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:16:22,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:16:26,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:26,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:16:27,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 07:16:32,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:16:32,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:16:36,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:16:36,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:16:36,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:16:37,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 07:16:41,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:44,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:16:44,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:16:47,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:16:47,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:16:49,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:16:49,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:16:50,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=795573.3333333334, ans=0.2 2023-10-02 07:16:52,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 07:16:53,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:17:01,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:02,258 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:17:03,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:17:03,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:05,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:17:05,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:06,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:17:06,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 07:17:09,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 07:17:11,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:17:14,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:17:14,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:17,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:17:19,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 07:17:19,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:17:20,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:17:20,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 07:17:22,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:17:23,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:17:25,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:17:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:17:27,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:17:30,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 07:17:30,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=795706.6666666666, ans=0.2 2023-10-02 07:17:32,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:17:33,346 INFO [train.py:1046] (1/4) Epoch 23, batch 2500, loss[loss=0.1654, simple_loss=0.2491, pruned_loss=0.04091, over 24487.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2463, pruned_loss=0.048, over 4685485.63 frames. ], batch size: 66, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:17:38,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=795773.3333333334, ans=0.2 2023-10-02 07:17:39,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:17:43,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=795773.3333333334, ans=0.0 2023-10-02 07:17:47,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:17:49,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:50,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:17:50,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 07:17:56,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:17:58,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:17:58,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:17:58,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:17:58,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.42 vs. limit=22.5 2023-10-02 07:17:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 07:18:00,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:02,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:18:03,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 07:18:03,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:03,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-10-02 07:18:04,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:09,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:18:09,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:18:11,067 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.892e+02 2.068e+02 2.319e+02 3.851e+02, threshold=4.136e+02, percent-clipped=0.0 2023-10-02 07:18:12,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:18:12,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=795906.6666666666, ans=0.0 2023-10-02 07:18:12,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=795906.6666666666, ans=0.1 2023-10-02 07:18:14,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 07:18:14,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:18:16,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:18:18,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=795973.3333333334, ans=0.2 2023-10-02 07:18:19,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=12.0 2023-10-02 07:18:20,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:21,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=795973.3333333334, ans=0.125 2023-10-02 07:18:22,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:26,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:18:32,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:18:34,855 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.12 vs. limit=22.5 2023-10-02 07:18:35,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 07:18:35,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:18:35,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. 
Duration: 0.42725 2023-10-02 07:18:36,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:18:36,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:18:38,191 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 07:18:38,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 07:18:38,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 07:18:40,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:42,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 07:18:42,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 07:18:44,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:18:46,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 07:18:49,413 INFO [train.py:1046] (1/4) Epoch 23, batch 2550, loss[loss=0.185, simple_loss=0.2657, pruned_loss=0.05216, over 24026.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2469, pruned_loss=0.04762, over 4698216.24 frames. ], batch size: 86, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:18:50,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. 
Duration: 0.98 2023-10-02 07:18:52,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:18:53,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:18:54,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=796106.6666666666, ans=0.2 2023-10-02 07:18:55,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:18:56,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:18:58,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 07:18:58,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:19:02,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 07:19:04,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:19:05,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:06,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:19:06,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. 
Duration: 0.4555625 2023-10-02 07:19:06,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:19:06,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:19:07,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=796173.3333333334, ans=0.0 2023-10-02 07:19:08,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:19:10,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:19:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 07:19:12,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:19:12,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:12,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 07:19:24,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:19:30,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:19:30,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 07:19:30,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:19:30,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:19:37,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:19:39,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:19:39,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:19:39,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:19:39,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:19:40,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:19:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:19:43,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:50,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:19:50,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-02 07:19:50,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:19:51,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:51,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=796373.3333333334, ans=0.125 2023-10-02 07:19:53,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:19:53,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:19:55,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:00,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:20:00,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=796373.3333333334, ans=0.2 2023-10-02 07:20:03,554 INFO [train.py:1046] (1/4) Epoch 23, batch 2600, loss[loss=0.1771, simple_loss=0.2518, pruned_loss=0.05121, over 23817.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2484, pruned_loss=0.04793, over 4708648.83 frames. ], batch size: 212, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:20:03,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:06,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. 
Duration: 0.896 2023-10-02 07:20:09,142 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 07:20:09,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:20:09,186 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 07:20:09,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 07:20:10,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 07:20:12,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:20:12,089 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 07:20:13,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 07:20:13,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=796440.0, ans=0.025 2023-10-02 07:20:15,346 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 07:20:16,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:20:20,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. 
Duration: 0.82 2023-10-02 07:20:21,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=796506.6666666666, ans=0.125 2023-10-02 07:20:22,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 07:20:22,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:20:22,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 07:20:24,953 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 07:20:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 07:20:32,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:20:32,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:32,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:20:32,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 07:20:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 07:20:41,116 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.855e+02 2.030e+02 2.251e+02 3.622e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 07:20:41,266 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 07:20:41,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=796573.3333333334, ans=10.0 2023-10-02 07:20:47,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:48,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:20:50,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 07:20:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:20:50,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:20:52,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 07:20:55,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:20:56,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:20:58,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:21:01,135 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 07:21:01,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:01,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=796640.0, ans=0.2 2023-10-02 07:21:02,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:21:06,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:21:06,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:21:06,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 07:21:07,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:21:09,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:21:09,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:21:15,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 07:21:15,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:21:17,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:21:18,619 INFO [train.py:1046] (1/4) Epoch 23, batch 2650, loss[loss=0.1767, simple_loss=0.2626, pruned_loss=0.04542, over 23974.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2494, pruned_loss=0.04866, over 4703645.07 frames. ], batch size: 80, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:21:22,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 07:21:22,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:24,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:21:24,209 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 07:21:24,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:21:26,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:28,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:21:29,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:21:32,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:21:32,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 07:21:32,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:21:32,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:21:37,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 07:21:38,632 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 07:21:41,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:21:44,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 07:21:44,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:21:44,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 07:21:48,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:21:48,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:21:48,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:21:48,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:21:51,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=796906.6666666666, ans=0.125 2023-10-02 07:21:55,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 07:21:55,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 07:21:56,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:22:01,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 07:22:01,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:22:03,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:03,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:22:04,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:22:04,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:22:06,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 07:22:08,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.26 vs. limit=10.0 2023-10-02 07:22:08,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:22:10,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:22:10,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:22:12,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:22:13,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:14,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:22:15,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:17,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:22:17,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:22:20,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:22:22,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:22:22,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:22,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 07:22:28,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:22:28,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=797040.0, ans=0.1 2023-10-02 07:22:29,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:29,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:30,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:30,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:22:32,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:33,650 INFO [train.py:1046] (1/4) Epoch 23, batch 2700, loss[loss=0.1627, simple_loss=0.2492, pruned_loss=0.03814, over 24667.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2496, pruned_loss=0.04874, over 4721044.90 frames. 
], batch size: 73, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:22:35,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:22:35,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 07:22:37,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:22:39,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 07:22:41,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:22:41,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:41,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:42,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=797106.6666666666, ans=0.1 2023-10-02 07:22:44,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:22:44,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:44,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 07:22:44,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:22:44,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 07:22:44,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:22:45,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:22:46,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:22:46,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:50,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:22:51,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 07:22:51,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=797173.3333333334, ans=0.125 2023-10-02 07:22:52,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:22:59,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:22:59,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:23:03,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:23:04,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:23:04,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:23:05,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:23:08,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:09,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:23:09,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:23:09,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:23:11,501 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.838e+02 2.027e+02 2.219e+02 3.329e+02, threshold=4.053e+02, percent-clipped=0.0 2023-10-02 07:23:14,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:14,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 07:23:23,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:23:23,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:23:27,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:23:27,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:29,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=797306.6666666666, ans=0.125 2023-10-02 07:23:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:31,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.28 vs. limit=15.0 2023-10-02 07:23:32,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:32,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:23:33,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:23:35,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:23:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:23:39,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:23:39,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:39,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:43,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 07:23:43,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:44,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=797373.3333333334, ans=0.1 2023-10-02 07:23:46,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:23:46,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 07:23:48,024 INFO [train.py:1046] (1/4) Epoch 23, batch 2750, loss[loss=0.1838, simple_loss=0.2617, pruned_loss=0.05294, over 23295.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2493, pruned_loss=0.04819, over 4734673.27 frames. ], batch size: 105, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:23:49,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. 
Duration: 0.94 2023-10-02 07:23:49,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:50,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.90 vs. limit=15.0 2023-10-02 07:23:52,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:23:53,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:53,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=797440.0, ans=0.05 2023-10-02 07:23:53,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=797440.0, ans=0.0 2023-10-02 07:23:55,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:23:56,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:23:56,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:00,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:00,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 07:24:00,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:24:00,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:00,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 07:24:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:24:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:24:06,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 07:24:09,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:24:09,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:09,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:24:09,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:24:10,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:24:12,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 07:24:12,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:14,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:16,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:24:16,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:24:18,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:24:19,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:24:25,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.76 vs. limit=15.0 2023-10-02 07:24:27,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:30,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:24:30,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:24:34,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:34,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:24:36,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:24:40,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:24:40,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:24:40,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 07:24:45,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:45,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=797640.0, ans=0.2 2023-10-02 07:24:47,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 07:24:51,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:24:54,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:24:55,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. 
Duration: 0.52 2023-10-02 07:24:55,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:24:56,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.44 vs. limit=6.0 2023-10-02 07:24:56,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:24:58,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 07:24:58,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:25:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 07:25:02,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:03,238 INFO [train.py:1046] (1/4) Epoch 23, batch 2800, loss[loss=0.1776, simple_loss=0.2466, pruned_loss=0.05436, over 23370.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2467, pruned_loss=0.04763, over 4716737.47 frames. ], batch size: 119, lr: 4.46e-03, grad_scale: 32.0 2023-10-02 07:25:03,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:03,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 07:25:03,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:25:04,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:04,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:06,105 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 07:25:06,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 07:25:08,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:10,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:25:10,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:25:14,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:25:16,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 07:25:17,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 07:25:17,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 07:25:20,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:25:20,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:25:20,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:25:25,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:25:25,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:25,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:25:25,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:25:25,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=797840.0, ans=0.05 2023-10-02 07:25:36,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:25:36,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=797906.6666666666, ans=0.1 2023-10-02 07:25:37,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:25:37,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=797906.6666666666, ans=0.04949747468305833 2023-10-02 07:25:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:40,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:25:41,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:25:43,045 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.426e+02 1.852e+02 2.120e+02 2.350e+02 3.658e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-02 07:25:47,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:25:47,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 07:25:49,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:50,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:25:50,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 07:25:53,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=797973.3333333334, ans=0.125 2023-10-02 07:25:54,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:56,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:58,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:25:59,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:25:59,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:59,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:26:01,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:26:01,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:26:03,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:26:03,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 07:26:03,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 07:26:05,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:26:06,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:07,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 07:26:07,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:07,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:26:08,193 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:26:09,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:26:09,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 07:26:14,212 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.55 vs. limit=15.0 2023-10-02 07:26:15,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:26:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:26:16,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 07:26:18,364 INFO [train.py:1046] (1/4) Epoch 23, batch 2850, loss[loss=0.1478, simple_loss=0.2261, pruned_loss=0.03478, over 24371.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2467, pruned_loss=0.04726, over 4721777.42 frames. ], batch size: 56, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:26:18,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:26:21,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=798106.6666666666, ans=0.0 2023-10-02 07:26:22,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:26:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:26:22,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:26:25,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:25,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:26:28,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:26:28,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. 
Duration: 0.5 2023-10-02 07:26:33,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=798173.3333333334, ans=0.125 2023-10-02 07:26:35,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 07:26:35,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:26:37,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 07:26:37,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:39,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=798173.3333333334, ans=0.125 2023-10-02 07:26:40,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 07:26:40,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 07:26:43,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:52,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=798240.0, ans=0.1 2023-10-02 07:26:55,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:56,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 07:26:56,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:26:57,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:26:57,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:26:57,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:26:59,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:27:00,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 07:27:02,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:27:04,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:27:04,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:27:06,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:07,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:08,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:27:08,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:11,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:27:11,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:27:13,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:13,596 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.58 vs. limit=15.0 2023-10-02 07:27:14,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:17,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:27:20,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:27:22,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 07:27:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 07:27:23,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:27:24,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:27:24,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 07:27:25,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:27:26,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:27:26,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:27:26,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:27:26,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 07:27:27,679 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 07:27:27,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:27:27,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:32,441 INFO [train.py:1046] (1/4) Epoch 23, batch 2900, loss[loss=0.1838, simple_loss=0.2548, pruned_loss=0.0564, over 23756.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2467, pruned_loss=0.04711, over 4726495.74 frames. ], batch size: 195, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:27:34,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 07:27:34,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:27:34,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:27:36,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 07:27:38,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:40,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 07:27:40,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 07:27:42,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:27:42,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:27:44,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:47,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:27:50,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=798506.6666666666, ans=0.0 2023-10-02 07:27:51,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 07:27:51,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:54,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:27:54,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 07:27:54,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:27:55,532 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.54 vs. limit=15.0 2023-10-02 07:27:56,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:58,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 07:27:58,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 07:28:03,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:28:03,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 07:28:03,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:28:06,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:28:06,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:28:08,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:28:09,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=798573.3333333334, ans=0.1 2023-10-02 07:28:10,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:28:13,037 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.877e+02 2.119e+02 2.424e+02 3.198e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 07:28:13,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:28:14,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:17,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 07:28:17,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 07:28:17,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 07:28:19,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=798640.0, ans=0.125 2023-10-02 07:28:19,685 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.39 vs. limit=15.0 2023-10-02 07:28:20,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:28:23,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 07:28:24,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:28:28,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:28:29,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=798640.0, ans=0.125 2023-10-02 07:28:32,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=798706.6666666666, ans=0.125 2023-10-02 07:28:37,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:28:37,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:28:39,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-10-02 07:28:40,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=798706.6666666666, ans=0.125 2023-10-02 07:28:43,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:43,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 07:28:43,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:28:44,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:28:47,406 INFO [train.py:1046] (1/4) Epoch 23, batch 2950, loss[loss=0.174, simple_loss=0.2442, pruned_loss=0.05191, over 23346.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.248, pruned_loss=0.04731, over 4730150.98 frames. ], batch size: 119, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:28:50,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:28:52,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 07:28:52,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:28:52,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:28:53,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=798773.3333333334, ans=0.125 2023-10-02 07:28:56,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:28:57,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:28:57,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 07:28:59,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 07:28:59,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:28:59,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:29:05,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:29:06,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:29:07,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:29:08,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=798840.0, ans=0.125 2023-10-02 07:29:09,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:29:12,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:29:12,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:29:14,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:29:15,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:29:15,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:29:18,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 07:29:21,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=798906.6666666666, ans=0.125 2023-10-02 07:29:22,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 07:29:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 07:29:23,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 07:29:25,685 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 07:29:27,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 07:29:28,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:29:28,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:29:28,503 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 07:29:28,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:29:30,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=798973.3333333334, ans=0.1 2023-10-02 07:29:31,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 07:29:31,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:29:32,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:29:34,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.52 vs. limit=15.0 2023-10-02 07:29:35,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:29:35,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:29:35,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:36,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.26 vs. limit=22.5 2023-10-02 07:29:37,489 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 07:29:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:29:37,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 07:29:42,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=798973.3333333334, ans=0.2 2023-10-02 07:29:44,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:45,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:29:45,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=799040.0, ans=0.0 2023-10-02 07:29:46,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 07:29:46,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:29:48,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 07:29:50,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:29:52,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:29:52,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:29:53,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:53,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:29:54,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:29:55,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:55,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:29:56,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:29:56,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:29:57,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 07:29:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:58,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 07:30:00,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:30:01,205 INFO [train.py:1046] (1/4) Epoch 23, batch 3000, loss[loss=0.189, simple_loss=0.2636, pruned_loss=0.05723, over 23613.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2493, pruned_loss=0.04791, over 4730906.46 frames. ], batch size: 256, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:30:01,205 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 07:30:13,764 INFO [train.py:1078] (1/4) Epoch 23, validation: loss=0.3132, simple_loss=0.2719, pruned_loss=0.1772, over 1125622.00 frames. 2023-10-02 07:30:13,764 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20387MB 2023-10-02 07:30:15,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:30:16,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:30:19,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 07:30:19,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 07:30:22,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 07:30:22,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:30:23,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 07:30:23,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:30:25,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=799106.6666666666, ans=0.125 2023-10-02 07:30:28,821 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-10-02 07:30:30,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:30:32,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=799173.3333333334, ans=0.035 2023-10-02 07:30:40,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:30:45,988 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-10-02 07:30:46,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 07:30:49,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 07:30:52,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:30:52,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:30:52,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:30:54,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.810e+02 1.967e+02 2.227e+02 3.390e+02, threshold=3.934e+02, percent-clipped=0.0 2023-10-02 07:30:54,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:30:55,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 07:30:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 07:30:57,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:30:57,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:30:59,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:31:00,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 07:31:00,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:00,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:31:04,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:31:04,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:31:04,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:31:05,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:31:08,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 07:31:10,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:31:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:11,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:31:15,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=799373.3333333334, ans=0.0 2023-10-02 07:31:18,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:31:18,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:19,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 07:31:21,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 07:31:21,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:31:21,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 07:31:21,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=799373.3333333334, ans=0.125 2023-10-02 07:31:22,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:31:22,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 07:31:25,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:31:27,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:31:28,389 INFO [train.py:1046] (1/4) Epoch 23, batch 3050, loss[loss=0.1737, simple_loss=0.2447, pruned_loss=0.05139, over 23591.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2503, pruned_loss=0.04855, over 4721168.71 frames. 
], batch size: 135, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:31:28,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 07:31:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 07:31:28,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:31:29,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:31:31,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:31:31,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:32,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:31:34,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 07:31:35,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:31:37,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:31:37,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 07:31:37,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=799440.0, ans=0.2 2023-10-02 07:31:40,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:41,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=799440.0, ans=0.04949747468305833 2023-10-02 07:31:42,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 07:31:46,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=799506.6666666666, ans=0.0 2023-10-02 07:31:49,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 07:31:49,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 07:31:51,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:31:53,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:31:57,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:57,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:31:58,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:32:01,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:32:01,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:32:02,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:02,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:32:02,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:02,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:32:05,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:08,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:08,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 07:32:09,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:32:09,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:32:11,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 07:32:12,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:32:14,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:32:14,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:20,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:26,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:26,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:32:26,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:29,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:32:29,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:32:31,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:32:32,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. 
Duration: 0.67 2023-10-02 07:32:33,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:32:33,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:33,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 07:32:36,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:42,357 INFO [train.py:1046] (1/4) Epoch 23, batch 3100, loss[loss=0.1639, simple_loss=0.2304, pruned_loss=0.04869, over 23763.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2505, pruned_loss=0.04823, over 4727035.13 frames. ], batch size: 232, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:32:42,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:32:46,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:32:47,119 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-10-02 07:32:47,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 07:32:50,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-02 07:32:51,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 07:32:53,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:32:57,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:32:57,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:58,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 07:33:01,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=799840.0, ans=0.125 2023-10-02 07:33:02,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:07,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 07:33:12,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:33:14,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:14,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:33:14,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:33:15,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 07:33:18,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:33:19,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 07:33:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:33:19,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:20,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 07:33:22,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:33:23,451 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.736e+02 1.938e+02 2.240e+02 3.003e+02, threshold=3.876e+02, percent-clipped=0.0 2023-10-02 07:33:23,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:33:25,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 07:33:26,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 07:33:27,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:33:28,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:30,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:33:30,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:30,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:33:31,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:33:31,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:33:37,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:33:37,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:33:37,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:37,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 07:33:42,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:33:44,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. 
Duration: 0.78 2023-10-02 07:33:45,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:33:46,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 07:33:46,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:33:48,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:48,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 07:34:00,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 07:34:01,945 INFO [train.py:1046] (1/4) Epoch 23, batch 3150, loss[loss=0.1774, simple_loss=0.2517, pruned_loss=0.0516, over 23532.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2487, pruned_loss=0.04791, over 4714081.37 frames. ], batch size: 106, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:34:03,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:03,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:04,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:34:04,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 07:34:06,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 07:34:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:07,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:34:09,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 07:34:11,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:13,299 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 07:34:14,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 07:34:16,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:34:16,190 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 07:34:17,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 07:34:17,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 07:34:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. 
Duration: 0.71 2023-10-02 07:34:19,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 07:34:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:21,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:34:21,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:24,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 07:34:25,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:26,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:27,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:34:28,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:34:32,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 07:34:33,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:34:36,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 07:34:37,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:34:39,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 07:34:41,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 07:34:42,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:34:43,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 07:34:43,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:34:43,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:43,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:34:43,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=800240.0, ans=0.0 2023-10-02 07:34:44,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:34:44,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:34:46,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-10-02 07:34:46,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:34:46,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:34:47,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:34:47,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:34:49,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 07:34:49,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:34:51,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 07:34:51,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:34:53,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 07:34:53,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 07:34:54,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:34:55,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:34:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 07:34:58,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 07:34:58,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:58,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=800306.6666666666, ans=0.125 2023-10-02 07:35:02,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:35:02,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:02,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:35:07,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:35:09,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:10,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 07:35:16,082 INFO [train.py:1046] (1/4) Epoch 23, batch 3200, loss[loss=0.1792, simple_loss=0.2434, pruned_loss=0.0575, over 23579.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2474, pruned_loss=0.04767, over 4707558.16 frames. 
], batch size: 256, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:35:16,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:35:16,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:35:20,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:22,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:35:22,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 07:35:25,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:35:29,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:35:32,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:38,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:35:47,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=800573.3333333334, ans=0.125 2023-10-02 07:35:49,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. 
Duration: 0.88 2023-10-02 07:35:50,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:35:54,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 07:35:55,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:35:56,708 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.871e+02 2.114e+02 2.432e+02 3.533e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-02 07:35:58,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:35:58,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:35:58,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:36:04,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 07:36:05,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 07:36:06,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 07:36:10,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 07:36:11,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 07:36:18,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:18,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:36:18,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:18,740 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 07:36:18,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:36:20,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=800706.6666666666, ans=0.025 2023-10-02 07:36:24,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:36:24,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=800706.6666666666, ans=0.0 2023-10-02 07:36:25,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 07:36:25,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=800706.6666666666, ans=0.1 2023-10-02 07:36:26,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 07:36:26,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. 
Duration: 0.55 2023-10-02 07:36:28,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 07:36:29,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:36:30,766 INFO [train.py:1046] (1/4) Epoch 23, batch 3250, loss[loss=0.1671, simple_loss=0.2512, pruned_loss=0.04148, over 24665.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2474, pruned_loss=0.04743, over 4721324.49 frames. ], batch size: 68, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:36:32,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:36:32,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 07:36:32,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:36:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:35,350 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 07:36:35,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=800773.3333333334, ans=0.0 2023-10-02 07:36:36,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:36:40,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:36:47,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:36:47,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 07:36:48,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:36:48,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:48,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:36:50,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:36:52,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:36:55,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:55,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 07:36:55,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=800840.0, ans=0.02 2023-10-02 07:36:55,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=800840.0, ans=0.0 2023-10-02 07:36:55,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-10-02 07:36:56,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:36:56,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:56,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:56,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:36:59,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:36:59,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:37:02,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:37:02,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:37:03,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:37:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:37:05,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:37:05,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=800906.6666666666, ans=0.05 2023-10-02 07:37:10,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 07:37:10,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:37:10,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:37:10,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=800906.6666666666, ans=0.0 2023-10-02 07:37:10,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=800906.6666666666, ans=0.0 2023-10-02 07:37:11,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:11,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 07:37:17,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:37:18,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=800973.3333333334, ans=0.125 2023-10-02 07:37:25,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:37:25,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:25,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 07:37:25,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:37:25,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:37:26,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:27,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.94 vs. limit=15.0 2023-10-02 07:37:29,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 07:37:29,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 07:37:30,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 07:37:30,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:30,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:37:32,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 07:37:33,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:37:37,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:37:37,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:37:39,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 07:37:39,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:37:42,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:37:42,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 07:37:44,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=801106.6666666666, ans=0.125 2023-10-02 07:37:45,279 INFO [train.py:1046] (1/4) Epoch 23, batch 3300, loss[loss=0.1835, simple_loss=0.267, pruned_loss=0.04998, over 24151.00 frames. 
], tot_loss[loss=0.1722, simple_loss=0.2484, pruned_loss=0.04796, over 4714036.22 frames. ], batch size: 80, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:37:45,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:37:45,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 07:37:48,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 07:37:48,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 07:37:49,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:52,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:37:53,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:37:53,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:55,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:37:55,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:37:58,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:38:01,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:38:02,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 07:38:04,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:38:04,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:06,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=801173.3333333334, ans=0.07 2023-10-02 07:38:07,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:07,344 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 07:38:07,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.74 vs. limit=15.0 2023-10-02 07:38:08,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:38:08,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:38:10,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:38:10,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:38:10,269 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 07:38:14,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:38:14,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:38:15,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:17,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 07:38:19,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 07:38:19,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:20,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:38:20,672 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 07:38:22,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 07:38:23,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 07:38:24,715 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.780e+02 1.961e+02 2.106e+02 2.870e+02, threshold=3.921e+02, percent-clipped=0.0 2023-10-02 07:38:26,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 07:38:28,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:38:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:38:31,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:38:33,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:38:33,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:33,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:38:35,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:38:37,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:38:37,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:38,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 07:38:41,377 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 07:38:41,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 07:38:41,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=801306.6666666666, ans=0.0 2023-10-02 07:38:44,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:38:44,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:38:44,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:38:45,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:45,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:38:47,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:38:47,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:48,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:38:48,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:38:51,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:38:51,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=801373.3333333334, ans=0.1 2023-10-02 07:38:54,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 07:38:56,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:38:56,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:59,503 INFO [train.py:1046] (1/4) Epoch 23, batch 3350, loss[loss=0.1529, simple_loss=0.2288, pruned_loss=0.03851, over 24420.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2492, pruned_loss=0.04805, over 4711588.63 frames. ], batch size: 58, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:38:59,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:38:59,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:39:00,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:03,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:39:03,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:39:05,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:39:05,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:06,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:39:09,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:11,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:39:11,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-10-02 07:39:12,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:12,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:39:13,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 07:39:16,811 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 07:39:16,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:19,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-02 07:39:19,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 07:39:20,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:39:20,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:39:22,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:22,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 07:39:24,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:24,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:39:27,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:29,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:29,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:29,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:39:32,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 07:39:33,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:34,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:34,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=801573.3333333334, ans=0.2 2023-10-02 07:39:37,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:39:39,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:42,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:42,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:44,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:47,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 07:39:47,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:39:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 07:39:49,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 07:39:49,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 07:39:50,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:52,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:55,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=801640.0, ans=0.1 2023-10-02 07:39:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:59,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 07:40:01,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:40:01,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=801706.6666666666, ans=0.125 2023-10-02 07:40:03,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:40:03,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:40:08,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:40:09,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. 
Duration: 0.88 2023-10-02 07:40:09,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=801706.6666666666, ans=0.125 2023-10-02 07:40:11,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:40:11,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:40:12,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:40:12,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 07:40:12,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:40:12,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 07:40:13,875 INFO [train.py:1046] (1/4) Epoch 23, batch 3400, loss[loss=0.1718, simple_loss=0.2392, pruned_loss=0.05223, over 22687.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2502, pruned_loss=0.04827, over 4706739.89 frames. ], batch size: 322, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:40:14,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=801773.3333333334, ans=0.0 2023-10-02 07:40:15,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:40:15,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:40:15,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:40:17,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:40:17,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 07:40:20,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 07:40:20,916 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 07:40:20,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:40:25,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:40:26,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:40:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:40:27,350 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-10-02 07:40:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 07:40:30,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=801840.0, ans=0.1 2023-10-02 07:40:33,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=801840.0, ans=0.5 2023-10-02 07:40:34,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:40:34,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 07:40:36,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=801840.0, ans=0.125 2023-10-02 07:40:38,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:40:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:40:42,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:40:43,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:40:49,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:40:54,708 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.893e+02 2.076e+02 2.373e+02 3.275e+02, threshold=4.152e+02, percent-clipped=0.0 2023-10-02 07:40:54,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-10-02 07:40:54,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=801906.6666666666, ans=0.125 2023-10-02 07:40:55,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=801906.6666666666, ans=0.0 2023-10-02 07:41:00,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:41:01,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:41:02,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 07:41:02,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:41:02,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:04,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:41:04,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:41:05,838 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:41:07,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:41:08,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=801973.3333333334, ans=0.0 2023-10-02 07:41:11,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:41:11,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:41:11,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=802040.0, ans=0.09899494936611666 2023-10-02 07:41:17,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:41:18,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 07:41:23,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:41:25,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.29 vs. limit=15.0 2023-10-02 07:41:26,580 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.00 vs. limit=10.0 2023-10-02 07:41:27,213 INFO [train.py:1046] (1/4) Epoch 23, batch 3450, loss[loss=0.1824, simple_loss=0.2508, pruned_loss=0.05705, over 23782.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2507, pruned_loss=0.04863, over 4713138.19 frames. ], batch size: 164, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:41:27,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. 
Duration: 0.68 2023-10-02 07:41:30,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 07:41:30,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:41:32,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:41:34,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 07:41:34,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:41:38,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:41:42,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:41:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:41:45,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:41:45,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:47,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:52,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. 
Duration: 0.7 2023-10-02 07:41:58,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 07:41:59,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:41:59,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:42:00,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:07,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 07:42:07,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:42:11,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:42:12,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:42:12,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:42:14,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:42:17,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 07:42:17,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:42:17,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:42:20,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:42:23,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 07:42:25,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:42:30,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:42:32,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:35,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:40,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:40,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:42:40,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:42:41,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:42:42,235 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.09 vs. limit=15.0 2023-10-02 07:42:43,278 INFO [train.py:1046] (1/4) Epoch 23, batch 3500, loss[loss=0.1553, simple_loss=0.2209, pruned_loss=0.04484, over 23706.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2487, pruned_loss=0.04811, over 4697872.06 frames. ], batch size: 232, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:42:44,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:47,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:42:48,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 07:42:48,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:42:53,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 07:42:57,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:57,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 07:43:01,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:43:01,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 07:43:03,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:43:03,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:03,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:43:04,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:04,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:43:04,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=802506.6666666666, ans=0.125 2023-10-02 07:43:05,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 07:43:06,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:06,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:43:09,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 07:43:09,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=802506.6666666666, ans=0.125 2023-10-02 07:43:10,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=802573.3333333334, ans=0.1 2023-10-02 07:43:13,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:13,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 07:43:14,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:43:14,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=802573.3333333334, ans=0.1 2023-10-02 07:43:16,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:43:18,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:43:19,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:19,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:43:19,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:43:21,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 07:43:22,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 07:43:22,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 07:43:23,848 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.940e+02 2.168e+02 2.529e+02 4.138e+02, threshold=4.336e+02, percent-clipped=0.0 2023-10-02 07:43:23,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:43:24,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:25,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:25,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:43:28,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:43:28,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:43:33,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:43:34,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. 
Duration: 0.66 2023-10-02 07:43:34,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 07:43:34,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:43:37,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:43:39,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:43:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:42,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=802706.6666666666, ans=0.1 2023-10-02 07:43:43,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 07:43:45,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:43:46,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:47,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 07:43:50,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 07:43:51,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:43:53,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:43:53,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:43:53,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:43:54,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.89 vs. limit=10.0 2023-10-02 07:43:56,094 INFO [train.py:1046] (1/4) Epoch 23, batch 3550, loss[loss=0.1503, simple_loss=0.2324, pruned_loss=0.03408, over 24674.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2467, pruned_loss=0.04777, over 4685451.29 frames. ], batch size: 65, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:43:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:44:05,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:07,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 07:44:10,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:44:11,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 07:44:11,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=802840.0, ans=0.0 2023-10-02 07:44:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:44:14,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:44:16,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:44:17,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:44:17,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:18,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:44:20,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:44:21,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=802840.0, ans=0.025 2023-10-02 07:44:25,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:44:25,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 07:44:26,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:44:26,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:28,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:44:28,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 07:44:28,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:29,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=802906.6666666666, ans=0.1 2023-10-02 07:44:30,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:31,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:44:35,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:44:36,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:44:37,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:44:40,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. 
Duration: 0.94 2023-10-02 07:44:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:44:42,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 07:44:42,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:44:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:44:44,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:44:47,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 07:44:47,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=802973.3333333334, ans=0.125 2023-10-02 07:44:49,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:44:55,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:44:56,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 07:44:57,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:01,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:45:02,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 07:45:05,492 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.95 vs. limit=15.0 2023-10-02 07:45:07,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 07:45:07,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:45:09,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:45:10,511 INFO [train.py:1046] (1/4) Epoch 23, batch 3600, loss[loss=0.1578, simple_loss=0.2369, pruned_loss=0.03935, over 23623.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2469, pruned_loss=0.04775, over 4688717.83 frames. ], batch size: 149, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:45:10,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:10,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:12,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:45:14,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:45:16,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:45:18,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:45:18,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:45:19,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:19,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 07:45:23,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:45:24,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:26,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:45:29,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:45:31,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:45:31,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:45:31,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 07:45:33,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 07:45:36,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:36,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:45:37,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:45:39,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:45:40,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:45:42,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 07:45:50,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:45:52,130 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.380e+02 1.763e+02 1.994e+02 2.308e+02 3.230e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 07:45:52,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:45:53,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 07:45:54,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.48 vs. 
limit=22.5 2023-10-02 07:45:57,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:46:02,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:09,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.37 vs. limit=15.0 2023-10-02 07:46:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:46:11,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:46:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 07:46:12,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 07:46:13,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 07:46:15,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:46:15,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:46:17,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. 
Duration: 0.84 2023-10-02 07:46:17,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:46:18,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:46:18,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:46:20,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 07:46:20,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 07:46:20,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=803373.3333333334, ans=0.0 2023-10-02 07:46:24,919 INFO [train.py:1046] (1/4) Epoch 23, batch 3650, loss[loss=0.1596, simple_loss=0.2376, pruned_loss=0.04077, over 23473.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2475, pruned_loss=0.04772, over 4689404.41 frames. ], batch size: 106, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:46:24,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:25,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 07:46:27,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 07:46:29,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 07:46:35,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 07:46:37,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 07:46:37,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=803440.0, ans=0.0 2023-10-02 07:46:40,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:46:40,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:46:42,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:46:43,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:46:43,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:46:43,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 07:46:44,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:46:46,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:46:46,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-10-02 07:46:46,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:46:47,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:46:47,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:46:50,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:46:50,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=803506.6666666666, ans=0.125 2023-10-02 07:46:51,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 07:46:53,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 07:46:53,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:46:56,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 07:46:56,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:46:56,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 07:47:00,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=803573.3333333334, ans=0.05 2023-10-02 07:47:02,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:47:06,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:47:06,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:47:07,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:47:07,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:47:08,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:47:11,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.07 vs. limit=6.0 2023-10-02 07:47:12,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:47:14,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:14,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:47:14,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:47:15,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=803640.0, ans=0.2 2023-10-02 07:47:16,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:47:17,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:47:24,807 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 07:47:29,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:47:29,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:47:30,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:47:31,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:33,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:47:35,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-10-02 07:47:36,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:36,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=803706.6666666666, ans=0.1 2023-10-02 07:47:39,225 INFO [train.py:1046] (1/4) Epoch 23, batch 3700, loss[loss=0.1788, simple_loss=0.2454, pruned_loss=0.05608, over 23772.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2481, pruned_loss=0.0479, over 4703105.67 frames. ], batch size: 164, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:47:39,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:47:42,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:47:42,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:47:45,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:45,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 07:47:45,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:46,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 07:47:46,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 07:47:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:47:52,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:47:52,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:47:54,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:47:54,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=803840.0, ans=0.1 2023-10-02 07:47:55,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:56,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:47:59,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:48:00,434 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 07:48:03,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=803840.0, ans=0.125 2023-10-02 07:48:08,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:48:08,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-02 07:48:10,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:48:10,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 07:48:12,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:48:15,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:16,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 07:48:16,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:18,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:48:21,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:21,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:48:22,660 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.421e+02 1.851e+02 2.016e+02 2.342e+02 4.002e+02, threshold=4.032e+02, percent-clipped=1.0 2023-10-02 07:48:24,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:48:28,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 07:48:28,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 07:48:29,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:48:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 07:48:36,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:48:36,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:48:39,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:48:39,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 07:48:42,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:48:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:48:42,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:48:42,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:48:42,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-10-02 07:48:45,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:48:45,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 07:48:45,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=804040.0, ans=0.2 2023-10-02 07:48:47,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 07:48:48,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:48:48,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:48:49,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:48:51,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:48:52,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:54,812 INFO [train.py:1046] (1/4) Epoch 23, batch 3750, loss[loss=0.1789, simple_loss=0.2597, pruned_loss=0.04903, over 23491.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.249, pruned_loss=0.04799, over 4712815.88 frames. 
], batch size: 93, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:48:54,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:48:56,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:48:57,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 07:48:59,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 07:49:02,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:49:02,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 07:49:03,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:49:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:49:06,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:49:07,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:49:11,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 07:49:14,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:49:15,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:49:18,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:49:21,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:49:23,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 07:49:23,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:49:24,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:49:24,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:49:28,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=15.0 2023-10-02 07:49:29,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 07:49:32,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. 
Duration: 0.82 2023-10-02 07:49:32,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=804240.0, ans=0.125 2023-10-02 07:49:33,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:49:33,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:49:36,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:49:39,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:49:41,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:49:44,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 07:49:48,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.44 vs. limit=15.0 2023-10-02 07:49:48,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:49:51,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:49:52,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:49:55,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:49:59,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:50:00,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:50:02,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:50:03,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:50:07,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:50:08,918 INFO [train.py:1046] (1/4) Epoch 23, batch 3800, loss[loss=0.1504, simple_loss=0.2091, pruned_loss=0.04588, over 22790.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2492, pruned_loss=0.04797, over 4715851.28 frames. ], batch size: 322, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:50:09,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=804440.0, ans=0.07 2023-10-02 07:50:14,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:50:17,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:18,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 07:50:20,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 07:50:21,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:50:24,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:50:24,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:50:27,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 07:50:27,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:28,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:50:29,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:50:29,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:50:30,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:32,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 07:50:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 07:50:36,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:50:38,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:50:41,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:50:41,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:50:42,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:50:42,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:44,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:45,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:46,848 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.81 vs. limit=22.5 2023-10-02 07:50:49,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:50:49,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 07:50:52,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:50:53,401 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.883e+02 2.089e+02 2.519e+02 3.424e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 07:50:59,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:50:59,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=804640.0, ans=0.125 2023-10-02 07:51:04,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:51:05,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 07:51:06,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=804640.0, ans=0.0 2023-10-02 07:51:08,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 07:51:08,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:10,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:51:10,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:10,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.82 vs. limit=12.0 2023-10-02 07:51:11,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-10-02 07:51:16,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 07:51:16,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 07:51:16,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:18,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:51:23,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:51:23,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:51:23,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=804773.3333333334, ans=0.0 2023-10-02 07:51:25,057 INFO [train.py:1046] (1/4) Epoch 23, batch 3850, loss[loss=0.1574, simple_loss=0.2327, pruned_loss=0.04109, over 24301.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2482, pruned_loss=0.0474, over 4731921.49 frames. ], batch size: 56, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:51:28,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:51:29,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 07:51:31,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 07:51:31,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:35,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:51:37,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:40,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:51:41,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 07:51:48,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:51:49,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:51,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:51:51,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:51:55,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:51:57,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:51:57,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:51:57,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:51:59,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:00,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=804906.6666666666, ans=0.1 2023-10-02 07:52:01,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:01,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:03,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:52:03,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 07:52:05,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 07:52:05,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:52:06,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:09,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:10,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:52:10,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 07:52:13,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 07:52:14,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:16,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 07:52:18,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:52:23,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:24,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:28,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:28,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 07:52:30,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 07:52:30,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=805040.0, ans=0.125 2023-10-02 07:52:33,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:52:33,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:38,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:52:38,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:52:38,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:39,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:39,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:52:39,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 07:52:39,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:52:41,003 INFO [train.py:1046] (1/4) Epoch 23, batch 3900, loss[loss=0.1643, simple_loss=0.2476, pruned_loss=0.0405, over 24661.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2473, pruned_loss=0.04687, over 4743711.53 frames. ], batch size: 65, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:52:41,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 07:52:41,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:52:41,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:52:42,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:43,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:52:45,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:45,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:45,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:52:45,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 07:52:46,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:48,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=805106.6666666666, ans=0.1 2023-10-02 07:52:53,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:52:53,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 07:52:53,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:52:56,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:52:59,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:52:59,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:53:00,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:53:01,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 07:53:01,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:01,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 07:53:03,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:53:03,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 07:53:04,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 07:53:09,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 07:53:11,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:53:11,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:53:11,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:11,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=805240.0, ans=0.125 2023-10-02 07:53:14,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:53:16,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:53:18,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=805240.0, ans=0.035 2023-10-02 07:53:20,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:53:20,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:53:21,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:53:24,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.850e+02 2.030e+02 2.348e+02 4.332e+02, threshold=4.060e+02, percent-clipped=1.0 2023-10-02 07:53:27,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 07:53:28,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:53:34,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:53:35,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=805306.6666666666, ans=0.0 2023-10-02 07:53:36,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:53:45,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:53:48,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:48,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 07:53:49,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 07:53:49,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:51,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 07:53:51,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:53:52,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. 
Duration: 0.99 2023-10-02 07:53:53,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=805373.3333333334, ans=0.125 2023-10-02 07:53:56,305 INFO [train.py:1046] (1/4) Epoch 23, batch 3950, loss[loss=0.1856, simple_loss=0.273, pruned_loss=0.04906, over 24465.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2471, pruned_loss=0.04734, over 4722378.94 frames. ], batch size: 69, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:53:56,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=805440.0, ans=0.125 2023-10-02 07:53:57,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:59,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 07:53:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:54:02,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:54:04,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:54:09,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=805506.6666666666, ans=0.0 2023-10-02 07:54:10,849 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 07:54:10,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 07:54:12,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 07:54:12,267 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 07:54:13,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:54:16,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:54:16,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:54:16,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:54:19,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 07:54:22,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:54:23,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:54:23,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:54:25,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:54:26,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 07:54:31,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=805573.3333333334, ans=0.2 2023-10-02 07:54:37,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:54:37,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:54:40,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 07:54:42,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.11 vs. limit=22.5 2023-10-02 07:54:46,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 07:54:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 07:54:46,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:54:48,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:54:53,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=805640.0, ans=0.1 2023-10-02 07:54:54,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:54:54,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 07:54:55,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:54:55,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:54:55,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 07:54:59,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:55:01,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:55:05,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 07:55:11,778 INFO [train.py:1046] (1/4) Epoch 23, batch 4000, loss[loss=0.1752, simple_loss=0.2525, pruned_loss=0.04892, over 23698.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2481, pruned_loss=0.04804, over 4717857.25 frames. ], batch size: 149, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:55:13,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:16,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=805773.3333333334, ans=0.0 2023-10-02 07:55:19,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 07:55:19,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=805773.3333333334, ans=10.0 2023-10-02 07:55:24,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:55:24,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:55:26,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:26,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 07:55:28,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:55:29,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 07:55:29,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:55:29,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 07:55:30,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:55:33,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:55:33,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 07:55:33,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:55:33,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:55:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:55:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:55:36,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 07:55:36,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:55:36,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:55:40,146 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 07:55:41,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:55:42,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:55:42,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=805906.6666666666, ans=0.125 2023-10-02 07:55:46,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-10-02 07:55:47,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:55:50,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:55:50,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 07:55:51,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=805906.6666666666, ans=0.2 2023-10-02 07:55:52,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:55:52,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 07:55:52,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:55:53,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:55:55,532 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.792e+02 1.999e+02 2.170e+02 4.369e+02, threshold=3.999e+02, percent-clipped=1.0 2023-10-02 07:55:55,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:55:57,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:55:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 07:55:57,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:55:57,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=805973.3333333334, ans=0.025 2023-10-02 07:55:58,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=805973.3333333334, ans=0.1 2023-10-02 07:55:58,910 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:55:59,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 07:55:59,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:56:01,342 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 07:56:04,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.54 vs. limit=15.0 2023-10-02 07:56:06,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:56:08,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=805973.3333333334, ans=0.125 2023-10-02 07:56:09,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-10-02 07:56:10,405 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.73 vs. limit=6.0 2023-10-02 07:56:11,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:56:13,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:56:14,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:56:14,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:56:19,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:56:19,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=806040.0, ans=0.1 2023-10-02 07:56:19,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=806040.0, ans=0.0 2023-10-02 07:56:20,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 07:56:20,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 07:56:23,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 07:56:23,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:56:25,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:56:26,964 INFO [train.py:1046] (1/4) Epoch 23, batch 4050, loss[loss=0.146, simple_loss=0.2194, pruned_loss=0.0363, over 24364.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.249, pruned_loss=0.0482, over 4709231.25 frames. ], batch size: 56, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:56:27,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:56:27,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=806106.6666666666, ans=0.1 2023-10-02 07:56:28,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:56:31,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:56:32,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:56:33,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.73 vs. limit=22.5 2023-10-02 07:56:34,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:56:36,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 07:56:36,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:56:42,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:56:44,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:56:47,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 07:56:49,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 07:56:49,413 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 07:56:51,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=806173.3333333334, ans=0.1 2023-10-02 07:56:51,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=806173.3333333334, ans=0.04949747468305833 2023-10-02 07:56:52,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:56:57,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 07:56:58,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:57:01,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 07:57:04,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=806240.0, ans=0.1 2023-10-02 07:57:05,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:57:05,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:57:05,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:57:07,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:57:07,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=806240.0, ans=0.2 2023-10-02 07:57:08,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=806240.0, ans=0.95 2023-10-02 07:57:10,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 07:57:10,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:57:12,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:57:14,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. 
Duration: 0.83 2023-10-02 07:57:14,738 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:57:19,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:57:26,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 07:57:27,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:57:27,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:57:29,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 07:57:29,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 07:57:29,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:31,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:57:34,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:34,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 07:57:34,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=806373.3333333334, ans=0.125 2023-10-02 07:57:39,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 07:57:41,189 INFO [train.py:1046] (1/4) Epoch 23, batch 4100, loss[loss=0.1752, simple_loss=0.2565, pruned_loss=0.04698, over 24498.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2496, pruned_loss=0.04803, over 4713161.07 frames. ], batch size: 66, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:57:42,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 07:57:44,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 07:57:45,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 07:57:45,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:46,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:46,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:47,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:57:47,486 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. 
Duration: 0.989 2023-10-02 07:57:50,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.38 vs. limit=15.0 2023-10-02 07:57:52,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:57:52,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:57:52,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:52,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=806440.0, ans=0.125 2023-10-02 07:57:54,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:57:54,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=806440.0, ans=0.125 2023-10-02 07:57:58,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:58:00,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:58:00,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:58:01,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. 
Duration: 0.97 2023-10-02 07:58:01,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:58:03,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:58:03,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:58:03,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:58:03,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 07:58:05,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:07,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 07:58:08,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:58:10,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=806573.3333333334, ans=0.1 2023-10-02 07:58:11,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:58:11,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 07:58:12,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 07:58:13,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:58:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:58:16,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 07:58:17,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=806573.3333333334, ans=0.125 2023-10-02 07:58:18,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:58:18,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:58:20,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 07:58:20,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:58:22,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:58:25,105 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.962e+02 2.255e+02 2.740e+02 4.048e+02, threshold=4.511e+02, percent-clipped=1.0 2023-10-02 07:58:25,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:29,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 07:58:32,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:58:34,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:58:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:58:41,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:45,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:58:46,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:58:51,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:58:53,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:58:54,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:58:54,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:58:56,620 INFO [train.py:1046] (1/4) Epoch 23, batch 4150, loss[loss=0.1629, simple_loss=0.2511, pruned_loss=0.03729, over 24511.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2489, pruned_loss=0.04767, over 4717136.00 frames. 
], batch size: 63, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:58:57,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 07:58:59,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:58:59,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 07:58:59,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 07:59:00,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 07:59:02,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:59:08,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:59:08,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:59:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:59:12,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:59:13,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:59:16,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 07:59:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:59:16,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:59:21,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:59:21,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=806840.0, ans=0.125 2023-10-02 07:59:26,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:59:27,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 07:59:29,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 07:59:30,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:59:30,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 07:59:30,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:59:30,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:59:33,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 07:59:35,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:59:38,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 07:59:41,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:59:42,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:59:44,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 07:59:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:59:45,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 07:59:45,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=806973.3333333334, ans=0.0 2023-10-02 07:59:47,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:59:48,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:59:49,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:52,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. 
Duration: 0.74 2023-10-02 07:59:52,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:59:52,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:59:53,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:59:57,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 07:59:58,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:58,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:59:58,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:00:00,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 08:00:00,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:00:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 08:00:00,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:00:01,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:00:01,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 08:00:03,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:00:08,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:00:09,589 INFO [train.py:1046] (1/4) Epoch 23, batch 4200, loss[loss=0.1652, simple_loss=0.2117, pruned_loss=0.05934, over 19681.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2481, pruned_loss=0.04747, over 4700331.78 frames. ], batch size: 388, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 08:00:09,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 08:00:09,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=807106.6666666666, ans=0.0 2023-10-02 08:00:12,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:00:13,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:00:13,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:00:15,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:00:15,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:00:19,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 08:00:21,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 08:00:23,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:25,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:00:27,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:00:31,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 08:00:31,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:00:32,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:32,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 08:00:32,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:00:34,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:35,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:00:35,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:00:37,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:00:39,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 08:00:39,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:40,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=807240.0, ans=0.2 2023-10-02 08:00:43,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:00:44,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:00:47,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:00:48,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:00:51,054 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.783e+02 1.946e+02 2.154e+02 3.104e+02, threshold=3.892e+02, percent-clipped=0.0 2023-10-02 08:00:51,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:00:51,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-10-02 08:00:51,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:00:52,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:00:57,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:00:58,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:01:05,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:01:07,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 08:01:11,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:01:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:01:14,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:16,137 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.38 vs. limit=15.0 2023-10-02 08:01:16,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 08:01:22,252 INFO [train.py:1046] (1/4) Epoch 23, batch 4250, loss[loss=0.1555, simple_loss=0.2053, pruned_loss=0.05283, over 19042.00 frames. 
], tot_loss[loss=0.17, simple_loss=0.2465, pruned_loss=0.04674, over 4707567.69 frames. ], batch size: 388, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:01:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:01:25,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:01:25,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:01:28,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:33,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:01:33,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 08:01:33,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:01:36,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:40,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:01:45,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:46,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:01:47,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:01:47,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:01:48,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-10-02 08:01:49,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:50,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:50,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:52,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:01:52,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=807573.3333333334, ans=0.125 2023-10-02 08:01:54,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:01:54,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 08:01:59,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. 
Duration: 0.85 2023-10-02 08:01:59,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:59,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:00,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:02:00,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:02:00,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:02,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:02:05,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:02:07,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:02:13,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:02:14,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:14,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 08:02:14,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 08:02:15,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 08:02:17,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:02:18,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:02:18,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:19,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:02:20,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=807706.6666666666, ans=0.0 2023-10-02 08:02:21,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 08:02:22,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:02:24,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:02:27,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:29,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:02:30,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=807706.6666666666, ans=0.125 2023-10-02 08:02:31,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:02:31,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:02:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:02:34,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:02:35,935 INFO [train.py:1046] (1/4) Epoch 23, batch 4300, loss[loss=0.1652, simple_loss=0.2533, pruned_loss=0.03858, over 24561.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2459, pruned_loss=0.04641, over 4724567.67 frames. ], batch size: 71, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:02:36,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:02:36,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 08:02:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:42,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=807773.3333333334, ans=0.125 2023-10-02 08:02:44,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:02:44,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:02:47,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:54,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:54,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 08:02:55,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:02:56,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=807840.0, ans=0.2 2023-10-02 08:02:57,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:02:57,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=807840.0, ans=0.0 2023-10-02 08:02:58,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:02:58,842 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 08:03:01,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:03:02,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 08:03:06,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 08:03:06,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:03:06,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 08:03:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:03:09,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:03:12,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:03:12,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:03:12,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:03:15,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:03:15,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:03:15,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. 
Duration: 0.82 2023-10-02 08:03:17,962 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.923e+02 2.251e+02 2.726e+02 4.370e+02, threshold=4.501e+02, percent-clipped=2.0 2023-10-02 08:03:18,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 08:03:19,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:03:20,016 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.52 vs. limit=15.0 2023-10-02 08:03:20,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:20,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:03:20,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:22,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:03:22,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 08:03:22,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 08:03:22,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 08:03:23,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:03:23,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 08:03:23,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 08:03:27,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:03:28,651 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 08:03:29,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:03:30,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=807973.3333333334, ans=0.125 2023-10-02 08:03:31,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:31,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:03:33,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=808040.0, ans=0.0 2023-10-02 08:03:34,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 08:03:36,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:03:36,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:03:36,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:03:37,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:03:37,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:03:37,787 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:03:40,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:03:42,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:43,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:43,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:03:47,090 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.33 vs. limit=15.0 2023-10-02 08:03:47,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 08:03:47,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:03:49,204 INFO [train.py:1046] (1/4) Epoch 23, batch 4350, loss[loss=0.164, simple_loss=0.2464, pruned_loss=0.04077, over 24319.00 frames. 
], tot_loss[loss=0.1706, simple_loss=0.2474, pruned_loss=0.04689, over 4721116.99 frames. ], batch size: 61, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:03:52,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=808106.6666666666, ans=0.07 2023-10-02 08:03:53,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:03:56,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:59,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:03:59,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:04:02,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:04:03,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=808173.3333333334, ans=0.1 2023-10-02 08:04:04,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:04:06,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=808173.3333333334, ans=0.125 2023-10-02 08:04:07,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:04:07,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 08:04:10,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:04:10,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=808173.3333333334, ans=0.0 2023-10-02 08:04:11,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:04:14,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:04:17,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 08:04:18,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:04:19,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:20,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=808240.0, ans=0.125 2023-10-02 08:04:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:26,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 08:04:28,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:04:28,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=808240.0, ans=0.1 2023-10-02 08:04:29,636 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:04:32,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:04:35,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=808306.6666666666, ans=0.125 2023-10-02 08:04:37,102 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 08:04:38,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:04:38,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:04:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 08:04:41,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 08:04:41,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:04:41,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:04:42,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 08:04:42,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:04:43,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:04:44,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:04:47,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 08:04:47,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:47,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:04:48,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:49,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 08:04:49,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=808373.3333333334, ans=0.0 2023-10-02 08:04:50,676 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 08:04:50,686 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 08:04:50,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. 
Duration: 0.42 2023-10-02 08:04:54,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:04:54,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:04:54,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:04:55,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:04:57,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 08:05:00,599 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 08:05:00,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:01,934 INFO [train.py:1046] (1/4) Epoch 23, batch 4400, loss[loss=0.1574, simple_loss=0.2429, pruned_loss=0.03589, over 24469.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2481, pruned_loss=0.04691, over 4735088.42 frames. ], batch size: 66, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:05:04,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:05:04,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:06,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:05:09,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 08:05:09,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 08:05:09,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 08:05:09,333 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 08:05:09,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:05:09,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:05:11,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 08:05:13,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:15,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:15,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 08:05:19,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:19,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 08:05:19,203 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. 
Duration: 0.896 2023-10-02 08:05:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 08:05:22,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 08:05:22,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 08:05:22,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:23,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:05:24,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:05:24,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:05:27,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 08:05:27,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 08:05:27,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:28,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:05:28,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:05:30,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:30,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:30,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 08:05:32,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 08:05:35,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:05:43,853 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.961e+02 2.160e+02 2.507e+02 3.743e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-02 08:05:44,805 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.24 vs. limit=15.0 2023-10-02 08:05:45,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 08:05:49,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:05:52,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:05:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 08:05:54,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 08:05:54,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:05:54,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:05:54,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:05:56,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:05:56,518 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:06:00,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 08:06:04,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 08:06:05,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 08:06:05,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:05,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 08:06:06,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 08:06:11,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:06:14,543 INFO [train.py:1046] (1/4) Epoch 23, batch 4450, loss[loss=0.179, simple_loss=0.2628, pruned_loss=0.04761, over 24570.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2494, pruned_loss=0.04738, over 4742813.28 frames. ], batch size: 71, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:06:14,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 08:06:17,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:06:20,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:20,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:06:25,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:06:25,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:06:28,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:06:34,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 08:06:34,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:35,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 08:06:35,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:06:35,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:37,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:06:37,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:06:38,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=808840.0, ans=0.025 2023-10-02 08:06:39,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:06:45,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:06:45,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:06:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:06:47,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:47,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:06:51,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 08:06:53,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 08:06:54,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 08:06:54,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:06:56,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:06:56,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 08:06:59,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:06:59,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=808973.3333333334, ans=0.0 2023-10-02 08:07:03,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:07:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. 
Duration: 0.8 2023-10-02 08:07:03,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:03,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:07:03,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:07:03,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:07:08,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:07:10,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:07:10,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 08:07:13,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:07:15,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:07:16,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:07:17,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.16 vs. 
limit=15.0 2023-10-02 08:07:18,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:18,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:07:19,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:07:22,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 08:07:23,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:07:26,433 INFO [train.py:1046] (1/4) Epoch 23, batch 4500, loss[loss=0.1669, simple_loss=0.2294, pruned_loss=0.05223, over 23420.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2498, pruned_loss=0.04823, over 4713853.38 frames. ], batch size: 285, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:07:27,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:07:29,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 08:07:29,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 08:07:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:07:35,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:07:37,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:07:38,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:07:38,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:07:40,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:07:40,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:07:51,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:07:52,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:07:54,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:07:54,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:07:55,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 08:07:56,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=809240.0, ans=0.125 2023-10-02 08:07:56,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=809240.0, ans=0.125 2023-10-02 08:08:01,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:08:05,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:08:07,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:08:09,150 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.850e+02 2.045e+02 2.410e+02 3.743e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-02 08:08:09,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.03 vs. limit=22.5 2023-10-02 08:08:10,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:08:10,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 08:08:10,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:12,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:08:12,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=809306.6666666666, ans=0.07 2023-10-02 08:08:14,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:14,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:08:17,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:08:17,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 08:08:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:08:17,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:23,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:08:24,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:08:26,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:28,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:08:28,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 08:08:31,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 08:08:32,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 08:08:32,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 08:08:38,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 08:08:39,845 INFO [train.py:1046] (1/4) Epoch 23, batch 4550, loss[loss=0.1825, simple_loss=0.2556, pruned_loss=0.05471, over 23296.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2483, pruned_loss=0.04748, over 4711459.31 frames. ], batch size: 119, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:08:39,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 08:08:40,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=809440.0, ans=0.125 2023-10-02 08:08:41,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:08:44,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:08:45,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:08:46,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 08:08:50,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:08:52,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:54,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:08:54,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:08:54,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:57,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:08:58,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:08:59,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:03,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 08:09:03,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 08:09:06,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:09:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-10-02 08:09:07,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=809573.3333333334, ans=0.2 2023-10-02 08:09:10,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 08:09:10,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:09:13,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 08:09:13,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=809573.3333333334, ans=0.2 2023-10-02 08:09:15,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:09:15,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=809573.3333333334, ans=0.125 2023-10-02 08:09:19,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:19,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:20,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:09:23,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 08:09:25,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:09:27,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:27,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:09:29,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:09:29,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 08:09:30,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 08:09:30,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:09:32,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 08:09:34,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 08:09:34,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:09:37,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:09:37,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:38,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:09:38,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:09:38,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:09:39,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 08:09:41,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:09:41,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 08:09:41,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 08:09:41,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:09:41,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 08:09:44,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:09:44,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:09:47,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:09:47,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:09:47,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=809706.6666666666, ans=0.125 2023-10-02 08:09:49,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:09:50,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:09:51,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. limit=6.0 2023-10-02 08:09:51,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:09:53,332 INFO [train.py:1046] (1/4) Epoch 23, batch 4600, loss[loss=0.1874, simple_loss=0.2641, pruned_loss=0.05532, over 23996.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2466, pruned_loss=0.04706, over 4704004.64 frames. ], batch size: 80, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:09:54,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:09:54,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=809773.3333333334, ans=0.2 2023-10-02 08:09:56,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:09:57,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 08:09:59,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:00,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 08:10:02,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:10:08,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:10:08,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:09,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:16,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 08:10:16,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:18,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=809840.0, ans=0.125 2023-10-02 08:10:20,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:22,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:10:22,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:10:24,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=809906.6666666666, ans=0.2 2023-10-02 08:10:24,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.94 vs. limit=15.0 2023-10-02 08:10:27,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.59 vs. limit=10.0 2023-10-02 08:10:29,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 08:10:29,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:10:29,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:10:31,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=809906.6666666666, ans=0.0 2023-10-02 08:10:34,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:34,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=809906.6666666666, ans=0.1 2023-10-02 08:10:36,008 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.884e+02 2.130e+02 2.615e+02 3.495e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-02 08:10:36,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 08:10:37,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:10:41,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 08:10:42,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:10:46,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=809973.3333333334, ans=22.5 2023-10-02 08:10:47,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:47,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=809973.3333333334, ans=0.1 2023-10-02 08:10:48,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:10:50,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:50,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 08:10:50,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:52,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. 
Duration: 0.89 2023-10-02 08:10:52,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:52,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:10:53,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:54,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:54,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:10:56,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 08:10:56,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 08:10:56,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 08:10:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:10:58,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:00,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:11:00,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:11:03,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=810040.0, ans=0.0 2023-10-02 08:11:03,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.96 vs. limit=15.0 2023-10-02 08:11:04,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=810106.6666666666, ans=0.0 2023-10-02 08:11:05,883 INFO [train.py:1046] (1/4) Epoch 23, batch 4650, loss[loss=0.1839, simple_loss=0.2624, pruned_loss=0.05271, over 23389.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2466, pruned_loss=0.04697, over 4708919.69 frames. ], batch size: 93, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:11:10,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:11:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:11:13,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:11:13,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:11:13,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:11:13,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:14,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 08:11:17,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=810106.6666666666, ans=0.1 2023-10-02 08:11:18,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 08:11:21,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:11:23,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 08:11:23,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:11:24,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 08:11:24,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:11:24,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=810173.3333333334, ans=0.0 2023-10-02 08:11:26,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 08:11:26,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 08:11:26,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:27,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 08:11:31,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:11:31,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:31,625 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 08:11:34,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:36,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 08:11:38,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:38,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:11:38,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=810240.0, ans=0.1 2023-10-02 08:11:39,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 08:11:41,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:11:42,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:11:46,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:11:49,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:52,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:54,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:55,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:11:58,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 08:11:58,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 08:11:59,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 08:11:59,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 08:12:00,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.34 vs. limit=22.5 2023-10-02 08:12:01,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:01,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.97 vs. 
limit=15.0 2023-10-02 08:12:08,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:12:09,416 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.36 vs. limit=15.0 2023-10-02 08:12:09,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 08:12:09,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:09,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:12:09,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:12:12,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:12:15,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:12:15,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:12:17,766 INFO [train.py:1046] (1/4) Epoch 23, batch 4700, loss[loss=0.1482, simple_loss=0.2271, pruned_loss=0.03467, over 24439.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2471, pruned_loss=0.04724, over 4720425.71 frames. 
], batch size: 58, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:12:17,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:12:21,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:21,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:12:21,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:12:21,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 08:12:21,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=810440.0, ans=0.125 2023-10-02 08:12:21,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.22 vs. limit=10.0 2023-10-02 08:12:23,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:12:24,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 08:12:24,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=810440.0, ans=0.125 2023-10-02 08:12:31,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:12:32,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:32,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:12:34,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:35,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:12:40,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 08:12:40,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=810506.6666666666, ans=0.125 2023-10-02 08:12:41,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 08:12:43,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:44,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:12:45,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:12:48,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 08:12:52,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=810573.3333333334, ans=0.1 2023-10-02 08:12:53,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:12:56,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:58,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=810573.3333333334, ans=0.1 2023-10-02 08:13:01,991 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.784e+02 2.018e+02 2.230e+02 2.934e+02, threshold=4.036e+02, percent-clipped=0.0 2023-10-02 08:13:02,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 08:13:03,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:13:06,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:09,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 08:13:10,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:13,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:13:15,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 08:13:16,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:16,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:19,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:13:19,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:13:19,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 08:13:21,264 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 08:13:24,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:25,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:25,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:25,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 08:13:27,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:13:29,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 08:13:30,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.13 vs. limit=15.0 2023-10-02 08:13:31,249 INFO [train.py:1046] (1/4) Epoch 23, batch 4750, loss[loss=0.179, simple_loss=0.267, pruned_loss=0.04546, over 24348.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2478, pruned_loss=0.04719, over 4726970.00 frames. ], batch size: 77, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:13:32,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:13:34,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:37,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:38,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:13:39,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 08:13:39,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:13:42,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 08:13:43,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 08:13:43,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:44,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:13:45,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=810840.0, ans=0.125 2023-10-02 08:13:50,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 08:13:53,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:13:54,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=810840.0, ans=0.125 2023-10-02 08:13:55,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 08:13:56,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:13:59,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:59,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:59,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:59,325 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-10-02 08:14:00,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 08:14:06,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 08:14:08,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:14:08,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=810906.6666666666, ans=0.2 2023-10-02 08:14:10,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:12,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:14:12,736 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 08:14:12,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:14:15,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:14:16,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:14:19,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-10-02 08:14:19,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=810973.3333333334, ans=0.015 2023-10-02 08:14:20,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 08:14:20,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:14:22,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:14:22,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:14:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:14:22,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=810973.3333333334, ans=0.2 2023-10-02 08:14:23,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 08:14:25,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=810973.3333333334, ans=0.1 2023-10-02 08:14:26,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 08:14:28,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:14:29,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:14:29,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 08:14:29,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:14:30,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.28 vs. limit=15.0 2023-10-02 08:14:30,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:14:32,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:14:34,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:35,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:14:39,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:14:39,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 08:14:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 08:14:42,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 08:14:43,876 INFO [train.py:1046] (1/4) Epoch 23, batch 4800, loss[loss=0.1506, simple_loss=0.2295, pruned_loss=0.03587, over 24348.00 frames. 
], tot_loss[loss=0.1723, simple_loss=0.2488, pruned_loss=0.04792, over 4719960.99 frames. ], batch size: 56, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:14:45,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:14:45,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:14:46,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 08:14:50,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:52,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:14:58,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:15:00,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:01,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:01,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 08:15:01,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:15:01,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:15:03,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:15:07,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:08,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:09,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:15:10,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:10,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 08:15:10,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:10,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:12,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:14,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:16,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:15:16,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:15:17,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:15:19,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:20,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 08:15:20,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 08:15:22,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:22,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:15:23,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:15:23,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:15:23,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:15:26,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:15:26,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:15:28,384 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 1.977e+02 2.264e+02 3.634e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-02 08:15:31,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:15:32,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:34,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:15:39,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 08:15:39,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:39,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=811306.6666666666, ans=0.0 2023-10-02 08:15:40,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:40,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:15:41,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:44,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:15:46,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:15:46,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:46,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:15:46,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:15:47,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:15:52,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:15:52,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:52,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:53,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 08:15:56,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 08:15:56,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:56,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:15:56,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:15:56,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:56,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=811440.0, ans=0.0 2023-10-02 08:15:57,959 INFO [train.py:1046] (1/4) Epoch 23, batch 4850, loss[loss=0.1679, simple_loss=0.2402, pruned_loss=0.04786, over 24412.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.249, pruned_loss=0.04804, over 4727820.61 frames. ], batch size: 58, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:15:59,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:16:01,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.52 vs. limit=15.0 2023-10-02 08:16:02,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=811440.0, ans=0.125 2023-10-02 08:16:07,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 08:16:09,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:16:15,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:16:15,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 08:16:16,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:16:19,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:16:20,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:16:20,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:16:20,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 08:16:22,796 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=15.0 2023-10-02 08:16:26,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:16:28,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:16:28,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:16:28,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:16:28,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-10-02 08:16:30,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.39 vs. limit=10.0 2023-10-02 08:16:32,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:16:32,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:34,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=811573.3333333334, ans=0.025 2023-10-02 08:16:35,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:35,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 08:16:37,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 08:16:40,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:16:42,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=811640.0, ans=0.125 2023-10-02 08:16:47,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:16:47,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 08:16:48,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:16:48,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:16:50,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:16:52,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 08:16:52,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:52,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 08:16:52,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:16:54,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:16:54,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 08:17:00,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=811706.6666666666, ans=0.125 2023-10-02 08:17:02,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:17:08,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:17:08,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:17:10,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=811773.3333333334, ans=0.2 2023-10-02 08:17:11,415 INFO [train.py:1046] (1/4) Epoch 23, batch 4900, loss[loss=0.1831, simple_loss=0.2709, pruned_loss=0.04763, over 24407.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2475, pruned_loss=0.04763, over 4729210.77 frames. ], batch size: 77, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:17:11,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=811773.3333333334, ans=0.0 2023-10-02 08:17:14,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 08:17:14,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:17:18,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:18,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=811773.3333333334, ans=0.1 2023-10-02 08:17:19,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:17:21,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:17:22,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. 
Duration: 0.99 2023-10-02 08:17:22,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=811773.3333333334, ans=0.1 2023-10-02 08:17:26,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 08:17:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 08:17:31,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 08:17:33,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:17:33,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:17:33,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:17:33,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:33,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:17:33,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 08:17:36,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=811840.0, ans=0.125 2023-10-02 08:17:38,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-10-02 08:17:38,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:17:41,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:17:41,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:17:43,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:17:43,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:45,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:17:45,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 08:17:46,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:17:46,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:46,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=811906.6666666666, ans=0.0 2023-10-02 08:17:48,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 08:17:48,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-10-02 08:17:50,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 08:17:52,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:17:53,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:17:53,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:17:53,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:54,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.777e+02 1.982e+02 2.157e+02 3.751e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 08:17:54,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 08:17:54,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:17:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 08:17:56,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=811973.3333333334, ans=0.125 2023-10-02 08:17:58,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:18:00,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:18:02,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:18:06,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 08:18:06,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:18:07,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 08:18:08,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 08:18:13,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:18:14,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:18:15,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 08:18:15,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:18:15,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:18:17,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:18:21,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:18:21,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:18:21,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:18:21,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 08:18:22,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:18:24,012 INFO [train.py:1046] (1/4) Epoch 23, batch 4950, loss[loss=0.1776, simple_loss=0.2474, pruned_loss=0.05389, over 23322.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2463, pruned_loss=0.0478, over 4701314.22 frames. ], batch size: 105, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:18:25,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:18:25,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:18:28,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 08:18:30,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 08:18:30,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 08:18:31,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 08:18:32,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:32,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:18:32,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:18:32,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=812106.6666666666, ans=0.125 2023-10-02 08:18:33,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:35,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:35,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:18:36,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:18:37,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:18:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:40,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:18:42,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:18:45,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:47,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:18:48,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:49,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:51,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:18:52,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 08:18:53,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 08:18:57,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:18:59,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:18:59,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 08:18:59,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:19:01,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:19:04,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:19:06,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=812306.6666666666, ans=0.0 2023-10-02 08:19:07,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:19:10,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:19:11,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:19:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:11,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=812306.6666666666, ans=0.125 2023-10-02 08:19:13,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 08:19:13,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:19:15,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 08:19:18,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:19:19,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:19:19,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:19:19,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:20,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:19:20,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=812373.3333333334, ans=0.125 2023-10-02 08:19:21,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:19:22,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:19:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:19:22,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:19:24,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-10-02 08:19:30,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:19:31,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.90 vs. limit=6.0 2023-10-02 08:19:34,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 08:19:34,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 08:19:36,744 INFO [train.py:1046] (1/4) Epoch 23, batch 5000, loss[loss=0.1768, simple_loss=0.2469, pruned_loss=0.0534, over 23844.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2464, pruned_loss=0.04753, over 4703801.72 frames. ], batch size: 195, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:19:41,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:41,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:19:43,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 08:19:43,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 08:19:46,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:19:48,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-02 08:19:48,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:19:48,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:19:49,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 08:19:50,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:19:52,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:19:53,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 08:19:53,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:19:53,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:19:54,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 08:19:56,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 08:19:57,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:19:58,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-02 08:19:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:19:59,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:19:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:19:59,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 08:19:59,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 08:20:02,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 08:20:02,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:20:02,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:05,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 08:20:05,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:20:05,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:06,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:20:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 08:20:10,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 08:20:11,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:20:12,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:20:14,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.20 vs. limit=15.0 2023-10-02 08:20:15,588 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 08:20:19,581 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.897e+02 2.081e+02 2.436e+02 3.526e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-02 08:20:19,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:20:19,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:19,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:22,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 08:20:22,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:20:23,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:20:23,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:20:24,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=812640.0, ans=0.125 2023-10-02 08:20:27,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 08:20:27,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:20:29,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:20:29,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:20:37,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 08:20:42,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:48,597 INFO [train.py:1046] (1/4) Epoch 23, batch 5050, loss[loss=0.1744, simple_loss=0.2449, pruned_loss=0.05198, over 23856.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2474, pruned_loss=0.04701, over 4721474.32 frames. ], batch size: 195, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:20:48,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:20:50,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:50,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:20:50,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:20:51,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:20:51,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:20:51,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:55,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:55,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 08:20:55,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=812773.3333333334, ans=0.1 2023-10-02 08:20:57,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:20:58,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:20:59,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:21:00,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 08:21:01,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:21:01,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:21:02,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:21:04,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:21:04,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:21:07,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=812840.0, ans=0.0 2023-10-02 08:21:14,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 08:21:14,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:21:15,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:21:16,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-02 08:21:17,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:21:18,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:18,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:21:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:21:18,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 08:21:20,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 08:21:21,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:23,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:21:27,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:29,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 08:21:30,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:21:33,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. 
Duration: 0.99 2023-10-02 08:21:34,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:21:34,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:21:34,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:21:36,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:21:37,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:21:39,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:21:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:41,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:21:41,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:21:41,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 08:21:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:21:44,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 08:21:47,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:21:47,431 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 08:21:47,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:21:48,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:21:49,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:50,006 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 08:21:52,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:21:52,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 08:21:52,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:21:56,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:58,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. 
Duration: 0.84 2023-10-02 08:21:58,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 08:22:01,096 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.62 vs. limit=22.5 2023-10-02 08:22:01,340 INFO [train.py:1046] (1/4) Epoch 23, batch 5100, loss[loss=0.1805, simple_loss=0.2471, pruned_loss=0.05693, over 22830.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2486, pruned_loss=0.04747, over 4711841.62 frames. ], batch size: 322, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:22:01,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:01,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:01,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:22:05,482 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 08:22:07,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:22:09,315 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:22:10,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 08:22:10,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. 
Duration: 0.73 2023-10-02 08:22:11,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.00 vs. limit=15.0 2023-10-02 08:22:12,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:13,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:22:15,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=813173.3333333334, ans=0.125 2023-10-02 08:22:16,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:22:16,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 08:22:16,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 08:22:20,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:22:20,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:22:24,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:22:26,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=813173.3333333334, ans=0.125 2023-10-02 08:22:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 08:22:27,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:30,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:22:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 08:22:32,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:33,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:33,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 08:22:33,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=813240.0, ans=0.0 2023-10-02 08:22:36,142 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 08:22:36,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:36,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. 
Duration: 0.67 2023-10-02 08:22:37,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 08:22:42,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:47,180 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.431e+02 1.831e+02 1.978e+02 2.257e+02 3.540e+02, threshold=3.956e+02, percent-clipped=0.0 2023-10-02 08:22:48,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:22:51,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 08:22:52,929 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 08:22:52,943 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 08:22:54,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 08:22:54,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:55,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 08:22:56,501 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.72 vs. limit=10.0 2023-10-02 08:22:59,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-10-02 08:23:00,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:23:03,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:23:04,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 08:23:07,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:23:07,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 08:23:11,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=813373.3333333334, ans=0.1 2023-10-02 08:23:13,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:23:14,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:23:14,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:23:15,859 INFO [train.py:1046] (1/4) Epoch 23, batch 5150, loss[loss=0.1893, simple_loss=0.2525, pruned_loss=0.063, over 23709.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2491, pruned_loss=0.0477, over 4714640.89 frames. ], batch size: 164, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:23:15,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 08:23:15,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:23:17,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:23:17,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 08:23:17,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 08:23:17,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 08:23:18,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:23:18,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 08:23:20,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:23:20,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 08:23:21,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:23:24,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:23:28,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 08:23:28,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 08:23:31,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:23:31,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:23:34,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:23:34,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:23:34,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:23:35,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:23:35,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:23:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 08:23:37,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:23:38,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:23:39,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 08:23:41,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 08:23:43,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:23:43,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=813573.3333333334, ans=0.2 2023-10-02 08:23:48,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=813573.3333333334, ans=0.125 2023-10-02 08:23:49,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:23:51,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=813573.3333333334, ans=0.125 2023-10-02 08:23:52,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 08:23:55,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:23:59,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:24:01,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:05,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:24:06,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:24:08,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 08:24:12,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:24:13,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:24:13,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:24:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:18,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:24:19,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 08:24:23,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:24,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=813706.6666666666, ans=0.2 2023-10-02 08:24:25,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:24:26,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:24:26,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:24:26,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:24:26,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=813773.3333333334, ans=0.125 2023-10-02 08:24:28,068 INFO [train.py:1046] (1/4) Epoch 23, batch 5200, loss[loss=0.1618, simple_loss=0.2357, pruned_loss=0.04394, over 18902.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2496, pruned_loss=0.04783, over 4709279.27 frames. ], batch size: 41, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:24:28,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:24:28,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:24:28,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:24:29,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=813773.3333333334, ans=0.0 2023-10-02 08:24:31,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:24:32,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:24:34,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:24:38,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 08:24:39,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:24:39,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:42,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:24:42,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:24:44,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:44,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 08:24:47,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:24:47,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:49,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 08:24:50,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=813840.0, ans=0.0 2023-10-02 08:24:52,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 08:24:52,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:24:52,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 08:24:53,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 08:24:56,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 08:24:56,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:56,752 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 08:24:56,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:59,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:59,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:24:59,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 08:25:01,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:25:02,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:25:02,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=813906.6666666666, ans=0.025 2023-10-02 08:25:06,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 08:25:08,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 08:25:08,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 08:25:11,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 08:25:11,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:25:13,378 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.905e+02 1.999e+02 2.249e+02 3.561e+02, threshold=3.998e+02, percent-clipped=0.0 2023-10-02 08:25:19,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:25:19,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:20,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 08:25:22,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:25:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-02 08:25:22,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:22,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:25:26,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:25:26,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:25:30,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:25:32,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:25:32,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:36,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:37,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 08:25:37,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:25:37,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:25:39,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:25:39,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:25:40,417 INFO [train.py:1046] (1/4) Epoch 23, batch 5250, loss[loss=0.1563, simple_loss=0.236, pruned_loss=0.03832, over 24608.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.249, pruned_loss=0.04767, over 4707806.14 frames. ], batch size: 60, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:25:40,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:25:43,262 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.18 vs. limit=6.0 2023-10-02 08:25:43,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:25:45,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:25:47,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:25:48,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:25:50,647 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.26 vs. limit=10.0 2023-10-02 08:25:52,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:25:54,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:25:55,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:25:56,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:26:00,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 08:26:00,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:26:00,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:26:12,705 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.53 vs. limit=22.5 2023-10-02 08:26:13,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=814240.0, ans=0.125 2023-10-02 08:26:31,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=814306.6666666666, ans=0.0 2023-10-02 08:26:48,279 INFO [train.py:1046] (1/4) Epoch 23, batch 5300, loss[loss=0.1736, simple_loss=0.2474, pruned_loss=0.04984, over 23396.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2474, pruned_loss=0.04727, over 4707963.26 frames. ], batch size: 119, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:27:02,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 08:27:02,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 08:27:02,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 08:27:02,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:02,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:03,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:03,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:03,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:03,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:03,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:03,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:27:03,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:27:03,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. 
Duration: 0.9 2023-10-02 08:27:03,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 08:27:03,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 08:27:04,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:27:04,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 08:27:04,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 08:27:04,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:04,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:04,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:27:04,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:27:04,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:27:04,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:27:05,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:27:05,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:05,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:27:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:05,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:27:05,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:05,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:27:06,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 08:27:06,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:27:06,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:06,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 08:27:06,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 08:27:06,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 08:27:06,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 08:27:06,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 08:27:06,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:27:07,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:27:07,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:27:07,367 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 08:27:07,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 08:27:07,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:27:07,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:07,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 08:27:07,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. 
Duration: 0.96 2023-10-02 08:27:07,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 08:27:07,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:27:14,102 INFO [train.py:1046] (1/4) Epoch 24, batch 0, loss[loss=0.1684, simple_loss=0.2484, pruned_loss=0.0442, over 24442.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2484, pruned_loss=0.0442, over 24442.00 frames. ], batch size: 63, lr: 4.32e-03, grad_scale: 32.0 2023-10-02 08:27:14,102 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 08:27:27,311 INFO [train.py:1078] (1/4) Epoch 24, validation: loss=0.3245, simple_loss=0.2712, pruned_loss=0.1889, over 1125622.00 frames. 2023-10-02 08:27:27,311 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 08:27:28,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 08:27:28,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:27:30,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:27:34,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=814520.0, ans=0.05 2023-10-02 08:27:35,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:35,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 08:27:35,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:36,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 08:27:40,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 08:27:42,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:43,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:47,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:47,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:48,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:27:48,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:27:49,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 08:27:52,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 08:27:56,503 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 1.867e+02 2.102e+02 2.500e+02 3.375e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-02 08:27:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:28:00,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:28:01,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 08:28:04,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:28:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:28:05,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:09,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:28:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:18,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 08:28:21,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 08:28:21,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:28:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:23,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:28:24,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:28:27,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 08:28:28,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:29,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:31,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=814786.6666666666, ans=0.125 2023-10-02 08:28:33,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:28:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 08:28:37,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:28:40,398 INFO [train.py:1046] (1/4) Epoch 24, batch 50, loss[loss=0.1669, simple_loss=0.255, pruned_loss=0.03941, over 24565.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2492, pruned_loss=0.04803, over 1060577.96 frames. 
], batch size: 71, lr: 4.32e-03, grad_scale: 32.0 2023-10-02 08:28:40,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:28:41,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:28:41,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 08:28:43,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:28:43,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:28:46,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:28:47,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:28:49,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:28:53,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 08:28:53,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:59,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 08:29:00,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 08:29:02,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.48 vs. limit=10.0 2023-10-02 08:29:03,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 08:29:04,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:29:04,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:29:04,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:29:06,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:29:07,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:29:07,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:29:07,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:29:15,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:29:16,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 08:29:16,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:29:16,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 08:29:18,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:29:19,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:29:19,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 08:29:20,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:29:22,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 08:29:25,065 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.84 vs. limit=15.0 2023-10-02 08:29:28,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:29:30,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:29:31,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:29:32,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 08:29:32,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:29:35,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 08:29:35,857 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:29:36,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 08:29:38,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:29:38,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:29:39,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:29:41,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:29:41,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 08:29:41,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 08:29:43,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 08:29:46,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:29:46,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:29:46,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 08:29:47,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 08:29:48,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:29:49,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:29:50,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:29:51,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:29:53,011 INFO [train.py:1046] (1/4) Epoch 24, batch 100, loss[loss=0.1711, simple_loss=0.2457, pruned_loss=0.04823, over 23452.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2497, pruned_loss=0.04781, over 1872505.57 frames. ], batch size: 93, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:29:53,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:29:56,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:29:59,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 08:30:02,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 08:30:02,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:30:05,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:30:05,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:30:06,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:30:06,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:30:06,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:30:07,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 08:30:10,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:30:10,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=815253.3333333334, ans=0.0 2023-10-02 08:30:11,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:11,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:30:11,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:30:14,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 08:30:15,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:16,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=815253.3333333334, ans=0.1 2023-10-02 08:30:17,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:17,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:30:19,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:30:23,760 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.848e+02 2.049e+02 2.242e+02 3.447e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 08:30:23,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 08:30:23,863 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 08:30:25,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:30:25,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 08:30:28,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:30:30,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:30,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:33,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=815320.0, ans=0.0 2023-10-02 08:30:34,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:36,177 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 08:30:37,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 08:30:40,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:30:41,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:30:43,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:47,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:30:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 08:30:51,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:30:54,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:54,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:57,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:30:57,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:30:57,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:57,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 08:30:59,237 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 08:30:59,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:00,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:31:02,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:02,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:31:02,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 08:31:02,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:31:02,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:31:02,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:02,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:03,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:03,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:31:05,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:31:06,732 INFO [train.py:1046] (1/4) Epoch 24, batch 150, loss[loss=0.1449, simple_loss=0.2221, pruned_loss=0.03382, over 20289.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2492, pruned_loss=0.04754, over 2506953.37 frames. ], batch size: 44, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:31:08,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:10,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 08:31:10,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:12,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:13,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:13,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:31:18,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:24,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 08:31:24,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 08:31:24,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 08:31:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:31:26,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:31:28,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 08:31:30,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:31:30,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:30,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:31,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:31,656 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 08:31:34,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:40,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:42,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:31:44,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 08:31:44,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=815653.3333333334, ans=0.125 2023-10-02 08:31:48,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:31:48,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:31:49,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:31:51,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:31:53,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:53,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=815720.0, ans=0.125 2023-10-02 08:31:54,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:31:54,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=815720.0, ans=0.2 2023-10-02 08:31:55,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:56,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 08:32:01,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:01,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=815720.0, ans=0.05 2023-10-02 08:32:02,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:32:02,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:32:02,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:32:04,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:05,116 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:32:06,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 08:32:09,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:32:10,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:32:12,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:32:13,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:32:13,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 08:32:13,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:32:13,747 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. 
Duration: 0.541 2023-10-02 08:32:15,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:32:19,663 INFO [train.py:1046] (1/4) Epoch 24, batch 200, loss[loss=0.1886, simple_loss=0.2579, pruned_loss=0.05969, over 23881.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2509, pruned_loss=0.04805, over 2993395.88 frames. ], batch size: 195, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:32:19,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:32:19,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:32:21,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 08:32:23,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:32:23,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:23,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=815853.3333333334, ans=0.125 2023-10-02 08:32:25,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 08:32:25,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:32:27,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:32:27,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=815853.3333333334, ans=0.1 2023-10-02 08:32:28,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:28,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=815853.3333333334, ans=0.125 2023-10-02 08:32:33,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:32:34,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:32:34,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:49,985 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.916e+02 2.110e+02 2.574e+02 4.556e+02, threshold=4.220e+02, percent-clipped=1.0 2023-10-02 08:32:54,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:32:54,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:32:56,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:32:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:32:56,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-10-02 08:32:57,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 08:32:57,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:32:58,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:00,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:33:00,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:33:00,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:33:02,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 08:33:02,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:33:04,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:08,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 08:33:13,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:33:21,591 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.36 vs. limit=10.0 2023-10-02 08:33:22,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:22,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:33:30,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:32,203 INFO [train.py:1046] (1/4) Epoch 24, batch 250, loss[loss=0.171, simple_loss=0.2409, pruned_loss=0.05056, over 23711.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2505, pruned_loss=0.04782, over 3379314.25 frames. ], batch size: 232, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:33:32,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 08:33:33,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:33,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:33:33,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:33:35,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 08:33:35,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 08:33:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:33:36,599 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 08:33:38,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:38,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=816186.6666666666, ans=0.125 2023-10-02 08:33:40,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:33:42,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:42,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:45,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:33:45,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:47,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:33:49,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:33:53,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=816253.3333333334, ans=0.0 2023-10-02 08:33:58,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:34:01,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:34:01,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:34:04,057 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.32 vs. limit=22.5 2023-10-02 08:34:04,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=816320.0, ans=0.1 2023-10-02 08:34:08,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:34:08,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:34:10,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:34:10,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:34:11,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 08:34:11,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:34:13,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:34:14,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:34:17,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 08:34:17,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:34:19,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:34:19,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:34:19,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:34:20,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:34:22,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:34:22,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:34:23,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:34:24,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:34:24,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:26,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=816386.6666666666, ans=0.0 2023-10-02 08:34:27,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:34:32,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:33,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:34:39,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:41,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:34:44,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 08:34:45,739 INFO [train.py:1046] (1/4) Epoch 24, batch 300, loss[loss=0.1691, simple_loss=0.2596, pruned_loss=0.03932, over 24645.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2491, pruned_loss=0.0467, over 3673466.53 frames. ], batch size: 73, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:34:45,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 08:34:45,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:34:45,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 08:34:45,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:34:47,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:34:47,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 08:34:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:52,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:34:56,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:34:57,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 08:34:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:59,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:34:59,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. 
Duration: 0.85 2023-10-02 08:35:00,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:00,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=816586.6666666666, ans=0.0 2023-10-02 08:35:01,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.93 vs. limit=15.0 2023-10-02 08:35:04,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:35:06,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=816586.6666666666, ans=0.125 2023-10-02 08:35:09,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:35:09,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 08:35:14,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 08:35:14,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:15,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:35:16,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.883e+02 2.136e+02 2.437e+02 4.219e+02, threshold=4.271e+02, percent-clipped=0.0 2023-10-02 08:35:16,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:18,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:18,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 08:35:18,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:35:20,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:35:22,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:35:22,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:35:25,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 08:35:25,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 08:35:27,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 08:35:27,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:30,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:31,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 08:35:32,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:35:33,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=816720.0, ans=0.125 2023-10-02 08:35:34,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=816720.0, ans=0.125 2023-10-02 08:35:36,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:35:40,454 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=12.0 2023-10-02 08:35:41,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:35:41,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 08:35:43,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:43,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 08:35:46,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:46,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=816786.6666666666, ans=0.0 2023-10-02 08:35:48,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:35:48,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 08:35:49,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:35:49,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:35:50,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 08:35:50,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:52,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:35:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:53,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:35:55,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:35:57,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.26 vs. limit=15.0 2023-10-02 08:35:59,417 INFO [train.py:1046] (1/4) Epoch 24, batch 350, loss[loss=0.1703, simple_loss=0.2389, pruned_loss=0.05084, over 23856.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2468, pruned_loss=0.04671, over 3879900.06 frames. ], batch size: 195, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:35:59,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:35:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 08:35:59,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=816853.3333333334, ans=0.2 2023-10-02 08:36:02,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:07,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:36:10,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:10,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:11,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=15.0 2023-10-02 08:36:13,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. 
Duration: 0.97 2023-10-02 08:36:15,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:36:15,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 08:36:18,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:18,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 08:36:19,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:36:21,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 08:36:22,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:36:24,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:36:25,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:36:25,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=816920.0, ans=0.125 2023-10-02 08:36:26,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:36:28,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:36:28,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:28,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:36:30,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=816986.6666666666, ans=0.0 2023-10-02 08:36:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:36:31,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:35,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=816986.6666666666, ans=0.125 2023-10-02 08:36:38,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:36:38,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:36:38,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:36:38,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:36:43,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=817053.3333333334, ans=0.1 2023-10-02 08:36:44,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 08:36:44,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:47,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:47,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:36:49,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:36:49,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 08:36:50,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:36:52,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 08:36:52,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=817053.3333333334, ans=0.125 2023-10-02 08:36:53,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 08:36:54,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:36:57,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:36:57,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 08:37:00,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:03,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:37:03,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:05,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:37:07,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:37:08,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:37:10,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:37:11,553 INFO [train.py:1046] (1/4) Epoch 24, batch 400, loss[loss=0.1699, simple_loss=0.2557, pruned_loss=0.04211, over 24303.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2463, pruned_loss=0.04637, over 4069715.35 frames. 
], batch size: 74, lr: 4.31e-03, grad_scale: 32.0 2023-10-02 08:37:11,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 08:37:11,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:11,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:15,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:37:15,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:17,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:18,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 08:37:18,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=817186.6666666666, ans=0.125 2023-10-02 08:37:19,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 08:37:19,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:37:21,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 08:37:21,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:21,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=817186.6666666666, ans=0.125 2023-10-02 08:37:25,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:37:25,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:37:25,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 08:37:25,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:37:27,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:27,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:37:27,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:28,551 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. 
Duration: 0.896 2023-10-02 08:37:28,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=817253.3333333334, ans=0.2 2023-10-02 08:37:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 08:37:35,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:38,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:38,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 08:37:38,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=817253.3333333334, ans=0.1 2023-10-02 08:37:39,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 08:37:42,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:37:44,458 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.820e+02 2.038e+02 2.400e+02 4.228e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-02 08:37:47,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:37:53,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. 
Duration: 0.83 2023-10-02 08:37:56,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=817386.6666666666, ans=0.2 2023-10-02 08:37:57,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:37:57,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=817386.6666666666, ans=0.0 2023-10-02 08:38:00,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 08:38:01,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:38:01,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=817386.6666666666, ans=0.125 2023-10-02 08:38:03,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:38:04,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 08:38:05,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:38:08,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:38:09,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:38:10,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=817453.3333333334, ans=0.125 2023-10-02 08:38:13,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:13,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 08:38:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:38:14,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 08:38:17,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:38:17,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:38:19,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 08:38:20,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=817453.3333333334, ans=0.1 2023-10-02 08:38:22,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:38:22,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:38:22,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:38:24,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 08:38:24,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:38:24,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:38:25,679 INFO [train.py:1046] (1/4) Epoch 24, batch 450, loss[loss=0.1777, simple_loss=0.2439, pruned_loss=0.05573, over 23583.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2478, pruned_loss=0.0476, over 4186149.65 frames. ], batch size: 256, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:38:25,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:38:25,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 08:38:25,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:38:27,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:38:29,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 08:38:30,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=817520.0, ans=0.0 2023-10-02 08:38:34,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.51 vs. limit=10.0 2023-10-02 08:38:38,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:38,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:38:38,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 08:38:39,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 08:38:39,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=817586.6666666666, ans=0.09899494936611666 2023-10-02 08:38:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:38:45,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:47,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:38:50,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:38:52,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:38:55,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 08:38:55,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 08:38:57,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 08:38:58,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:38:59,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:38:59,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:39:02,028 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 08:39:02,036 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 08:39:02,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:39:03,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:39:04,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 08:39:07,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:39:07,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:39:08,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 08:39:10,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 08:39:12,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:39:14,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:39:14,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:39:14,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=817720.0, ans=0.125 2023-10-02 08:39:15,295 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.59 vs. limit=15.0 2023-10-02 08:39:16,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 08:39:19,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:39:20,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-10-02 08:39:21,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 08:39:23,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:39:28,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:39:29,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:39:31,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:39:31,362 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 08:39:34,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=817786.6666666666, ans=0.125 2023-10-02 08:39:36,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:39:38,130 INFO [train.py:1046] (1/4) Epoch 24, batch 500, loss[loss=0.1724, simple_loss=0.2422, pruned_loss=0.05132, over 23330.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2489, pruned_loss=0.04844, over 4293038.58 frames. ], batch size: 119, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:39:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:39:39,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:39:39,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 08:39:40,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 08:39:40,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:39:43,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:39:48,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:39:48,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=817853.3333333334, ans=0.0 2023-10-02 08:39:50,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:39:51,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:39:51,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:39:53,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:39:53,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=817920.0, ans=0.125 2023-10-02 08:40:02,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:40:02,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:40:02,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:40:02,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:02,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 08:40:03,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:40:06,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:40:07,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:40:07,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:40:07,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:08,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-02 08:40:10,158 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.880e+02 2.084e+02 2.383e+02 5.306e+02, threshold=4.168e+02, percent-clipped=1.0 2023-10-02 08:40:10,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=817986.6666666666, ans=0.0 2023-10-02 08:40:11,655 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 08:40:14,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:15,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:16,285 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.14 vs. limit=6.0 2023-10-02 08:40:16,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:18,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:18,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:40:20,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=818053.3333333334, ans=0.0 2023-10-02 08:40:22,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 08:40:25,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 08:40:26,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:31,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:32,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:38,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:41,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 08:40:41,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:41,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:42,981 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:40:45,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 08:40:45,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:40:45,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:51,341 INFO [train.py:1046] (1/4) Epoch 24, batch 550, loss[loss=0.1872, simple_loss=0.2594, pruned_loss=0.05751, over 23413.00 frames. 
], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04851, over 4388578.97 frames. ], batch size: 285, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:40:51,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 08:40:51,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=818186.6666666666, ans=0.125 2023-10-02 08:40:52,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 08:40:54,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:54,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 08:40:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:40:56,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:56,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:57,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:57,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:40:58,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:40:59,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:41:00,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 08:41:00,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:41:06,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:06,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:07,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:41:07,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:12,492 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-02 08:41:13,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 08:41:14,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 08:41:14,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:41:20,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 08:41:20,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:41:22,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:41:25,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:27,426 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 08:41:27,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:28,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 08:41:31,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:41:31,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:41:31,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:41:33,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:34,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 08:41:35,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-10-02 08:41:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:41:37,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:41:38,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:41:38,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:41:42,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:41:44,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:41:45,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:41:46,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:48,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:41:49,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:41:51,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.87 vs. limit=12.0 2023-10-02 08:41:52,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:41:52,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:41:52,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:54,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:41:54,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 08:42:00,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 08:42:02,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=818453.3333333334, ans=0.2 2023-10-02 08:42:03,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 08:42:04,496 INFO [train.py:1046] (1/4) Epoch 24, batch 600, loss[loss=0.1676, simple_loss=0.2489, pruned_loss=0.04316, over 24400.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2496, pruned_loss=0.0483, over 4449161.20 frames. ], batch size: 69, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:42:04,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:42:04,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:42:04,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:42:10,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:42:11,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:42:12,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 08:42:14,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:42:14,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=818520.0, ans=0.0 2023-10-02 08:42:15,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:42:17,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:20,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 08:42:20,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:42:28,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 08:42:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:42:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:42:32,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:42:37,053 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.807e+02 2.027e+02 2.301e+02 3.422e+02, threshold=4.053e+02, percent-clipped=0.0 2023-10-02 08:42:38,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:42:38,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:42:38,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:45,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:42:45,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=818653.3333333334, ans=0.95 2023-10-02 08:42:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:50,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:42:50,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-10-02 08:43:03,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:43:03,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:43:06,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 08:43:06,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:43:09,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 08:43:10,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:43:12,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:43:17,585 INFO [train.py:1046] (1/4) Epoch 24, batch 650, loss[loss=0.1673, simple_loss=0.2527, pruned_loss=0.04089, over 24557.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2482, pruned_loss=0.04758, over 4510748.35 frames. ], batch size: 71, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:43:17,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 08:43:19,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:43:21,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 08:43:23,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:43:25,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:26,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 08:43:27,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:43:33,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:43:33,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:43:37,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:37,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=818920.0, ans=0.025 2023-10-02 08:43:38,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=818920.0, ans=0.125 2023-10-02 08:43:39,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 08:43:42,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:43:42,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:43:46,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:43:46,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 08:43:48,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:48,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=12.0 2023-10-02 08:43:49,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:43:50,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:43:54,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:43:54,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 08:43:55,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:55,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:43:57,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:59,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:44:00,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:01,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:44:01,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 08:44:03,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:44:03,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:44:05,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:44:05,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:44:06,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:44:06,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 08:44:07,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. 
Duration: 0.72 2023-10-02 08:44:09,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:09,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:44:09,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:44:09,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:44:11,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:44:12,492 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.84 vs. limit=22.5 2023-10-02 08:44:17,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:17,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:44:18,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:44:22,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 08:44:23,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:28,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:44:28,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:44:29,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:44:29,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:44:31,242 INFO [train.py:1046] (1/4) Epoch 24, batch 700, loss[loss=0.1591, simple_loss=0.2419, pruned_loss=0.03811, over 24464.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2471, pruned_loss=0.04727, over 4551661.87 frames. ], batch size: 63, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:44:34,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 08:44:34,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 08:44:36,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 08:44:36,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:38,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 08:44:40,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 08:44:42,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:44:45,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:44:47,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:48,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:44:48,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:44:51,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:51,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=819253.3333333334, ans=0.02 2023-10-02 08:44:54,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 08:44:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:44:56,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 08:44:59,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-02 08:45:03,640 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.854e+02 2.026e+02 2.248e+02 3.229e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-02 08:45:03,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:45:03,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:45:05,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:45:09,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:45:09,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 08:45:13,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:13,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:45:13,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 08:45:19,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:45:20,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:23,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:45:31,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:45:31,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 08:45:34,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 08:45:35,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 08:45:37,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:38,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:45:38,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:45:41,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:41,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 08:45:44,210 INFO [train.py:1046] (1/4) Epoch 24, batch 750, loss[loss=0.1733, simple_loss=0.2407, pruned_loss=0.05296, over 22731.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2467, pruned_loss=0.04675, over 4598557.75 frames. ], batch size: 322, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:45:45,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. 
Duration: 0.55 2023-10-02 08:45:46,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 08:45:46,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 08:45:48,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 08:45:48,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 08:45:50,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:45:51,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 08:45:52,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:52,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:45:54,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:45:55,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=819520.0, ans=0.125 2023-10-02 08:45:56,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:56,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 08:45:56,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:46:00,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:46:01,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:46:03,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:46:06,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:46:07,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:46:07,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 08:46:09,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:46:09,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:46:10,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:46:13,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 08:46:13,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=819653.3333333334, ans=0.125 2023-10-02 08:46:14,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 08:46:14,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:46:16,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 08:46:16,104 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 08:46:16,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 08:46:17,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:46:17,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:46:18,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:46:19,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=819653.3333333334, ans=0.1 2023-10-02 08:46:26,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:46:26,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 08:46:26,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:46:29,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:46:30,037 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.68 vs. limit=22.5 2023-10-02 08:46:30,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:46:30,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 08:46:31,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:46:33,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 08:46:34,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:46:35,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:46:37,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 08:46:37,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:41,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:46:44,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:46:44,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:46:46,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:46:48,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=819786.6666666666, ans=0.125 2023-10-02 08:46:49,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 08:46:49,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:46:51,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:46:53,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:46:53,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:46:55,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:55,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:46:56,923 INFO [train.py:1046] (1/4) Epoch 24, batch 800, loss[loss=0.1935, simple_loss=0.2587, pruned_loss=0.06414, over 23793.00 frames. 
], tot_loss[loss=0.1708, simple_loss=0.2476, pruned_loss=0.047, over 4631970.75 frames. ], batch size: 164, lr: 4.31e-03, grad_scale: 32.0 2023-10-02 08:47:03,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:47:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:05,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:47:05,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:47:06,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:07,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:08,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:08,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=819853.3333333334, ans=0.2 2023-10-02 08:47:12,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:13,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 08:47:13,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=819920.0, ans=0.125 2023-10-02 08:47:16,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 08:47:17,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:17,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:47:18,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:47:18,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:47:18,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 08:47:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:18,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 08:47:22,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 08:47:22,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=819920.0, ans=0.125 2023-10-02 08:47:22,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=819920.0, ans=0.125 2023-10-02 08:47:24,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:27,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:47:27,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:47:30,994 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.828e+02 1.971e+02 2.242e+02 3.409e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-02 08:47:31,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:37,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:47:38,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:47:38,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 08:47:40,247 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-02 08:47:40,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 08:47:40,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:47:40,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:47:42,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:42,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:47:44,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=820053.3333333334, ans=0.125 2023-10-02 08:47:47,018 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 08:47:47,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 08:47:47,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=820053.3333333334, ans=0.125 2023-10-02 08:47:47,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=820053.3333333334, ans=0.95 2023-10-02 08:47:48,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:47:49,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 08:47:54,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:47:55,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:57,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 08:47:58,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:47:58,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=820120.0, ans=0.0 2023-10-02 08:48:01,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 08:48:09,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:48:09,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=820186.6666666666, ans=0.2 2023-10-02 08:48:10,751 INFO [train.py:1046] (1/4) Epoch 24, batch 850, loss[loss=0.1666, simple_loss=0.2415, pruned_loss=0.04589, over 23337.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2487, pruned_loss=0.04741, over 4651961.80 frames. ], batch size: 119, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:48:12,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:48:12,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-10-02 08:48:13,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:48:13,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:48:13,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=820186.6666666666, ans=0.125 2023-10-02 08:48:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 08:48:15,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:16,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:48:17,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:18,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=820186.6666666666, ans=0.2 2023-10-02 08:48:19,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:48:19,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:48:20,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. 
Duration: 0.82 2023-10-02 08:48:20,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=820186.6666666666, ans=0.0 2023-10-02 08:48:21,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 08:48:21,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 08:48:22,329 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.257e-03 2023-10-02 08:48:23,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:48:23,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:48:26,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:26,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=820253.3333333334, ans=0.0 2023-10-02 08:48:26,988 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.93 vs. limit=15.0 2023-10-02 08:48:27,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:48:27,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 08:48:28,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=820253.3333333334, ans=0.125 2023-10-02 08:48:30,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:32,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:48:32,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 08:48:34,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=820253.3333333334, ans=0.125 2023-10-02 08:48:37,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 08:48:40,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:41,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 08:48:43,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=820320.0, ans=0.0 2023-10-02 08:48:44,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 08:48:44,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 08:48:47,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-10-02 08:48:47,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:48:47,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:48:47,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 08:48:50,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:51,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:51,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 08:48:54,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:48:55,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:48:57,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:48:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:48:58,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:49:00,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 08:49:00,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 08:49:04,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:49:04,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:49:05,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:49:05,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:49:06,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:49:06,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=820386.6666666666, ans=0.125 2023-10-02 08:49:10,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:49:11,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:49:13,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:49:15,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:15,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 08:49:20,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:49:22,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:49:22,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 08:49:23,572 INFO [train.py:1046] (1/4) Epoch 24, batch 900, loss[loss=0.1746, simple_loss=0.2466, pruned_loss=0.05129, over 23633.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2497, pruned_loss=0.0478, over 4680528.01 frames. ], batch size: 135, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:49:23,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:49:23,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:49:24,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 08:49:31,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:49:31,244 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:49:34,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:34,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-10-02 08:49:38,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:49:38,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 08:49:40,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 08:49:41,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:49:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:49:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:49:41,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:49:50,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=820586.6666666666, ans=0.0 2023-10-02 08:49:51,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:49:51,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:49:55,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:49:58,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.66 vs. limit=22.5 2023-10-02 08:49:58,566 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.862e+02 2.065e+02 2.310e+02 4.708e+02, threshold=4.130e+02, percent-clipped=1.0 2023-10-02 08:49:59,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 08:50:02,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:50:06,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:50:07,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:50:09,199 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 08:50:10,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 08:50:15,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:50:15,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:50:16,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:50:22,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:50:22,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:50:22,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 08:50:22,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:50:25,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 08:50:26,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:50:26,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:30,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:50:30,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:50:34,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 08:50:34,588 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 08:50:35,956 INFO [train.py:1046] (1/4) Epoch 24, batch 950, loss[loss=0.1732, simple_loss=0.2633, pruned_loss=0.04151, over 24667.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2495, pruned_loss=0.04743, over 4707295.38 frames. 
], batch size: 73, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:50:36,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 08:50:36,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 08:50:37,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:40,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 08:50:46,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:50:48,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:48,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:50,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:50:51,604 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 08:50:55,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:56,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:50:57,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 08:50:57,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:50:57,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 08:50:58,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:51:00,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:01,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 08:51:01,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:51:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:07,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:51:08,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:51:08,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 08:51:09,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:51:10,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:51:11,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=820986.6666666666, ans=0.0 2023-10-02 08:51:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:51:15,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:51:15,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:51:18,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 08:51:19,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 08:51:19,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:51:21,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:51:22,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:22,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:51:23,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=821053.3333333334, ans=0.125 2023-10-02 08:51:26,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-10-02 08:51:28,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:51:29,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:51:30,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:30,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 08:51:30,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:51:30,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:51:31,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 08:51:33,629 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.14 vs. limit=15.0 2023-10-02 08:51:37,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:51:38,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:51:43,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:51:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. 
Duration: 0.91 2023-10-02 08:51:45,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 08:51:47,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:49,060 INFO [train.py:1046] (1/4) Epoch 24, batch 1000, loss[loss=0.1576, simple_loss=0.2304, pruned_loss=0.0424, over 24284.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.249, pruned_loss=0.04798, over 4688391.88 frames. ], batch size: 56, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:51:51,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 08:51:51,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:51:56,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:51:59,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 08:51:59,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 08:52:03,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:03,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:52:05,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:52:07,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=821253.3333333334, ans=0.2 2023-10-02 08:52:09,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 08:52:11,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 08:52:12,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 08:52:12,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:52:14,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 08:52:15,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 08:52:17,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 08:52:18,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:18,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:23,907 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.855e+02 2.029e+02 2.321e+02 3.051e+02, threshold=4.057e+02, percent-clipped=0.0 2023-10-02 08:52:25,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:52:25,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:52:27,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:27,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:27,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 08:52:27,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:52:28,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:52:28,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:31,174 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 08:52:33,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 08:52:35,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 08:52:39,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 08:52:40,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:52:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:46,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:52:48,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:48,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:52:50,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 08:52:50,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:52:50,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 08:52:51,553 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.97 vs. limit=15.0 2023-10-02 08:52:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 08:52:52,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:52:52,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:54,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 08:52:58,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:52:58,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=821453.3333333334, ans=0.125 2023-10-02 08:52:59,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:53:02,889 INFO [train.py:1046] (1/4) Epoch 24, batch 1050, loss[loss=0.1645, simple_loss=0.238, pruned_loss=0.04554, over 23254.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2472, pruned_loss=0.0478, over 4682889.04 frames. ], batch size: 119, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:53:04,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:53:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:53:05,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:53:07,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:53:09,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:53:11,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 08:53:11,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=821520.0, ans=0.1 2023-10-02 08:53:13,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:53:15,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:53:17,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:53:18,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:53:18,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:53:18,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=821586.6666666666, ans=0.2 2023-10-02 08:53:19,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 08:53:19,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:53:20,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 08:53:21,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:53:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-10-02 08:53:21,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 08:53:28,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=821586.6666666666, ans=0.125 2023-10-02 08:53:28,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=821586.6666666666, ans=0.2 2023-10-02 08:53:30,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:53:31,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:53:31,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:53:34,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=821653.3333333334, ans=0.125 2023-10-02 08:53:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 08:53:35,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 08:53:35,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:53:36,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 08:53:39,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. 
Duration: 0.73 2023-10-02 08:53:41,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:53:45,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 08:53:46,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 08:53:46,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:53:46,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:53:48,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=821720.0, ans=0.125 2023-10-02 08:53:50,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:53:53,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 08:53:56,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 08:53:57,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 08:53:57,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:53:57,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 08:53:59,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 08:54:04,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:54:05,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:54:05,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:54:06,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:54:06,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:10,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:10,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 08:54:12,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:54:12,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 08:54:12,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 08:54:12,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:54:16,460 INFO [train.py:1046] (1/4) Epoch 24, batch 1100, loss[loss=0.1716, simple_loss=0.2591, pruned_loss=0.04198, over 24671.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2467, pruned_loss=0.04712, over 4709102.43 frames. ], batch size: 73, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:54:16,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:54:21,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:54:25,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:54:27,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:54:27,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:54:27,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 08:54:29,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:54:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:54:32,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=821920.0, ans=0.125 2023-10-02 08:54:34,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:54:34,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=821920.0, ans=0.0 2023-10-02 08:54:35,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:54:36,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 08:54:37,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 08:54:38,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:54:38,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:54:41,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:54:44,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:54:48,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:54:50,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=821986.6666666666, ans=0.125 2023-10-02 08:54:51,384 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.803e+02 1.981e+02 2.331e+02 3.257e+02, threshold=3.962e+02, percent-clipped=0.0 2023-10-02 08:54:51,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. 
Duration: 0.98 2023-10-02 08:54:51,534 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 08:54:51,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:53,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:54,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:54:54,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:54:55,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 08:54:57,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:54:57,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:54:57,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:54:58,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:58,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 08:55:02,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.73 vs. 
limit=15.0 2023-10-02 08:55:05,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:55:06,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 08:55:08,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:55:11,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=822053.3333333334, ans=0.125 2023-10-02 08:55:14,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:55:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 08:55:16,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:55:18,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:55:21,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:55:21,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:55:22,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 08:55:22,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 08:55:23,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:55:23,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 08:55:25,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:55:25,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 08:55:27,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:55:27,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:55:28,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:55:29,869 INFO [train.py:1046] (1/4) Epoch 24, batch 1150, loss[loss=0.175, simple_loss=0.2634, pruned_loss=0.04333, over 24452.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2474, pruned_loss=0.04698, over 4726277.73 frames. ], batch size: 69, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:55:33,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:34,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=822186.6666666666, ans=0.2 2023-10-02 08:55:35,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:55:37,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:55:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:55:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 08:55:37,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:55:40,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=822186.6666666666, ans=0.1 2023-10-02 08:55:42,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 08:55:42,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=822186.6666666666, ans=0.125 2023-10-02 08:55:43,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:43,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:55:50,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 08:55:52,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:55:54,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 08:55:56,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:55:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 08:55:56,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:55:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:56:00,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 08:56:02,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:56:03,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:56:14,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:56:20,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:56:20,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 08:56:21,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:21,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 08:56:25,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 08:56:28,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:36,035 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 08:56:40,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:56:40,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:56:40,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:56:41,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:56:43,608 INFO [train.py:1046] (1/4) Epoch 24, batch 1200, loss[loss=0.1418, simple_loss=0.2282, pruned_loss=0.02774, over 24598.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2476, pruned_loss=0.04661, over 4722583.60 frames. ], batch size: 60, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:56:44,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:56:49,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:56:49,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 08:56:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:56:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:56:52,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:56:53,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:56:55,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:56:56,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:56:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:57,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=822586.6666666666, ans=0.0 2023-10-02 08:56:59,659 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 08:57:02,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 08:57:07,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:57:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 08:57:11,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:57:13,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:57:13,900 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 08:57:13,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:57:19,062 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.901e+02 2.150e+02 2.548e+02 4.088e+02, threshold=4.300e+02, percent-clipped=1.0 2023-10-02 08:57:22,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:57:22,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:57:23,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 08:57:23,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:57:27,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 08:57:30,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 08:57:31,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:57:31,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:57:31,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:57:33,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:57:33,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:57:34,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:57:35,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:57:35,182 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:57:36,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 08:57:36,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:57:37,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:57:37,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 08:57:40,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 08:57:40,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:57:41,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.84 vs. limit=22.5 2023-10-02 08:57:43,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:57:46,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:57:49,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 08:57:54,548 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 08:57:55,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:57:57,174 INFO [train.py:1046] (1/4) Epoch 24, batch 1250, loss[loss=0.157, simple_loss=0.2376, pruned_loss=0.03822, over 24572.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2483, pruned_loss=0.04731, over 4717662.63 frames. ], batch size: 60, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:57:57,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:57:59,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.50 vs. limit=15.0 2023-10-02 08:58:00,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 08:58:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:58:01,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=822853.3333333334, ans=0.125 2023-10-02 08:58:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 08:58:06,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:58:08,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:10,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 08:58:11,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:58:13,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:58:17,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:58:17,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:19,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:58:19,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 08:58:22,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:58:25,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:58:27,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:58:27,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:58:28,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:58:28,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:31,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:34,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 08:58:37,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 08:58:38,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:58:40,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:58:41,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.85 vs. limit=15.0 2023-10-02 08:58:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 08:58:43,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:43,123 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 08:58:43,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:44,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:47,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:50,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:50,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:58:50,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 08:58:51,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 08:58:51,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. 
Duration: 0.78 2023-10-02 08:58:55,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:58:57,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 08:58:57,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:59,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 08:59:01,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:59:01,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=823120.0, ans=0.125 2023-10-02 08:59:02,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.69 vs. limit=15.0 2023-10-02 08:59:02,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 08:59:03,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:59:03,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:59:03,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 08:59:05,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 08:59:06,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 08:59:08,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:59:09,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:59:11,138 INFO [train.py:1046] (1/4) Epoch 24, batch 1300, loss[loss=0.2595, simple_loss=0.3152, pruned_loss=0.1019, over 19596.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2485, pruned_loss=0.04728, over 4717885.73 frames. ], batch size: 389, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:59:11,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:59:13,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:59:16,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:59:16,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 08:59:20,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:59:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:59:22,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 08:59:24,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:59:26,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:59:27,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 08:59:31,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:59:32,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:59:32,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 08:59:36,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:59:40,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:59:40,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:59:41,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:59:43,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:59:43,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 08:59:44,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:59:44,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 08:59:46,394 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.906e+02 2.045e+02 2.364e+02 3.578e+02, threshold=4.090e+02, percent-clipped=0.0 2023-10-02 08:59:50,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:59:50,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:59:52,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 08:59:52,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:59:54,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:59:54,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.24 vs. limit=15.0 2023-10-02 08:59:56,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:59:58,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. 
Duration: 0.96 2023-10-02 08:59:58,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:59:58,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 09:00:00,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:00:05,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:00:05,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:00:09,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 09:00:09,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 09:00:10,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 09:00:15,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:00:16,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 09:00:19,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:00:23,792 INFO [train.py:1046] (1/4) Epoch 24, batch 1350, loss[loss=0.1718, simple_loss=0.2614, pruned_loss=0.04111, over 24606.00 frames. 
], tot_loss[loss=0.1715, simple_loss=0.2476, pruned_loss=0.04769, over 4697636.30 frames. ], batch size: 68, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 09:00:25,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 09:00:30,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:00:30,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:00:33,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:00:34,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:00:34,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:00:36,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:00:42,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:00:43,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 09:00:45,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:00:45,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 09:00:45,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.09 vs. limit=22.5 2023-10-02 09:00:47,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 09:00:49,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:00:50,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:00:50,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 09:00:52,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 09:00:52,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 09:00:55,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:00:55,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 09:00:55,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=823653.3333333334, ans=0.125 2023-10-02 09:00:55,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=823653.3333333334, ans=0.1 2023-10-02 09:01:06,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:01:09,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=823720.0, ans=0.0 2023-10-02 09:01:15,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:01:15,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:15,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 09:01:16,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:17,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 09:01:17,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:01:19,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:01:21,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:01:25,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 09:01:27,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:01:30,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. 
Duration: 0.69 2023-10-02 09:01:32,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 09:01:37,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 09:01:38,750 INFO [train.py:1046] (1/4) Epoch 24, batch 1400, loss[loss=0.1736, simple_loss=0.2448, pruned_loss=0.05117, over 23477.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2456, pruned_loss=0.04743, over 4680600.75 frames. ], batch size: 120, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 09:01:39,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:43,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:01:43,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:01:48,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 09:01:50,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 09:01:52,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=823920.0, ans=0.125 2023-10-02 09:01:52,526 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.17 vs. limit=6.0 2023-10-02 09:01:55,379 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.17 vs. 
limit=22.5 2023-10-02 09:01:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:02:00,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:02:02,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:02:02,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:02:05,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:02:06,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 09:02:15,580 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.897e+02 2.132e+02 2.372e+02 2.905e+02, threshold=4.263e+02, percent-clipped=0.0 2023-10-02 09:02:15,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:15,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:18,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 09:02:20,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:02:21,139 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.65 vs. 
limit=15.0 2023-10-02 09:02:21,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:02:23,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:02:23,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:02:24,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:02:24,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:02:25,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:02:25,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 09:02:25,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:02:30,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:30,844 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:02:33,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 09:02:37,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=824120.0, ans=0.07 2023-10-02 09:02:41,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 09:02:42,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:02:44,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:02:45,180 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.86 vs. limit=15.0 2023-10-02 09:02:45,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 09:02:46,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:02:48,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:02:51,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:02:52,960 INFO [train.py:1046] (1/4) Epoch 24, batch 1450, loss[loss=0.146, simple_loss=0.2291, pruned_loss=0.03149, over 24476.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2458, pruned_loss=0.047, over 4701168.77 frames. ], batch size: 63, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 09:02:53,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:02:53,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:53,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 09:02:57,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:02:59,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:03:00,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:03:00,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 09:03:01,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:03:03,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 09:03:03,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:05,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:05,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 09:03:06,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:03:06,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:03:07,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 09:03:07,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:09,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:03:12,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:13,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:16,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:03:16,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:03:18,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:03:18,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:21,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 09:03:22,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:03:22,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:23,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:03:27,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 09:03:31,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:03:32,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=824320.0, ans=0.95 2023-10-02 09:03:34,181 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 09:03:35,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:03:36,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:03:38,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:03:40,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 09:03:44,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:03:46,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 09:03:46,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=824386.6666666666, ans=0.04949747468305833 2023-10-02 09:03:47,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 09:03:48,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:03:53,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:03:54,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:03:55,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 09:03:56,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=824453.3333333334, ans=0.07 2023-10-02 09:03:57,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 09:03:58,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 09:04:00,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:00,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 09:04:06,031 INFO [train.py:1046] (1/4) Epoch 24, batch 1500, loss[loss=0.1876, simple_loss=0.2574, pruned_loss=0.05895, over 22874.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2463, pruned_loss=0.04717, over 4713482.20 frames. ], batch size: 322, lr: 4.29e-03, grad_scale: 8.0 2023-10-02 09:04:07,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=824520.0, ans=0.1 2023-10-02 09:04:12,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 09:04:12,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:04:12,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:04:13,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:04:13,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:04:13,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=824520.0, ans=0.125 2023-10-02 09:04:13,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=824520.0, ans=0.125 2023-10-02 09:04:14,136 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.24 vs. limit=22.5 2023-10-02 09:04:16,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 09:04:16,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 09:04:18,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:04:18,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:04:18,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:04:19,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:04:20,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:04:22,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:04:26,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=824586.6666666666, ans=0.0 2023-10-02 09:04:29,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:04:29,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 09:04:30,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:04:30,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 09:04:32,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:33,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=824653.3333333334, ans=0.1 2023-10-02 09:04:34,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 09:04:38,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 09:04:39,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:04:40,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 09:04:42,143 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.830e+02 2.016e+02 2.339e+02 3.423e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 09:04:42,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:04:42,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=824653.3333333334, ans=0.125 2023-10-02 09:04:44,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:04:45,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:04:45,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:04:48,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 09:04:50,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:04:50,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:04:50,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 09:04:50,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:04:54,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:04:54,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 09:04:58,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:05:00,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:05:04,276 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 09:05:04,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:05:04,341 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 09:05:06,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:07,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:05:07,598 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 09:05:08,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:05:11,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 09:05:12,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:16,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:05:16,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:18,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:05:18,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 09:05:19,577 INFO [train.py:1046] (1/4) Epoch 24, batch 1550, loss[loss=0.1561, simple_loss=0.2415, pruned_loss=0.03531, over 24424.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2474, pruned_loss=0.04744, over 4722740.83 frames. ], batch size: 69, lr: 4.29e-03, grad_scale: 8.0 2023-10-02 09:05:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 09:05:19,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=824853.3333333334, ans=0.125 2023-10-02 09:05:20,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 09:05:21,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:05:22,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 09:05:22,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 09:05:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:05:26,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:28,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:05:28,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 09:05:29,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:29,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:32,626 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 09:05:32,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:32,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=824920.0, ans=0.125 2023-10-02 09:05:33,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:05:33,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:05:37,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:05:37,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 09:05:38,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=824920.0, ans=0.1 2023-10-02 09:05:39,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:05:39,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-02 09:05:41,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 09:05:41,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 09:05:41,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:42,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:05:44,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=824920.0, ans=0.2 2023-10-02 09:05:47,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:05:50,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 09:05:50,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 09:05:56,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:05:59,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:06:01,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:06:01,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 09:06:01,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 09:06:02,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-10-02 09:06:07,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:06:08,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:10,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:06:12,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:06:13,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:06:13,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 09:06:13,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:06:14,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:06:15,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:06:17,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 09:06:17,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 09:06:19,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.49 vs. limit=15.0 2023-10-02 09:06:19,640 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.23 vs. limit=15.0 2023-10-02 09:06:20,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:06:23,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 09:06:24,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=825120.0, ans=0.125 2023-10-02 09:06:28,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:06:29,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:30,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 09:06:32,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:06:32,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:06:32,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:06:32,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:06:33,670 INFO [train.py:1046] (1/4) Epoch 24, batch 1600, loss[loss=0.1839, simple_loss=0.2616, pruned_loss=0.05314, over 23499.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2486, pruned_loss=0.04755, over 4736487.99 frames. ], batch size: 94, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:06:35,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:06:38,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:06:39,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 09:06:40,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 09:06:42,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 09:06:43,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:06:45,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 09:06:45,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 09:06:46,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:06:53,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:06:56,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 09:06:59,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:07:00,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 09:07:00,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:01,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 09:07:06,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 09:07:11,393 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.850e+02 2.023e+02 2.271e+02 3.157e+02, threshold=4.047e+02, percent-clipped=0.0 2023-10-02 09:07:12,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:07:12,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 09:07:14,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:07:14,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:07:14,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:07:17,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 09:07:22,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:07:24,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-02 09:07:25,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:07:26,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:26,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:28,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:07:29,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:07:31,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 09:07:32,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:07:38,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=825453.3333333334, ans=0.09899494936611666 2023-10-02 09:07:39,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:39,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:07:40,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 09:07:40,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:07:42,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 09:07:48,252 INFO [train.py:1046] (1/4) Epoch 24, batch 1650, loss[loss=0.1681, simple_loss=0.2388, pruned_loss=0.04866, over 23663.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2489, pruned_loss=0.04745, over 4736586.35 frames. ], batch size: 232, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:07:48,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:07:49,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:07:51,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 09:07:51,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 09:07:51,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 09:07:51,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 09:07:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 09:07:57,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:57,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:07:57,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:07:59,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:07:59,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:08:01,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=825520.0, ans=0.0 2023-10-02 09:08:02,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 09:08:05,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:08:05,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:08:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:08:05,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:08:05,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 09:08:05,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 09:08:08,722 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.16 vs. limit=15.0 2023-10-02 09:08:11,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:08:15,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:08:22,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 09:08:24,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:25,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 09:08:28,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:08:29,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-10-02 09:08:32,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:08:32,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:08:33,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:08:33,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:08:34,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:37,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:08:37,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:37,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:08:38,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:08:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:08:40,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:08:40,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.34 vs. limit=15.0 2023-10-02 09:08:43,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:08:44,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 09:08:46,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:08:46,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 09:08:47,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 09:08:47,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 09:08:47,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:08:49,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:08:49,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:49,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:08:49,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 09:08:54,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:54,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=825786.6666666666, ans=0.125 2023-10-02 09:08:56,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:08:56,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:08:57,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 09:09:02,451 INFO [train.py:1046] (1/4) Epoch 24, batch 1700, loss[loss=0.1751, simple_loss=0.2379, pruned_loss=0.05612, over 23863.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2484, pruned_loss=0.04711, over 4737659.29 frames. ], batch size: 195, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:09:03,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:09:03,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:09:03,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 09:09:05,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 09:09:05,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:09:05,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:09:06,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:09:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:09:08,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 09:09:09,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:09:18,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:09:19,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:09:25,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:09:25,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:09:25,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:09:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:09:29,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 09:09:30,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=825986.6666666666, ans=0.125 2023-10-02 09:09:31,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:09:31,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:34,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:09:35,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:09:38,222 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.847e+02 2.054e+02 2.399e+02 3.587e+02, threshold=4.108e+02, percent-clipped=0.0 2023-10-02 09:09:38,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 09:09:38,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 09:09:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:41,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 09:09:41,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 09:09:48,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:09:50,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:09:50,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:09:52,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:09:52,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 09:09:52,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:09:56,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:56,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 09:09:58,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:09:58,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:09:58,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:58,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:10:00,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:10:00,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:10:01,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:01,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:10:03,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:05,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:07,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 09:10:08,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:11,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 09:10:15,716 INFO [train.py:1046] (1/4) Epoch 24, batch 1750, loss[loss=0.1605, simple_loss=0.2381, pruned_loss=0.04143, over 24499.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2473, pruned_loss=0.04669, over 4736208.36 frames. 
], batch size: 63, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:10:17,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:19,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:19,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:10:21,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 09:10:21,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:10:22,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:10:24,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:24,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=826186.6666666666, ans=0.125 2023-10-02 09:10:27,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 09:10:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:33,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. 
Duration: 0.64 2023-10-02 09:10:33,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:10:34,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:10:37,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:10:38,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 09:10:41,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:10:41,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 09:10:48,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=826320.0, ans=0.2 2023-10-02 09:10:49,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:10:52,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:10:52,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:55,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:55,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:10:56,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:58,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:01,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:11:01,988 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:11:02,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=826386.6666666666, ans=0.125 2023-10-02 09:11:03,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:04,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 09:11:05,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:11:07,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 09:11:08,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:11:09,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:11:11,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 09:11:12,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=826453.3333333334, ans=0.125 2023-10-02 09:11:14,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:11:15,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 09:11:15,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:18,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:11:20,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.37 vs. limit=22.5 2023-10-02 09:11:21,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:11:22,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:11:24,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:11:26,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 09:11:26,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 09:11:26,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:11:27,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:27,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:11:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:11:29,228 INFO [train.py:1046] (1/4) Epoch 24, batch 1800, loss[loss=0.1497, simple_loss=0.2012, pruned_loss=0.04916, over 19373.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.247, pruned_loss=0.047, over 4727105.99 frames. ], batch size: 388, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:11:29,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:11:30,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:11:32,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:33,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:11:36,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:11:40,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 09:11:40,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:11:42,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:11:43,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:43,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:11:46,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:11:46,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 09:11:48,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:51,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=826586.6666666666, ans=0.0 2023-10-02 09:11:52,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:55,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 09:11:59,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-10-02 09:11:59,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 09:11:59,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:00,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:12:00,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:12:04,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:12:06,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=826653.3333333334, ans=0.2 2023-10-02 09:12:08,722 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.863e+02 2.128e+02 2.344e+02 3.480e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-02 09:12:11,502 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 09:12:11,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:12:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:14,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 09:12:14,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. 
Duration: 0.86 2023-10-02 09:12:14,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:12:16,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:12:17,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:12:17,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=826720.0, ans=0.125 2023-10-02 09:12:20,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 09:12:27,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:12:29,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 09:12:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:12:29,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=826786.6666666666, ans=0.1 2023-10-02 09:12:31,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:31,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:12:32,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. 
Duration: 0.44 2023-10-02 09:12:34,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:12:34,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:12:34,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=826786.6666666666, ans=0.125 2023-10-02 09:12:36,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 09:12:36,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:39,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:12:39,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:12:39,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:42,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:42,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:12:44,920 INFO [train.py:1046] (1/4) Epoch 24, batch 1850, loss[loss=0.1509, simple_loss=0.227, pruned_loss=0.03742, over 24421.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2473, pruned_loss=0.04717, over 4709660.39 frames. 
], batch size: 58, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:12:45,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:12:45,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:12:45,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=826853.3333333334, ans=0.1 2023-10-02 09:12:47,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:12:47,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:12:52,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:12:54,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 09:12:56,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 09:12:59,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 09:13:02,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:04,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. 
Duration: 0.91 2023-10-02 09:13:04,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 09:13:09,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:13:12,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 09:13:15,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:13:15,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:13:18,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 09:13:18,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=826986.6666666666, ans=0.0 2023-10-02 09:13:19,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:19,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:13:21,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:13:23,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:13:26,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:13:29,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:13:30,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:32,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:13:32,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:32,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:13:33,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:13:36,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 09:13:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:13:40,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:13:42,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:13:42,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 09:13:42,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. 
Duration: 0.57 2023-10-02 09:13:44,781 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 09:13:44,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 09:13:47,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:13:47,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:13:47,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:13:47,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:48,870 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 09:13:48,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:13:50,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:52,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:13:52,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:13:53,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:13:53,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 09:13:55,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:55,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 09:13:55,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=827120.0, ans=0.2 2023-10-02 09:13:56,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:13:56,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:59,747 INFO [train.py:1046] (1/4) Epoch 24, batch 1900, loss[loss=0.1666, simple_loss=0.2499, pruned_loss=0.04162, over 24397.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2478, pruned_loss=0.04727, over 4716403.95 frames. ], batch size: 77, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:14:03,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:14:04,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:14:05,949 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 09:14:07,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-10-02 09:14:08,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:14:10,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:14:10,148 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 09:14:10,188 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 09:14:11,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=827186.6666666666, ans=0.0 2023-10-02 09:14:14,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 09:14:15,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:14:20,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 09:14:22,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 09:14:22,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=827253.3333333334, ans=0.2 2023-10-02 09:14:22,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=827253.3333333334, ans=0.125 2023-10-02 09:14:23,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.01 vs. 
limit=15.0 2023-10-02 09:14:27,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=827320.0, ans=0.125 2023-10-02 09:14:30,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 09:14:33,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 09:14:35,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:14:35,206 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 09:14:35,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 09:14:35,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 09:14:35,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 09:14:35,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:14:36,440 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.881e+02 2.012e+02 2.248e+02 2.968e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-02 09:14:36,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=827320.0, ans=0.125 2023-10-02 09:14:38,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=827320.0, ans=0.125 2023-10-02 09:14:40,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 09:14:41,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:14:44,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:14:44,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 09:14:45,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=827386.6666666666, ans=10.0 2023-10-02 09:14:47,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:14:49,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 09:14:50,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 09:14:56,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:14:56,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:14:56,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:14:56,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-02 09:14:57,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:14:59,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:14:59,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:15:01,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:15:03,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:15:03,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:15:05,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 09:15:05,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:15:07,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:15:09,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:15:12,549 INFO [train.py:1046] (1/4) Epoch 24, batch 1950, loss[loss=0.166, simple_loss=0.2481, pruned_loss=0.04198, over 24670.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2485, pruned_loss=0.0476, over 4720151.74 frames. ], batch size: 65, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:15:12,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:15:14,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:15:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:14,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:15:18,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 09:15:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 09:15:18,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:15:20,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:23,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:15:23,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:15:23,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:23,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=827520.0, ans=0.025 2023-10-02 09:15:26,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:15:29,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:15:29,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:15:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:15:31,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:34,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:37,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 09:15:37,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:15:37,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:15:37,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 09:15:39,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:15:39,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:15:39,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:44,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:46,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:15:49,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:15:52,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=827653.3333333334, ans=0.125 2023-10-02 09:15:54,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 09:15:54,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:15:55,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 09:15:55,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:15:58,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:16:00,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:16:00,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:16:04,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=827720.0, ans=0.0 2023-10-02 09:16:07,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=827720.0, ans=0.05 2023-10-02 09:16:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:09,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:11,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:14,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:16:15,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:16:17,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:16:17,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 09:16:17,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:16:18,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:16:19,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 09:16:22,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:16:25,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:16:26,780 INFO [train.py:1046] (1/4) Epoch 24, batch 2000, loss[loss=0.1677, simple_loss=0.2577, pruned_loss=0.03882, over 24633.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2488, pruned_loss=0.04744, over 4725438.28 frames. ], batch size: 68, lr: 4.29e-03, grad_scale: 32.0 2023-10-02 09:16:26,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:16:26,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 09:16:30,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:16:30,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:33,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 09:16:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:16:38,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:16:40,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 09:16:42,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:16:44,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:16:45,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:16:47,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 09:16:47,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=827920.0, ans=0.2 2023-10-02 09:16:49,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:16:51,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:51,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:53,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 09:16:53,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:16:55,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 09:16:55,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:16:58,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:16:58,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 09:16:58,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:59,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:01,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:17:01,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. 
Duration: 0.94 2023-10-02 09:17:04,255 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.913e+02 2.238e+02 2.677e+02 4.135e+02, threshold=4.476e+02, percent-clipped=1.0 2023-10-02 09:17:05,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 09:17:05,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:17:05,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:12,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:13,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:17:13,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:17:13,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:17:15,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:17,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:17,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:17:17,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:17:18,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:18,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=828053.3333333334, ans=0.125 2023-10-02 09:17:20,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:17:21,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 09:17:23,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=828053.3333333334, ans=0.0 2023-10-02 09:17:24,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:17:26,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:28,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:28,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:17:31,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=828120.0, ans=0.1 2023-10-02 09:17:33,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:17:35,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:17:35,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:37,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:17:37,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:17:38,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:39,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=828120.0, ans=0.1 2023-10-02 09:17:40,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:42,163 INFO [train.py:1046] (1/4) Epoch 24, batch 2050, loss[loss=0.1691, simple_loss=0.2431, pruned_loss=0.04751, over 23385.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2483, pruned_loss=0.04738, over 4728999.36 frames. ], batch size: 93, lr: 4.28e-03, grad_scale: 32.0 2023-10-02 09:17:43,077 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.01 vs. limit=15.0 2023-10-02 09:17:45,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:17:46,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:17:52,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:53,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:17:53,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:55,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:17:57,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 09:17:57,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:17:59,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:59,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:18:08,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:18:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:18:10,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 09:18:12,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:18:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 09:18:15,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:18:17,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=828320.0, ans=0.1 2023-10-02 09:18:18,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:18:20,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:21,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:18:21,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:18:24,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:18:25,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:18:25,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:18:28,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:29,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 09:18:31,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:18:32,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:18:35,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:18:39,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:18:41,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 09:18:46,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:18:47,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:18:48,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.66 vs. limit=10.0 2023-10-02 09:18:50,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:18:51,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 09:18:54,436 INFO [train.py:1046] (1/4) Epoch 24, batch 2100, loss[loss=0.1838, simple_loss=0.2574, pruned_loss=0.05515, over 23309.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2465, pruned_loss=0.0471, over 4723713.02 frames. 
], batch size: 93, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:18:55,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 09:18:55,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:18:57,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:57,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:18:57,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:18:58,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 09:18:58,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 09:18:58,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=828520.0, ans=0.0 2023-10-02 09:19:00,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.72 vs. limit=6.0 2023-10-02 09:19:01,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:19:02,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:19:04,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:19:04,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:07,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:19:07,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 09:19:08,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:19:08,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 09:19:08,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 09:19:10,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:10,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:19:10,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 09:19:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 09:19:17,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 09:19:17,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 09:19:17,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=828586.6666666666, ans=0.125 2023-10-02 09:19:21,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:19:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:19:25,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:19:25,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 09:19:25,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:25,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 09:19:27,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 09:19:28,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:28,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 09:19:28,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 09:19:29,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.83 vs. 
limit=22.5 2023-10-02 09:19:29,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 09:19:32,603 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.934e+02 2.200e+02 2.587e+02 4.169e+02, threshold=4.400e+02, percent-clipped=0.0 2023-10-02 09:19:32,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:19:34,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:19:35,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:19:35,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=828653.3333333334, ans=0.125 2023-10-02 09:19:36,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:19:38,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:40,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:40,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 09:19:40,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:41,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:19:42,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:42,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 09:19:45,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 09:19:45,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 09:19:49,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:19:52,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:19:52,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 09:19:52,822 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:19:54,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.90 vs. limit=22.5 2023-10-02 09:19:56,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:59,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:19:59,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:19:59,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:20:00,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 09:20:00,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:20:02,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:20:02,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:20:02,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=828786.6666666666, ans=0.125 2023-10-02 09:20:03,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:20:03,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:06,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 09:20:08,322 INFO [train.py:1046] (1/4) Epoch 24, batch 2150, loss[loss=0.1743, simple_loss=0.2427, pruned_loss=0.05295, over 23531.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2454, pruned_loss=0.0472, over 4712913.91 frames. ], batch size: 135, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:20:08,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. 
Duration: 0.67 2023-10-02 09:20:08,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:11,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:20:11,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:20:11,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:20:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:20:17,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 09:20:18,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:23,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:20:23,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:23,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:20:24,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:20:24,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:20:26,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:20:28,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:30,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 09:20:34,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:36,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:20:36,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=828986.6666666666, ans=0.1 2023-10-02 09:20:37,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:37,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:37,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:39,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 09:20:40,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:40,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:20:40,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:20:42,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 09:20:42,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:20:44,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:44,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:45,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:20:47,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:20:50,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:50,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=828986.6666666666, ans=0.0 2023-10-02 09:20:52,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 09:20:52,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:52,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 09:20:52,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:20:54,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:56,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:58,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:21:00,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:01,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:01,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 09:21:03,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 09:21:03,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 09:21:04,526 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 09:21:04,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:05,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:21:07,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 09:21:07,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:21:07,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 09:21:07,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 09:21:07,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 09:21:07,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 09:21:08,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:21:10,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 09:21:11,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:12,795 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.92 vs. limit=15.0 2023-10-02 09:21:13,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:21:14,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:14,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:21,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:21:23,013 INFO [train.py:1046] (1/4) Epoch 24, batch 2200, loss[loss=0.1908, simple_loss=0.2555, pruned_loss=0.0631, over 23424.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2455, pruned_loss=0.04673, over 4717851.72 frames. ], batch size: 285, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:21:23,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 09:21:27,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:21:28,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 09:21:29,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:21:31,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:21:32,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:34,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:21:34,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 09:21:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 09:21:39,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=829253.3333333334, ans=0.125 2023-10-02 09:21:40,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:21:44,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 09:21:47,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:48,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:21:48,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 09:21:52,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:21:52,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 09:21:56,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:21:57,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:59,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 09:22:01,832 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.810e+02 2.099e+02 2.472e+02 3.900e+02, threshold=4.197e+02, percent-clipped=0.0 2023-10-02 09:22:02,208 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:22:04,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:22:06,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:22:07,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:22:10,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:12,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-10-02 09:22:13,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:15,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 09:22:16,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:16,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:22:16,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:18,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:22:19,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:22:19,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:19,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:20,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:22:20,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:22:21,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=829453.3333333334, ans=0.125 2023-10-02 09:22:22,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:22:25,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:22:26,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:22:28,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:22:30,015 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 09:22:31,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:22:32,758 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 09:22:34,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:22:34,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 09:22:35,409 INFO [train.py:1046] (1/4) Epoch 24, batch 2250, loss[loss=0.1718, simple_loss=0.2614, pruned_loss=0.0411, over 24036.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2467, pruned_loss=0.04678, over 4716665.40 frames. ], batch size: 80, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:22:35,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:22:35,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:22:36,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:22:38,359 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 09:22:40,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:22:41,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=829520.0, ans=0.125 2023-10-02 09:22:42,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:22:43,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=829520.0, ans=0.1 2023-10-02 09:22:45,408 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:22:47,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:22:49,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:22:51,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:22:51,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 09:22:52,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:22:55,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 09:22:55,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:55,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:22:58,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 09:22:58,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=829586.6666666666, ans=0.125 2023-10-02 09:22:59,011 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.06 vs. limit=15.0 2023-10-02 09:22:59,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:23:01,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:23:02,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:23:08,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:23:08,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:23:08,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:23:09,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 09:23:11,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:23:14,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:23:17,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:23:18,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:23:19,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:23:19,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:23:23,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:23:23,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:23:23,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=829720.0, ans=0.0 2023-10-02 09:23:24,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=829720.0, ans=0.0 2023-10-02 09:23:27,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:23:29,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:23:31,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:23:32,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:23:33,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:23:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:23:40,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:23:40,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 09:23:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:40,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 09:23:42,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 09:23:44,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=829786.6666666666, ans=0.5 2023-10-02 09:23:46,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:23:46,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:49,572 INFO [train.py:1046] (1/4) Epoch 24, batch 2300, loss[loss=0.155, simple_loss=0.2389, pruned_loss=0.03552, over 24662.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.248, pruned_loss=0.04761, over 4706209.20 frames. ], batch size: 73, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:23:51,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:52,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:23:54,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 09:23:54,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:02,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:24:02,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 09:24:03,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:03,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:03,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 09:24:04,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:24:07,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:24:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:24:11,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:24:13,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:24:15,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:24:16,195 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.02 vs. 
limit=15.0 2023-10-02 09:24:20,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=829986.6666666666, ans=0.0 2023-10-02 09:24:20,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.21 vs. limit=6.0 2023-10-02 09:24:21,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:24:22,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:25,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:24:28,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:24:29,936 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.846e+02 2.063e+02 2.350e+02 3.320e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-02 09:24:31,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:24:32,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:24:32,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:24:32,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. 
Duration: 0.47 2023-10-02 09:24:35,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:24:35,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:35,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:24:35,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:24:37,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:24:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 09:24:37,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:24:39,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 09:24:39,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:24:39,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:40,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-10-02 09:24:43,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=830053.3333333334, ans=0.125 2023-10-02 09:24:46,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:24:48,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=830120.0, ans=0.125 2023-10-02 09:24:50,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:24:53,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:24:53,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:24:55,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:24:56,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:24:56,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:24:56,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:24:58,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-10-02 09:25:03,539 INFO [train.py:1046] (1/4) Epoch 24, batch 2350, loss[loss=0.1673, simple_loss=0.2532, pruned_loss=0.04068, over 24414.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2487, pruned_loss=0.0478, over 4709643.50 frames. ], batch size: 69, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:25:03,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:25:03,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 09:25:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 09:25:11,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=830186.6666666666, ans=0.125 2023-10-02 09:25:12,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:25:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:15,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:15,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:25:16,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:25:18,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-10-02 09:25:22,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:25:23,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=830253.3333333334, ans=0.125 2023-10-02 09:25:29,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 09:25:29,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:25:33,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:25:34,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:25:35,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:25:37,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 09:25:38,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:25:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:25:40,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:25:40,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:25:43,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:25:45,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.04 vs. limit=15.0 2023-10-02 09:25:47,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 09:25:47,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:25:50,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:50,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:25:52,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 09:25:53,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:25:54,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 09:25:54,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:26:01,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-10-02 09:26:05,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=830453.3333333334, ans=0.1 2023-10-02 09:26:06,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 09:26:08,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:26:08,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 09:26:08,339 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 09:26:08,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 09:26:08,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=830453.3333333334, ans=0.125 2023-10-02 09:26:11,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 09:26:14,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:26:17,299 INFO [train.py:1046] (1/4) Epoch 24, batch 2400, loss[loss=0.1837, simple_loss=0.2591, pruned_loss=0.05415, over 23226.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.248, pruned_loss=0.0476, over 4703500.38 frames. ], batch size: 105, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:26:17,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 09:26:20,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:26:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:26:23,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 09:26:23,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 09:26:28,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:26:28,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:26:30,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 09:26:30,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:26:32,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:32,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 09:26:38,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:41,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-02 09:26:46,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:26:50,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 09:26:51,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:26:53,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:56,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:26:57,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.801e+02 2.069e+02 2.462e+02 3.779e+02, threshold=4.137e+02, percent-clipped=0.0 2023-10-02 09:26:57,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 09:26:57,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:27:05,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:06,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:27:08,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:08,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 09:27:10,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:27:10,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:27:10,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:11,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:27:11,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:27:12,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=830720.0, ans=0.125 2023-10-02 09:27:15,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:27:17,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:27:17,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 09:27:18,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 09:27:20,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:27:20,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:27:21,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 09:27:21,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 09:27:21,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 09:27:21,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 09:27:21,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 09:27:23,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:27:25,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:25,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:27:25,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=830786.6666666666, ans=0.125 2023-10-02 09:27:26,940 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 09:27:28,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:28,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 09:27:31,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:27:31,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:27:32,531 INFO [train.py:1046] (1/4) Epoch 24, batch 2450, loss[loss=0.1757, simple_loss=0.2639, pruned_loss=0.0438, over 24381.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.246, pruned_loss=0.0473, over 4689902.60 frames. ], batch size: 77, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:27:35,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:35,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:27:37,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 09:27:40,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:27:40,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:43,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:27:45,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:27:45,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 09:27:45,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 09:27:49,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:50,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:27:50,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:27:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:27:55,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:27:56,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:27:56,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:59,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 09:27:59,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:28:08,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:10,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:28:11,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:12,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:28:12,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:14,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:28:15,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 09:28:19,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:28:19,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:28:21,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:28:21,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:22,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=831053.3333333334, ans=0.1 2023-10-02 09:28:25,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:28:26,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. 
Duration: 0.92 2023-10-02 09:28:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:28:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:28:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 09:28:29,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:28:29,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:28:32,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:28:35,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:35,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:28:41,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 09:28:41,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:28:46,891 INFO [train.py:1046] (1/4) Epoch 24, batch 2500, loss[loss=0.1654, simple_loss=0.2521, pruned_loss=0.03938, over 24575.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2459, pruned_loss=0.04688, over 4704006.33 frames. 
], batch size: 71, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:28:48,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:28:55,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:28:57,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:59,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:28:59,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 09:29:04,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:29:04,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:29:06,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:29:06,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:29:06,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 09:29:07,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:29:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:29:10,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 09:29:10,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:12,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 09:29:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:16,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:29:18,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:29:20,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:29:21,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 09:29:22,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:29:25,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:29:25,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=831320.0, ans=0.125 2023-10-02 09:29:28,278 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.796e+02 1.940e+02 2.141e+02 3.270e+02, threshold=3.880e+02, percent-clipped=0.0 2023-10-02 09:29:29,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:33,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:36,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:29:40,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:29:41,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 09:29:42,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:29:42,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:29:44,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:29:44,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:29:47,484 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. 
Duration: 0.896 2023-10-02 09:29:47,484 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 09:29:47,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 09:29:50,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:53,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 09:29:53,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 09:29:53,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:29:53,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 09:29:57,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 09:30:00,297 INFO [train.py:1046] (1/4) Epoch 24, batch 2550, loss[loss=0.188, simple_loss=0.274, pruned_loss=0.05099, over 24568.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2466, pruned_loss=0.04694, over 4714291.67 frames. ], batch size: 71, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:30:00,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:30:01,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 09:30:01,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:30:05,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:30:06,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.99 vs. limit=15.0 2023-10-02 09:30:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 09:30:06,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:30:10,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 09:30:10,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:30:13,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:16,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:30:16,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 09:30:18,000 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=15.0 2023-10-02 09:30:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 09:30:18,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:30:18,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=831586.6666666666, ans=0.125 2023-10-02 09:30:20,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:30:21,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:30:21,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 09:30:23,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:30:23,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:23,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 09:30:34,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:30:39,041 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:30:40,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:30:40,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:30:40,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:30:41,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:30:46,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:30:49,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:30:49,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:30:49,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:30:51,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:30:51,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:30:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:30:55,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:30:56,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=831720.0, ans=0.125 2023-10-02 09:30:58,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:30:58,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 09:30:58,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:30:58,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:31:00,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:31:01,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=831786.6666666666, ans=0.125 2023-10-02 09:31:02,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:31:04,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:05,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=831786.6666666666, ans=0.0 2023-10-02 09:31:10,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:31:10,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=831786.6666666666, ans=0.125 2023-10-02 09:31:11,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:13,117 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 09:31:14,776 INFO [train.py:1046] (1/4) Epoch 24, batch 2600, loss[loss=0.1816, simple_loss=0.2544, pruned_loss=0.05444, over 23556.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2479, pruned_loss=0.04757, over 4706862.70 frames. ], batch size: 256, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:31:16,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 09:31:16,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:31:16,263 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 09:31:18,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 09:31:19,330 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 09:31:20,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:31:20,874 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 09:31:22,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. 
Duration: 0.97 2023-10-02 09:31:24,071 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 09:31:24,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=831853.3333333334, ans=0.04949747468305833 2023-10-02 09:31:24,973 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.75 vs. limit=10.0 2023-10-02 09:31:26,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:31:28,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 09:31:29,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 09:31:30,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:31:32,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 09:31:33,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 09:31:33,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 09:31:33,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=831920.0, ans=0.125 2023-10-02 09:31:42,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:31:42,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:42,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:31:42,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 09:31:45,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:31:50,615 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:31:51,625 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 09:31:55,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:55,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:31:56,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 09:31:58,175 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.850e+02 2.063e+02 2.376e+02 4.529e+02, threshold=4.127e+02, percent-clipped=3.0 2023-10-02 09:31:58,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:31:58,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:31:59,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 09:32:01,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:32:02,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:32:03,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:06,621 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 09:32:06,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:32:12,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:32:13,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:32:14,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 09:32:15,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:32:17,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:32:18,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:32:24,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 09:32:24,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:24,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=832120.0, ans=0.07 2023-10-02 09:32:26,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=832120.0, ans=0.125 2023-10-02 09:32:27,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:32:29,176 INFO [train.py:1046] (1/4) Epoch 24, batch 2650, loss[loss=0.1828, simple_loss=0.2517, pruned_loss=0.05701, over 23729.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2485, pruned_loss=0.04803, over 4709492.77 frames. ], batch size: 232, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:32:30,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 09:32:30,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:32,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=832186.6666666666, ans=0.125 2023-10-02 09:32:33,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 09:32:33,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=832186.6666666666, ans=0.2 2023-10-02 09:32:34,660 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 09:32:34,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:32:38,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:40,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:32:41,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:32:42,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:44,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 09:32:44,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:32:44,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:32:47,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 09:32:49,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-10-02 09:32:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:32:54,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 09:32:55,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:32:55,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 09:32:58,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=832320.0, ans=0.125 2023-10-02 09:33:01,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:01,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:33:01,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:01,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:07,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 09:33:07,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 09:33:08,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 09:33:11,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 09:33:11,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:14,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:14,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:33:14,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:33:15,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:33:17,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:33:18,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:33:20,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:33:21,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:33:22,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 09:33:22,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=832386.6666666666, ans=0.0 2023-10-02 09:33:23,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:25,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:33:25,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:26,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:33:28,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:33:30,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:30,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:33:30,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:32,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 09:33:35,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:33:36,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:33:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:38,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:38,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:33:39,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:39,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=832453.3333333334, ans=0.125 2023-10-02 09:33:42,196 INFO [train.py:1046] (1/4) Epoch 24, batch 2700, loss[loss=0.2251, simple_loss=0.292, pruned_loss=0.07911, over 19486.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2487, pruned_loss=0.04759, over 4716826.98 frames. ], batch size: 389, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:33:42,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:33:42,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 09:33:45,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:33:46,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=832520.0, ans=0.125 2023-10-02 09:33:48,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-02 09:33:50,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:50,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:50,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:33:51,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:51,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:33:52,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:33:52,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 09:33:52,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:33:54,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:33:56,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:33:57,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:33:58,170 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.56 vs. limit=10.0 2023-10-02 09:34:00,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:34:02,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 09:34:02,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:34:06,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:34:06,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:11,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:34:11,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:34:11,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:34:13,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:34:17,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:34:18,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:34:18,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:34:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:34:19,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=832653.3333333334, ans=0.04949747468305833 2023-10-02 09:34:23,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:23,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:34:24,654 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.849e+02 2.017e+02 2.176e+02 3.250e+02, threshold=4.034e+02, percent-clipped=0.0 2023-10-02 09:34:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:34:31,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:34:35,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:34:35,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 09:34:36,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:39,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:34:41,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:41,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:34:43,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:34:43,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:45,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:49,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 09:34:49,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 09:34:51,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:34:51,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 09:34:53,479 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:34:54,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 09:34:54,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:56,394 INFO [train.py:1046] (1/4) Epoch 24, batch 2750, loss[loss=0.1948, simple_loss=0.2638, pruned_loss=0.06292, over 23783.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2483, pruned_loss=0.04741, over 4720998.20 frames. ], batch size: 195, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:34:56,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:34:56,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:59,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:59,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:35:01,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:35:02,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:03,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:35:03,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:35:03,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:03,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 09:35:03,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:35:03,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:35:10,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 09:35:12,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:35:13,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:13,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:35:13,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 09:35:15,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:35:15,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:35:16,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:16,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:19,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=832920.0, ans=0.125 2023-10-02 09:35:20,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:35:20,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:35:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:35:22,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:24,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:35:28,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:31,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 09:35:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:35,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.51 vs. limit=15.0 2023-10-02 09:35:35,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:35,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:35:35,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:35:42,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:35:44,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:35:44,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 09:35:44,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=833053.3333333334, ans=0.0 2023-10-02 09:35:48,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:50,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 09:35:54,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 09:35:57,061 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:35:58,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:35:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 09:35:59,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:36:02,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:36:02,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 09:36:03,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:36:07,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 09:36:07,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:09,153 INFO [train.py:1046] (1/4) Epoch 24, batch 2800, loss[loss=0.1596, simple_loss=0.2182, pruned_loss=0.05049, over 23421.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2464, pruned_loss=0.04672, over 4704831.89 frames. ], batch size: 285, lr: 4.27e-03, grad_scale: 16.0 2023-10-02 09:36:09,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:36:09,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 09:36:09,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:09,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:12,070 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 09:36:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 09:36:16,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:17,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:36:17,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:36:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:36:23,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 09:36:25,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. 
Duration: 0.488875 2023-10-02 09:36:26,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 09:36:28,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:28,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:36:30,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:36:34,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:36:34,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:34,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:36:34,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:36:43,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:36:44,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:47,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:47,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:36:49,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:36:52,041 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.978e+02 2.120e+02 2.446e+02 3.436e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-02 09:36:52,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:36:52,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 09:36:52,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=833386.6666666666, ans=0.0 2023-10-02 09:36:53,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:54,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:36:54,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:36:59,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:59,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:04,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 09:37:06,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:37:06,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:06,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:37:06,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:37:08,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:37:08,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:37:08,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 09:37:08,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:09,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:37:09,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:11,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 09:37:12,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:37:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:37:12,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:37:16,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 09:37:20,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=833453.3333333334, ans=0.125 2023-10-02 09:37:21,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:37:21,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:37:21,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=833520.0, ans=0.0 2023-10-02 09:37:22,928 INFO [train.py:1046] (1/4) Epoch 24, batch 2850, loss[loss=0.1658, simple_loss=0.2553, pruned_loss=0.03816, over 24461.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2464, pruned_loss=0.04612, over 4721200.45 frames. ], batch size: 69, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:37:22,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:37:23,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:37:27,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 09:37:27,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:37:28,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:37:28,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=833520.0, ans=0.0 2023-10-02 09:37:29,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:31,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:32,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:37:34,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 09:37:38,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=833586.6666666666, ans=0.125 2023-10-02 09:37:39,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.00 vs. limit=15.0 2023-10-02 09:37:39,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 09:37:39,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:37:41,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. 
Duration: 0.95 2023-10-02 09:37:41,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:43,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.25 vs. limit=15.0 2023-10-02 09:37:43,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 09:37:45,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 09:37:46,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:58,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:37:58,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:38:01,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:38:01,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:38:01,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:38:04,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 09:38:04,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 09:38:06,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:38:07,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:38:08,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:38:08,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:11,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:11,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:12,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:14,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:38:15,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:38:15,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:17,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:38:20,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:38:23,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:38:25,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 09:38:25,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 09:38:28,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:38:28,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:38:28,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 09:38:30,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:38:30,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:38:32,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:38:32,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:38:32,482 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. 
Duration: 0.896 2023-10-02 09:38:32,528 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 09:38:32,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:38:33,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:37,055 INFO [train.py:1046] (1/4) Epoch 24, batch 2900, loss[loss=0.1644, simple_loss=0.2401, pruned_loss=0.04428, over 23640.00 frames. ], tot_loss[loss=0.169, simple_loss=0.246, pruned_loss=0.04597, over 4715683.66 frames. ], batch size: 149, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:38:37,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:38:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:38:38,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:38:39,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 09:38:40,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=833853.3333333334, ans=0.125 2023-10-02 09:38:41,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=833853.3333333334, ans=0.035 2023-10-02 09:38:44,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 09:38:44,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 09:38:45,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 09:38:46,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:38:46,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:38:49,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:49,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:38:54,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:38:54,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:57,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:38:58,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 09:38:58,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:39:01,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:39:03,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 09:39:04,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 09:39:07,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:39:07,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 09:39:07,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:39:09,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:39:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:39:09,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=833986.6666666666, ans=0.125 2023-10-02 09:39:12,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:39:13,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:16,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:39:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:39:20,639 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.897e+02 2.159e+02 2.476e+02 3.741e+02, threshold=4.318e+02, percent-clipped=0.0 2023-10-02 09:39:22,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 09:39:22,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 09:39:22,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:39:25,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=834053.3333333334, ans=0.125 2023-10-02 09:39:26,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:39:28,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 09:39:29,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:39:34,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:43,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:39:44,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:39:46,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-10-02 09:39:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:47,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 09:39:47,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:39:49,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:39:50,570 INFO [train.py:1046] (1/4) Epoch 24, batch 2950, loss[loss=0.1686, simple_loss=0.2529, pruned_loss=0.04211, over 23419.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2475, pruned_loss=0.04655, over 4721747.00 frames. ], batch size: 93, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:39:53,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:39:55,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=834186.6666666666, ans=0.07 2023-10-02 09:39:56,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 09:39:56,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:39:56,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:59,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:40:00,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:40:00,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 09:40:02,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 09:40:02,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:40:02,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:40:04,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=834253.3333333334, ans=0.125 2023-10-02 09:40:05,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=834253.3333333334, ans=0.0 2023-10-02 09:40:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:40:12,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:40:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:40:14,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:40:16,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 09:40:16,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:40:17,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:40:18,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:40:18,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:40:21,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=834320.0, ans=0.125 2023-10-02 09:40:23,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 09:40:27,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 09:40:27,815 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 09:40:29,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:40:30,608 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 09:40:30,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 09:40:30,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:40:30,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=834320.0, ans=0.0 2023-10-02 09:40:32,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:40:32,542 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 09:40:32,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:40:35,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-10-02 09:40:36,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 09:40:36,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:40:38,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:40:39,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:41,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:40:42,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:42,591 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-10-02 09:40:42,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:42,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 09:40:49,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:51,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:40:52,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 09:40:52,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:40:52,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 09:40:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:40:56,608 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.44 vs. limit=15.0 2023-10-02 09:40:57,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:40:58,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:40:58,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:40:58,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:41:00,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:41:01,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:01,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:41:01,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:41:02,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:41:04,749 INFO [train.py:1046] (1/4) Epoch 24, batch 3000, loss[loss=0.1699, simple_loss=0.2413, pruned_loss=0.04925, over 23632.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2481, pruned_loss=0.04705, over 4718125.65 frames. ], batch size: 149, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:41:04,750 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 09:41:17,335 INFO [train.py:1078] (1/4) Epoch 24, validation: loss=0.349, simple_loss=0.2892, pruned_loss=0.2044, over 1125622.00 frames. 2023-10-02 09:41:17,336 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 09:41:17,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:41:19,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 09:41:19,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 09:41:20,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:22,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:41:24,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:41:25,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 09:41:25,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 09:41:29,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:41:29,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:41:29,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 09:41:31,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:41:37,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 09:41:42,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=834586.6666666666, ans=0.125 2023-10-02 09:41:44,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:41:45,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.37 vs. limit=22.5 2023-10-02 09:41:53,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 09:41:53,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:41:55,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:41:55,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:41:56,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:41:57,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:41:57,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 09:42:00,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 09:42:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:42:03,110 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.840e+02 2.041e+02 2.384e+02 3.232e+02, threshold=4.082e+02, percent-clipped=0.0 2023-10-02 09:42:03,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:42:05,136 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.99 vs. limit=12.0 2023-10-02 09:42:05,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:42:07,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:42:07,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:07,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:42:10,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:42:11,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:42:11,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:42:13,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:42:16,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. 
Duration: 0.71 2023-10-02 09:42:16,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:42:16,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:17,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:42:21,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:23,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:23,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 09:42:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 09:42:24,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:42:24,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 09:42:24,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:42:27,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 09:42:29,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-10-02 09:42:31,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:42:31,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 09:42:31,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 09:42:31,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:42:32,483 INFO [train.py:1046] (1/4) Epoch 24, batch 3050, loss[loss=0.1565, simple_loss=0.2359, pruned_loss=0.03857, over 23535.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2496, pruned_loss=0.04763, over 4712350.68 frames. ], batch size: 134, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:42:32,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:42:34,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:34,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:42:34,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:35,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:42:35,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=834853.3333333334, ans=0.125 2023-10-02 09:42:38,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 09:42:41,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:42:43,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:42:43,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:42:43,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=834853.3333333334, ans=0.1 2023-10-02 09:42:47,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:47,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.71 vs. limit=15.0 2023-10-02 09:42:50,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 09:42:50,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=834920.0, ans=0.025 2023-10-02 09:42:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-02 09:42:56,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 09:42:56,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:42:59,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:43:03,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:03,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:43:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:04,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=834986.6666666666, ans=0.125 2023-10-02 09:43:06,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:43:07,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=834986.6666666666, ans=0.125 2023-10-02 09:43:08,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:43:08,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:08,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:43:08,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:09,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:11,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:14,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:15,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 09:43:15,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:43:18,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:43:19,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:43:20,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:43:21,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:26,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:43:26,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:34,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:34,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:43:34,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:35,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:43:37,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:43:37,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:43:38,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 09:43:39,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:43:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:41,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 09:43:42,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 09:43:47,137 INFO [train.py:1046] (1/4) Epoch 24, batch 3100, loss[loss=0.1592, simple_loss=0.2452, pruned_loss=0.03657, over 24665.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2493, pruned_loss=0.04804, over 4706356.39 frames. ], batch size: 73, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:43:47,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:48,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:43:49,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=835186.6666666666, ans=0.1 2023-10-02 09:43:49,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.36 vs. limit=15.0 2023-10-02 09:43:51,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:43:52,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 09:43:56,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 09:43:56,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 09:43:58,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:44:00,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:44:02,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:04,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 09:44:04,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=835253.3333333334, ans=0.125 2023-10-02 09:44:08,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:12,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 09:44:16,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:44:16,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:18,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:44:18,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:44:19,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 09:44:21,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:44:21,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. 
Duration: 0.62 2023-10-02 09:44:21,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:44:21,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=835320.0, ans=0.0 2023-10-02 09:44:22,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:24,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 09:44:27,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:44:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:44:30,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 09:44:31,465 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.869e+02 2.084e+02 2.364e+02 3.517e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 09:44:31,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 09:44:33,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:33,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:36,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 09:44:36,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:36,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:44:37,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:44:37,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:44:39,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:44:39,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:44:39,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:39,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 09:44:40,009 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=15.0 2023-10-02 09:44:44,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:44:45,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. 
Duration: 0.78 2023-10-02 09:44:48,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:44:48,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 09:44:50,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:44:50,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:50,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 09:45:00,580 INFO [train.py:1046] (1/4) Epoch 24, batch 3150, loss[loss=0.1507, simple_loss=0.2244, pruned_loss=0.03853, over 24310.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2479, pruned_loss=0.04763, over 4709165.47 frames. ], batch size: 56, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:45:00,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 09:45:04,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:05,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:45:05,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. limit=6.0 2023-10-02 09:45:06,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 09:45:06,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:45:06,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 09:45:08,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:08,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 09:45:08,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 09:45:09,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=835520.0, ans=0.0 2023-10-02 09:45:10,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:13,674 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 09:45:15,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 09:45:15,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:45:15,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=835586.6666666666, ans=0.125 2023-10-02 09:45:16,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.91 vs. 
limit=15.0 2023-10-02 09:45:16,488 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 09:45:17,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 09:45:19,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 09:45:19,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 09:45:19,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 09:45:19,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:19,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:45:21,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:22,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=835586.6666666666, ans=0.1 2023-10-02 09:45:23,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 09:45:24,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=835586.6666666666, ans=0.0 2023-10-02 09:45:25,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 09:45:25,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:27,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:45:28,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:45:28,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=835653.3333333334, ans=0.025 2023-10-02 09:45:32,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 09:45:33,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:45:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:45:37,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:45:37,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 09:45:40,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 09:45:41,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:45:41,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 09:45:42,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:45:42,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:45:44,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:45:44,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=835720.0, ans=0.125 2023-10-02 09:45:45,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:45:45,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:45:45,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 09:45:45,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:45:46,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:45:48,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:45:48,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:45:48,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. 
Duration: 0.77 2023-10-02 09:45:48,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:45:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 09:45:49,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:45:51,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 09:45:53,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 09:45:54,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:45:54,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:45:56,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 09:45:57,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 09:45:58,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:46:00,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=835786.6666666666, ans=0.1 2023-10-02 09:46:01,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:46:03,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:03,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:46:07,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:46:07,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:10,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 09:46:14,236 INFO [train.py:1046] (1/4) Epoch 24, batch 3200, loss[loss=0.1781, simple_loss=0.2617, pruned_loss=0.04727, over 24079.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2475, pruned_loss=0.04733, over 4705198.98 frames. ], batch size: 80, lr: 4.27e-03, grad_scale: 16.0 2023-10-02 09:46:14,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:46:14,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:46:18,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:18,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:46:18,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. 
Duration: 0.85 2023-10-02 09:46:23,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:46:26,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:46:29,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:38,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:46:46,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 09:46:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:46:52,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 09:46:54,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:46:57,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:46:57,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:46:57,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:46:58,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.82 vs. 
limit=22.5 2023-10-02 09:46:58,938 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.959e+02 2.375e+02 3.040e+02 4.854e+02, threshold=4.749e+02, percent-clipped=3.0 2023-10-02 09:47:01,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 09:47:02,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=836053.3333333334, ans=0.0 2023-10-02 09:47:04,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 09:47:05,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 09:47:08,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 09:47:09,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:47:17,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:17,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:47:17,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:17,723 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 09:47:17,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 09:47:20,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=836120.0, ans=0.125 2023-10-02 09:47:20,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=836120.0, ans=15.0 2023-10-02 09:47:21,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:47:22,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=836120.0, ans=0.125 2023-10-02 09:47:23,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 09:47:23,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 09:47:25,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 09:47:26,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 09:47:28,241 INFO [train.py:1046] (1/4) Epoch 24, batch 3250, loss[loss=0.1585, simple_loss=0.2473, pruned_loss=0.03481, over 24649.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.247, pruned_loss=0.04668, over 4715627.17 frames. ], batch size: 68, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:47:29,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:47:30,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 09:47:30,917 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 09:47:30,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:47:30,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:33,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 09:47:33,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=836186.6666666666, ans=0.125 2023-10-02 09:47:37,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:47:40,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:47:41,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=836253.3333333334, ans=0.125 2023-10-02 09:47:47,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:47:47,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 09:47:47,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=836253.3333333334, ans=0.07 2023-10-02 09:47:48,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:47:48,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:48,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:47:51,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:47:51,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:47:52,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:52,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:47:54,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:47:54,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:54,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:54,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:47:57,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:47:59,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 09:48:01,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:48:01,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:48:03,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:48:03,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:48:03,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:48:04,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.33 vs. limit=15.0 2023-10-02 09:48:09,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 09:48:09,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:48:09,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:48:10,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:10,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 09:48:13,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=836386.6666666666, ans=0.1 2023-10-02 09:48:15,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:48:20,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:48:20,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:20,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 09:48:20,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:48:20,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:48:20,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:25,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 09:48:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 09:48:26,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:48:26,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:48:28,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:48:29,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 09:48:29,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:48:33,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:48:33,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:48:33,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=836453.3333333334, ans=0.125 2023-10-02 09:48:34,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 09:48:34,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:48:36,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=836453.3333333334, ans=0.125 2023-10-02 09:48:38,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=836453.3333333334, ans=0.125 2023-10-02 09:48:39,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 09:48:39,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 09:48:41,740 INFO [train.py:1046] (1/4) Epoch 24, batch 3300, loss[loss=0.1583, simple_loss=0.2339, pruned_loss=0.04128, over 23496.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2475, pruned_loss=0.04679, over 4716957.05 frames. ], batch size: 134, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:48:41,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:48:41,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 09:48:43,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 09:48:44,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 09:48:44,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:44,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=836520.0, ans=0.2 2023-10-02 09:48:48,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:48:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:48:48,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:48:50,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:48:52,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:48:54,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:48:55,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.83 vs. limit=22.5 2023-10-02 09:48:57,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:49:01,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 09:49:02,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:02,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:03,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=836586.6666666666, ans=0.2 2023-10-02 09:49:04,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:04,425 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 09:49:05,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:49:05,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:49:07,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:49:07,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:07,137 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 09:49:11,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:49:11,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:49:13,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:13,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 09:49:13,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 09:49:13,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:14,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:49:17,538 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. 
Duration: 0.768 2023-10-02 09:49:18,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 09:49:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:49:21,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 09:49:26,018 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.884e+02 2.119e+02 2.530e+02 4.061e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 09:49:26,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:49:27,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:49:27,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:49:30,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:32,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:32,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:49:32,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:49:34,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 09:49:34,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:35,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:49:35,766 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 09:49:35,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 09:49:38,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:49:38,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:49:38,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:40,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:42,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:49:42,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:42,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 09:49:42,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:43,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=836786.6666666666, ans=0.125 2023-10-02 09:49:44,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:49:47,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 09:49:47,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:48,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:50,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:49:50,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:49:51,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:54,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:54,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:55,969 INFO [train.py:1046] (1/4) Epoch 24, batch 3350, loss[loss=0.1873, simple_loss=0.2651, pruned_loss=0.05472, over 23928.00 frames. 
], tot_loss[loss=0.1713, simple_loss=0.2485, pruned_loss=0.04707, over 4718427.85 frames. ], batch size: 86, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:49:58,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:50:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:02,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:50:06,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:08,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:50:09,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:50:09,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:50:10,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 09:50:12,266 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 09:50:14,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:50:16,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-02 09:50:16,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 09:50:18,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:50:18,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:50:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:19,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 09:50:19,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:19,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:50:19,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=836920.0, ans=0.125 2023-10-02 09:50:21,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:22,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:22,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:23,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 09:50:27,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:29,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:30,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:34,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:50:36,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:37,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:39,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:42,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:42,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=837053.3333333334, ans=0.125 2023-10-02 09:50:43,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 09:50:43,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 09:50:43,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 09:50:43,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:50:45,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 09:50:45,250 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:50:46,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:49,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:53,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:54,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 09:50:54,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:50:55,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:50:57,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:51:04,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:51:06,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 09:51:06,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:51:08,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:51:09,644 INFO [train.py:1046] (1/4) Epoch 24, batch 3400, loss[loss=0.1606, simple_loss=0.2488, pruned_loss=0.03616, over 24644.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.249, pruned_loss=0.04711, over 4731709.90 frames. ], batch size: 73, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:51:09,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:11,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 09:51:11,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:51:12,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 09:51:12,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:51:13,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:51:13,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 09:51:16,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:51:16,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 09:51:19,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 09:51:19,800 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 09:51:19,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:51:22,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:51:23,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:25,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:51:26,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=837253.3333333334, ans=0.015 2023-10-02 09:51:29,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:51:31,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. 
Duration: 0.93 2023-10-02 09:51:31,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=837253.3333333334, ans=0.0 2023-10-02 09:51:35,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:51:37,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:37,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:37,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=837320.0, ans=0.125 2023-10-02 09:51:40,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:51:45,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:51:47,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 09:51:53,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.845e+02 2.030e+02 2.228e+02 3.234e+02, threshold=4.061e+02, percent-clipped=0.0 2023-10-02 09:51:53,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:53,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:54,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-02 09:51:54,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:51:56,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:56,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:51:56,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:51:59,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:52:02,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=837386.6666666666, ans=0.125 2023-10-02 09:52:03,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:52:03,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:52:07,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:52:11,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 09:52:15,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 09:52:20,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=837453.3333333334, ans=0.09899494936611666 2023-10-02 09:52:21,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 09:52:22,562 INFO [train.py:1046] (1/4) Epoch 24, batch 3450, loss[loss=0.1616, simple_loss=0.2304, pruned_loss=0.04636, over 23821.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2493, pruned_loss=0.04762, over 4723119.53 frames. ], batch size: 195, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:52:23,578 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-10-02 09:52:25,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 09:52:25,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:52:28,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:52:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 09:52:29,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:52:32,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:52:37,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 09:52:37,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:52:38,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:52:38,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:52:41,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:52:46,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 09:52:48,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=837586.6666666666, ans=0.1 2023-10-02 09:52:52,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 09:52:52,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:52:52,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:52:55,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:52:58,685 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.89 vs. limit=15.0 2023-10-02 09:52:59,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. 
Duration: 0.99 2023-10-02 09:53:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:53:05,412 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.18 vs. limit=22.5 2023-10-02 09:53:05,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:53:07,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:53:08,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:53:08,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:53:11,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 09:53:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:53:11,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:53:13,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.15 vs. limit=10.0 2023-10-02 09:53:14,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:53:17,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 09:53:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:53:26,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:53:27,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:30,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:32,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=837786.6666666666, ans=0.2 2023-10-02 09:53:33,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:33,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:53:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:53:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:53:36,705 INFO [train.py:1046] (1/4) Epoch 24, batch 3500, loss[loss=0.1762, simple_loss=0.2557, pruned_loss=0.04841, over 23327.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2468, pruned_loss=0.04711, over 4703266.92 frames. 
], batch size: 93, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:53:39,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:42,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:53:42,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 09:53:44,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:53:46,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 09:53:49,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:49,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 09:53:54,706 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.51 vs. limit=15.0 2023-10-02 09:53:55,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:53:55,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:53:56,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 09:53:56,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:53:58,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:53:58,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:58,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:53:59,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 09:54:03,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:03,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:54:03,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:54:08,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:10,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 09:54:10,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 09:54:12,307 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:54:13,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:54:14,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.81 vs. limit=6.0 2023-10-02 09:54:14,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:54:16,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:16,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=837986.6666666666, ans=0.2 2023-10-02 09:54:17,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:54:18,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:54:20,198 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.879e+02 2.062e+02 2.374e+02 3.315e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 09:54:20,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 09:54:21,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 09:54:21,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-10-02 09:54:21,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:54:23,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:23,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:54:23,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:54:27,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:54:29,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:54:33,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:54:33,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 09:54:35,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 09:54:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:54:37,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:54:39,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 09:54:40,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:42,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 09:54:43,027 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.84 vs. limit=5.0 2023-10-02 09:54:44,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:54:45,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:54:46,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 09:54:48,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 09:54:49,507 INFO [train.py:1046] (1/4) Epoch 24, batch 3550, loss[loss=0.1676, simple_loss=0.2301, pruned_loss=0.05249, over 22809.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2457, pruned_loss=0.04674, over 4692278.50 frames. ], batch size: 322, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:54:50,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:50,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:54:50,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:54:52,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:54:53,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:55:04,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 09:55:09,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:55:10,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:55:12,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:12,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:55:13,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:55:14,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:55:15,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:55:16,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:55:16,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:55:16,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:55:21,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:55:22,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:55:22,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:55:22,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:23,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:55:24,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 09:55:24,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:24,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=838320.0, ans=0.125 2023-10-02 09:55:26,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:26,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-02 09:55:30,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:55:31,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=838320.0, ans=0.0 2023-10-02 09:55:33,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:55:33,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:55:35,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.77 vs. limit=15.0 2023-10-02 09:55:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 09:55:36,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=838386.6666666666, ans=0.07 2023-10-02 09:55:37,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:55:37,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 09:55:39,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:55:42,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:55:43,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 09:55:44,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 09:55:46,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:55:51,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:55:52,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.78 vs. limit=10.0 2023-10-02 09:55:53,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 09:55:54,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:55:57,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:59,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 09:56:04,261 INFO [train.py:1046] (1/4) Epoch 24, batch 3600, loss[loss=0.1659, simple_loss=0.2551, pruned_loss=0.03834, over 24249.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2458, pruned_loss=0.04624, over 4709648.76 frames. ], batch size: 74, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:56:04,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 09:56:04,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 09:56:06,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:56:06,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=838520.0, ans=0.125 2023-10-02 09:56:07,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:56:07,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:56:09,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:56:11,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=12.0 2023-10-02 09:56:13,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:56:16,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:16,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:56:17,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.44 vs. limit=15.0 2023-10-02 09:56:17,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 09:56:17,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:17,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 09:56:20,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:56:22,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:24,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:56:26,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:56:27,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:56:29,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:56:29,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 09:56:29,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:56:32,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:34,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 09:56:36,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:56:36,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=838653.3333333334, ans=0.125 2023-10-02 09:56:39,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:56:39,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:56:40,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 09:56:46,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:56:46,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=838653.3333333334, ans=0.0 2023-10-02 09:56:47,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:56:49,066 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 1.867e+02 2.105e+02 2.448e+02 3.419e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-02 09:56:49,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 09:56:53,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 09:56:57,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:00,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:04,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=838786.6666666666, ans=0.125 2023-10-02 09:57:05,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:57:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:57:05,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 09:57:07,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 09:57:08,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 09:57:09,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:57:10,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:57:11,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 09:57:13,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 09:57:13,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:57:13,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:57:14,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 09:57:15,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 09:57:17,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:18,585 INFO [train.py:1046] (1/4) Epoch 24, batch 3650, loss[loss=0.1756, simple_loss=0.2613, pruned_loss=0.04493, over 24625.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2469, pruned_loss=0.04729, over 4690082.55 frames. ], batch size: 68, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:57:18,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 09:57:22,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 09:57:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:57:26,452 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.88 vs. limit=22.5 2023-10-02 09:57:27,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. 
Duration: 0.61 2023-10-02 09:57:30,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 09:57:33,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:57:33,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:57:34,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:57:36,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:57:36,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:57:36,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 09:57:37,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:57:38,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:57:38,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 09:57:40,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:57:40,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 09:57:40,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:57:42,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:57:45,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 09:57:45,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 09:57:45,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:57:48,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 09:57:50,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:57:50,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:57:54,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:57:55,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:57:56,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:57:56,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 09:57:58,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:58:00,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:58:04,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:58:04,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:04,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:58:08,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:58:09,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:58:09,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:58:16,831 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 09:58:19,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:58:19,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:58:20,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 09:58:22,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:23,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:58:25,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:25,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 09:58:25,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:27,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=839120.0, ans=0.0 2023-10-02 09:58:28,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=839120.0, ans=0.125 2023-10-02 09:58:28,598 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.61 vs. limit=15.0 2023-10-02 09:58:29,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:58:30,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:58:31,841 INFO [train.py:1046] (1/4) Epoch 24, batch 3700, loss[loss=0.1855, simple_loss=0.2547, pruned_loss=0.05815, over 23357.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2481, pruned_loss=0.04754, over 4698209.98 frames. 
], batch size: 119, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:58:33,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:58:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 09:58:35,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:36,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:58:36,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:58:41,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:58:44,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:58:45,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:58:45,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:58:45,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 09:58:46,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:58:48,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:58:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 09:58:56,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:58:56,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:58:58,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:58:58,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 09:58:58,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:59:02,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:03,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 09:59:05,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:06,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 09:59:11,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:11,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:59:13,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 09:59:16,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:59:16,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 09:59:17,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.801e+02 2.013e+02 2.164e+02 3.512e+02, threshold=4.027e+02, percent-clipped=0.0 2023-10-02 09:59:17,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:59:18,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 09:59:23,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:59:23,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 09:59:23,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=839386.6666666666, ans=0.125 2023-10-02 09:59:25,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:59:25,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 09:59:27,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:59:28,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:59:28,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:59:28,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:59:31,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:59:32,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 09:59:32,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 09:59:34,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:59:34,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 09:59:35,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:59:37,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:59:40,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:40,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:59:40,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=839453.3333333334, ans=0.125 2023-10-02 09:59:41,224 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=15.0 2023-10-02 09:59:42,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:59:45,148 INFO [train.py:1046] (1/4) Epoch 24, batch 3750, loss[loss=0.1489, simple_loss=0.2311, pruned_loss=0.03335, over 24457.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2484, pruned_loss=0.04743, over 4707041.81 frames. ], batch size: 63, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:59:45,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 09:59:46,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 09:59:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 09:59:49,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 09:59:49,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=839520.0, ans=0.0 2023-10-02 09:59:50,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:59:50,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:52,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:52,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:59:56,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:00:00,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:00:00,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:00:01,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:00:05,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:00:06,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. 
Duration: 0.96 2023-10-02 10:00:07,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:00:09,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:00:09,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:00:14,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 10:00:17,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 10:00:19,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:00:19,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:00:22,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:00:25,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:00:25,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 10:00:29,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. 
Duration: 0.98 2023-10-02 10:00:32,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=839720.0, ans=0.125 2023-10-02 10:00:33,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:00:34,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=839720.0, ans=0.2 2023-10-02 10:00:37,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:00:37,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:00:41,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:00:43,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=839786.6666666666, ans=0.2 2023-10-02 10:00:45,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 10:00:47,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:00:49,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:00:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:00:53,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 10:00:57,804 INFO [train.py:1046] (1/4) Epoch 24, batch 3800, loss[loss=0.1844, simple_loss=0.2387, pruned_loss=0.06507, over 19732.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2479, pruned_loss=0.04712, over 4707713.52 frames. ], batch size: 388, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:01:01,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:01:02,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=839853.3333333334, ans=0.2 2023-10-02 10:01:04,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:06,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 10:01:06,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=839853.3333333334, ans=0.2 2023-10-02 10:01:07,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 10:01:08,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:01:10,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:12,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 10:01:14,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=839920.0, ans=0.1 2023-10-02 10:01:15,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:01:15,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:15,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:01:17,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:01:18,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:01:18,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:20,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 10:01:20,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=839920.0, ans=0.0 2023-10-02 10:01:21,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:01:21,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:01:23,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=839920.0, ans=0.05 2023-10-02 10:01:24,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:26,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:01:27,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:01:28,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:01:30,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:31,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:33,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:36,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 10:01:36,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 10:01:37,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:01:42,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:01:44,021 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.979e+02 2.395e+02 2.879e+02 4.810e+02, threshold=4.790e+02, percent-clipped=5.0 2023-10-02 10:01:48,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:01:50,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 10:01:51,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=840053.3333333334, ans=0.1 2023-10-02 10:01:52,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 10:01:54,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:01:55,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:56,375 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.17 vs. limit=15.0 2023-10-02 10:01:56,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 10:01:59,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 10:01:59,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. 
Duration: 0.65 2023-10-02 10:01:59,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:01,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:02:07,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:02:08,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:02:11,994 INFO [train.py:1046] (1/4) Epoch 24, batch 3850, loss[loss=0.1567, simple_loss=0.2366, pruned_loss=0.03834, over 24512.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2469, pruned_loss=0.04703, over 4706480.42 frames. ], batch size: 63, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:02:13,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:02:13,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 10:02:14,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:02:16,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:18,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:02:21,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:02:23,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:02:23,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 10:02:29,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:30,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:32,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:02:32,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=840253.3333333334, ans=0.0 2023-10-02 10:02:33,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:02:36,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:38,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:02:38,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:02:38,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:02:40,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:02:40,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=840320.0, ans=0.125 2023-10-02 10:02:43,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:02:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:44,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:02:45,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 10:02:45,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 10:02:46,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:02:46,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:47,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:02:47,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:49,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 10:02:52,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. 
Duration: 0.67 2023-10-02 10:02:53,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:02:54,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 10:02:56,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 10:02:57,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=840386.6666666666, ans=0.2 2023-10-02 10:03:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:01,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:03:06,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:06,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 10:03:09,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 10:03:13,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:13,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:16,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 10:03:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:03:17,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:18,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:18,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:03:18,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 10:03:20,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:03:21,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 10:03:21,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:21,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:24,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:03:24,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:25,751 INFO [train.py:1046] (1/4) Epoch 24, batch 3900, loss[loss=0.1773, simple_loss=0.2564, pruned_loss=0.04913, over 23450.00 frames. 
], tot_loss[loss=0.1693, simple_loss=0.2457, pruned_loss=0.04649, over 4695490.44 frames. ], batch size: 93, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:03:25,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:03:27,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:27,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:27,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:03:27,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 10:03:28,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:32,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:03:32,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:03:32,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:03:35,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:03:35,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=840520.0, ans=0.125 2023-10-02 10:03:36,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:03:38,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:39,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:03:41,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 10:03:41,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:03:42,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 10:03:44,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:44,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 10:03:46,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 10:03:50,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=840586.6666666666, ans=0.125 2023-10-02 10:03:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 10:03:52,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.59 vs. limit=22.5 2023-10-02 10:03:53,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:03:53,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:03:53,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:03:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:03:57,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:03:57,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=840653.3333333334, ans=0.1 2023-10-02 10:04:00,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:04:00,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:04:00,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:04:05,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:04:05,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:04:09,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.982e+02 2.222e+02 2.624e+02 4.261e+02, threshold=4.444e+02, percent-clipped=0.0 2023-10-02 10:04:11,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.58 vs. limit=15.0 2023-10-02 10:04:14,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:04:16,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:04:25,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:04:28,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:04:29,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 10:04:29,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 10:04:29,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:04:30,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. 
Duration: 0.84 2023-10-02 10:04:32,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:04:32,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 10:04:38,034 INFO [train.py:1046] (1/4) Epoch 24, batch 3950, loss[loss=0.1769, simple_loss=0.2654, pruned_loss=0.04418, over 24384.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2456, pruned_loss=0.04619, over 4705935.82 frames. ], batch size: 69, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:04:40,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:04:42,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 10:04:44,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:04:45,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=840853.3333333334, ans=0.125 2023-10-02 10:04:46,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:04:46,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:04:53,181 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 10:04:53,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 10:04:53,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 10:04:55,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 10:04:55,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:04:57,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:04:57,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:04:57,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:05:00,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 10:05:03,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:05:04,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:05:04,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:05:04,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:05:06,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 10:05:09,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=840986.6666666666, ans=0.125 2023-10-02 10:05:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:05:16,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:05:21,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 10:05:27,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 10:05:27,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 10:05:27,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:05:29,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:05:36,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:05:36,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:05:36,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:05:36,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 10:05:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 10:05:41,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:05:42,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:05:46,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 10:05:51,370 INFO [train.py:1046] (1/4) Epoch 24, batch 4000, loss[loss=0.171, simple_loss=0.2606, pruned_loss=0.04071, over 24566.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2465, pruned_loss=0.04651, over 4710494.55 frames. ], batch size: 71, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:05:57,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:01,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=841186.6666666666, ans=0.0 2023-10-02 10:06:02,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:06:02,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=841186.6666666666, ans=0.1 2023-10-02 10:06:03,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=841186.6666666666, ans=0.125 2023-10-02 10:06:05,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=841253.3333333334, ans=0.2 2023-10-02 10:06:08,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:08,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:06:09,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 10:06:09,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:06:11,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 10:06:11,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:06:11,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-10-02 10:06:11,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=841253.3333333334, ans=0.125 2023-10-02 10:06:13,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:17,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:06:17,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:06:17,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:06:17,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:06:17,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:06:19,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:06:21,206 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 10:06:22,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:06:22,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:25,532 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-10-02 10:06:27,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:06:27,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:06:27,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=841320.0, ans=0.0 2023-10-02 10:06:32,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 10:06:32,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:06:35,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:06:35,405 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 10:06:36,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:06:38,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.830e+02 2.096e+02 2.397e+02 3.466e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-02 10:06:38,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 10:06:38,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:06:39,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:06:39,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:06:41,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:06:42,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:06:42,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:06:43,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 10:06:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:46,910 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 10:06:52,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:06:54,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 10:06:54,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:06:56,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:57,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:06:57,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:03,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:07:04,745 INFO [train.py:1046] (1/4) Epoch 24, batch 4050, loss[loss=0.2217, simple_loss=0.2785, pruned_loss=0.08244, over 19497.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2474, pruned_loss=0.04696, over 4709320.78 frames. ], batch size: 388, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:07:04,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:07:06,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 10:07:07,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:07:07,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:08,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:07:10,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:07:11,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:07:14,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 10:07:18,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:07:19,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 10:07:21,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:07:21,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:07:24,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:25,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:07:28,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 10:07:31,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 10:07:31,640 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 10:07:34,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:07:39,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 10:07:40,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:07:40,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=841653.3333333334, ans=0.04949747468305833 2023-10-02 10:07:44,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:45,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:46,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:07:47,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:48,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=841720.0, ans=0.125 2023-10-02 10:07:51,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:07:52,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=841720.0, ans=0.125 2023-10-02 10:07:55,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 10:07:55,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:07:57,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:07:58,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 10:08:02,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:08:08,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 10:08:09,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:08:09,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:08:12,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 10:08:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 10:08:12,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:15,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:08:16,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:16,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:08:18,182 INFO [train.py:1046] (1/4) Epoch 24, batch 4100, loss[loss=0.1523, simple_loss=0.2289, pruned_loss=0.03785, over 24615.00 frames. 
], tot_loss[loss=0.1716, simple_loss=0.2486, pruned_loss=0.04728, over 4708574.90 frames. ], batch size: 60, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:08:23,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 10:08:24,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 10:08:25,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 10:08:27,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 10:08:27,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:28,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:28,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:29,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:08:29,961 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 10:08:31,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:08:32,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 10:08:32,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:34,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:08:38,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=841920.0, ans=0.125 2023-10-02 10:08:39,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:08:39,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:08:41,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:08:41,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 10:08:42,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:42,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:08:42,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:08:42,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:08:44,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-10-02 10:08:47,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:08:48,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 10:08:50,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:08:52,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:08:52,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 10:08:53,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:08:53,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:08:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:08:55,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 10:08:57,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:08:57,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:08:59,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. 
Duration: 0.77 2023-10-02 10:09:00,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:09:02,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:09:03,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:09:06,741 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.905e+02 2.091e+02 2.337e+02 3.438e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 10:09:07,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=842053.3333333334, ans=0.2 2023-10-02 10:09:08,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=842053.3333333334, ans=0.1 2023-10-02 10:09:09,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:10,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:09:12,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:09:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:09:22,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 10:09:24,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:09:24,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:09:30,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:09:30,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:09:31,401 INFO [train.py:1046] (1/4) Epoch 24, batch 4150, loss[loss=0.1725, simple_loss=0.2414, pruned_loss=0.05183, over 23772.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2485, pruned_loss=0.04709, over 4712355.21 frames. ], batch size: 164, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:09:31,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:09:31,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:09:32,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 10:09:34,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:34,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 10:09:36,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. 
Duration: 0.9 2023-10-02 10:09:36,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 10:09:37,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:41,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:09:41,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:09:41,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=842186.6666666666, ans=0.0 2023-10-02 10:09:44,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:09:46,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:09:48,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:09:50,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:09:50,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:09:51,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 10:09:51,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=842253.3333333334, ans=0.1 2023-10-02 10:09:55,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:10:00,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:10:02,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 10:10:02,662 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:10:05,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 10:10:05,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:10:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 10:10:06,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:10:06,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:10:09,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:10,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:10:11,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=842320.0, ans=0.0 2023-10-02 10:10:11,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=842320.0, ans=10.0 2023-10-02 10:10:14,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 10:10:17,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:10:21,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:10:21,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 10:10:23,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:10:23,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.15 vs. limit=15.0 2023-10-02 10:10:24,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 10:10:26,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:10:26,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:10:27,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:10:29,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 10:10:29,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:10:29,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:10:30,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:10:33,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 10:10:33,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:34,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:10:34,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:10:34,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 10:10:34,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:10:36,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:10:36,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:10:39,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 10:10:40,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:10:45,064 INFO [train.py:1046] (1/4) Epoch 24, batch 4200, loss[loss=0.1686, simple_loss=0.2183, pruned_loss=0.05951, over 19530.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2469, pruned_loss=0.047, over 4703488.89 frames. ], batch size: 389, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:10:45,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:10:47,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 10:10:49,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:10:51,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:10:51,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=842520.0, ans=0.09899494936611666 2023-10-02 10:10:52,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:10:54,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:10:54,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:10:56,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 10:10:57,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 10:10:58,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:01,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:11:01,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=842586.6666666666, ans=0.2 2023-10-02 10:11:04,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:11:04,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=842586.6666666666, ans=0.0 2023-10-02 10:11:07,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:11:08,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:11:09,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:09,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. 
Duration: 0.81 2023-10-02 10:11:09,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:11:11,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:11,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=842586.6666666666, ans=0.0 2023-10-02 10:11:12,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:11:12,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:11:12,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=842653.3333333334, ans=0.125 2023-10-02 10:11:13,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:11:15,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 10:11:17,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:21,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 10:11:21,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:11:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 10:11:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:11:27,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:11:27,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 10:11:27,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:11:28,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:11:30,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=842720.0, ans=0.125 2023-10-02 10:11:34,515 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.872e+02 2.085e+02 2.351e+02 3.667e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-02 10:11:34,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:11:36,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:11:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:11:43,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 10:11:46,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:11:50,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:11:52,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:11:54,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 10:11:59,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.47 vs. limit=15.0 2023-10-02 10:12:00,370 INFO [train.py:1046] (1/4) Epoch 24, batch 4250, loss[loss=0.1504, simple_loss=0.2225, pruned_loss=0.03915, over 23583.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2454, pruned_loss=0.04658, over 4696879.71 frames. ], batch size: 135, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:12:00,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:12:04,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:12:04,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:12:07,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:09,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=842853.3333333334, ans=0.125 2023-10-02 10:12:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 10:12:11,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 10:12:11,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:12:14,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:17,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:12:23,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:23,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:24,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:12:24,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:12:27,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:27,503 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:12:29,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:30,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 10:12:33,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:12:33,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:12:33,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=842986.6666666666, ans=0.125 2023-10-02 10:12:34,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 10:12:37,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 10:12:37,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:38,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=842986.6666666666, ans=0.1 2023-10-02 10:12:39,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:12:39,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:40,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:12:40,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:12:41,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:43,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:12:44,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:12:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:12:52,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:12:52,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 10:12:52,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:12:54,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 10:12:55,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:12:55,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:12:57,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:12:59,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 10:13:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:13:03,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:13:06,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:13:06,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=843120.0, ans=0.5 2023-10-02 10:13:10,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:13:10,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:13:10,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=843120.0, ans=0.0 2023-10-02 10:13:11,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:13:12,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:13:14,173 INFO [train.py:1046] (1/4) Epoch 24, batch 4300, loss[loss=0.1853, simple_loss=0.2773, pruned_loss=0.04664, over 24523.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2446, pruned_loss=0.04647, over 4693757.03 frames. 
], batch size: 71, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:13:14,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:13:14,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:13:14,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 10:13:15,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:13:16,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=843186.6666666666, ans=0.125 2023-10-02 10:13:20,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:13:20,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:13:26,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:13:28,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=843253.3333333334, ans=0.0 2023-10-02 10:13:34,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:13:34,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. 
Duration: 0.56 2023-10-02 10:13:34,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:13:37,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:13:37,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:13:38,438 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 10:13:41,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:13:42,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:13:44,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 10:13:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:13:44,699 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.90 vs. limit=15.0 2023-10-02 10:13:45,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 10:13:46,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:13:49,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 10:13:53,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:13:53,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:13:54,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:13:56,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:13:56,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:13:56,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 10:13:58,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 10:13:59,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=843386.6666666666, ans=0.2 2023-10-02 10:14:00,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:14:02,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:02,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:14:02,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:14:02,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:14:03,758 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.778e+02 1.987e+02 2.231e+02 3.215e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 10:14:03,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 10:14:03,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 10:14:03,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 10:14:05,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:14:05,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 10:14:06,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 10:14:09,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:14:10,838 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 10:14:10,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:14:12,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:14:12,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:14:12,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=843453.3333333334, ans=0.125 2023-10-02 10:14:14,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 10:14:14,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:14:16,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:16,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:14:16,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:14:17,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:14:22,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:14:23,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:25,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:25,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 10:14:28,396 INFO [train.py:1046] (1/4) Epoch 24, batch 4350, loss[loss=0.1795, simple_loss=0.2499, pruned_loss=0.05449, over 23486.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2461, pruned_loss=0.04678, over 4695028.61 frames. ], batch size: 285, lr: 4.25e-03, grad_scale: 4.0 2023-10-02 10:14:31,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 10:14:32,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:14:37,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:14:38,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:40,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:14:40,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:14:45,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:14:48,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:48,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=843586.6666666666, ans=0.0 2023-10-02 10:14:49,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 10:14:49,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:14:52,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:14:53,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:14:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:15:01,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 10:15:01,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:03,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:05,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.92 vs. limit=10.0 2023-10-02 10:15:07,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:08,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 10:15:11,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:12,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 10:15:16,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.90 vs. limit=15.0 2023-10-02 10:15:17,149 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 10:15:18,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:15:19,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:15:20,370 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 10:15:20,431 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 10:15:20,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:15:21,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:21,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:15:23,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:15:23,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:15:23,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:15:25,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 10:15:25,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:25,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:27,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:27,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 10:15:27,966 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 10:15:27,972 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 10:15:29,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 10:15:32,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:15:32,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:15:33,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:15:34,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:15:36,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 10:15:38,121 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 10:15:39,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:41,928 INFO [train.py:1046] (1/4) Epoch 24, batch 4400, loss[loss=0.1519, simple_loss=0.2321, pruned_loss=0.03581, over 24631.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2471, pruned_loss=0.04732, over 4701749.71 frames. ], batch size: 60, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:15:42,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:15:42,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:43,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:44,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 10:15:44,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 10:15:46,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 10:15:46,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. 
Duration: 0.896 2023-10-02 10:15:48,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:15:48,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:15:51,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 10:15:54,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:55,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:55,562 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 10:15:59,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:15:59,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 10:16:00,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 10:16:03,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 10:16:03,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=843920.0, ans=0.2 2023-10-02 10:16:04,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. 
Duration: 0.99 2023-10-02 10:16:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 10:16:05,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:07,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:16:07,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:16:07,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:16:08,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 10:16:08,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 10:16:09,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:16:12,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:16:12,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:16:14,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:16:15,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 10:16:16,936 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 10:16:19,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:21,475 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.00 vs. limit=12.0 2023-10-02 10:16:26,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:16:29,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 10:16:30,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:16:32,290 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.852e+02 2.093e+02 2.430e+02 3.908e+02, threshold=4.185e+02, percent-clipped=0.0 2023-10-02 10:16:33,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:16:33,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=844053.3333333334, ans=0.035 2023-10-02 10:16:34,097 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:16:37,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 10:16:38,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 10:16:38,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:16:38,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:16:38,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:16:38,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:16:42,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 10:16:46,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 10:16:46,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 10:16:46,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:16:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 10:16:47,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:16:52,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 10:16:55,150 INFO [train.py:1046] (1/4) Epoch 24, batch 4450, loss[loss=0.1738, simple_loss=0.2513, pruned_loss=0.04809, over 24473.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2478, pruned_loss=0.04756, over 4719530.81 frames. ], batch size: 63, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:16:55,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 10:17:00,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:17:00,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=844186.6666666666, ans=0.0 2023-10-02 10:17:01,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:02,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:17:07,961 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=15.0 2023-10-02 10:17:10,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:10,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:17:14,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:16,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 10:17:17,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:17:17,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:17:19,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 10:17:19,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:17:20,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:17:20,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:17:23,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:17:23,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=844320.0, ans=0.0 2023-10-02 10:17:25,328 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.77 vs. limit=22.5 2023-10-02 10:17:27,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:29,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 10:17:30,707 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.52 vs. limit=6.0 2023-10-02 10:17:31,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:17:31,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:17:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:17:35,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 10:17:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 10:17:38,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 10:17:38,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:17:40,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:41,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 10:17:42,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.13 vs. 
limit=22.5 2023-10-02 10:17:44,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:17:47,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:48,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 10:17:48,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:17:48,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:17:48,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:51,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:54,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:17:54,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 10:17:57,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:17:59,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:17:59,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=844453.3333333334, ans=0.125 2023-10-02 10:18:01,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:18:02,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:18:02,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:18:05,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:18:06,357 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.98 vs. limit=15.0 2023-10-02 10:18:08,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=844520.0, ans=0.125 2023-10-02 10:18:09,742 INFO [train.py:1046] (1/4) Epoch 24, batch 4500, loss[loss=0.1479, simple_loss=0.2243, pruned_loss=0.03576, over 24311.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.248, pruned_loss=0.04759, over 4723509.09 frames. ], batch size: 56, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:18:09,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 10:18:11,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:18:15,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 10:18:17,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 10:18:17,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 10:18:18,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:18:23,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:18:23,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:18:25,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:18:25,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:18:25,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:18:26,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:18:36,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:18:37,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=844586.6666666666, ans=0.1 2023-10-02 10:18:38,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:18:40,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:18:42,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:18:43,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:18:48,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:18:51,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:18:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:18:56,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:18:58,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 10:18:59,246 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.850e+02 2.033e+02 2.343e+02 3.798e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-02 10:18:59,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:00,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:19:02,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:03,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:19:04,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:19:04,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 10:19:04,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:19:04,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:06,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.63 vs. limit=22.5 2023-10-02 10:19:09,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:19:09,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:19:12,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:15,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 10:19:15,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:19:18,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 10:19:18,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 10:19:18,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 10:19:21,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 10:19:22,500 INFO [train.py:1046] (1/4) Epoch 24, batch 4550, loss[loss=0.1691, simple_loss=0.2567, pruned_loss=0.04078, over 24643.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2465, pruned_loss=0.04671, over 4724439.67 frames. ], batch size: 68, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:19:23,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 10:19:25,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:19:28,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:19:29,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:19:33,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:19:36,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:19:37,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:40,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:19:40,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:19:40,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:42,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:19:42,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=844920.0, ans=0.125 2023-10-02 10:19:44,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:19:46,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:19:46,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=844920.0, ans=0.125 2023-10-02 10:19:47,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 10:19:49,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-10-02 10:19:50,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:19:51,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 10:19:54,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 10:19:54,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:19:57,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 10:19:59,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:20:02,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:02,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:02,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:20:03,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 10:20:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:20:08,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:20:08,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:20:09,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:20:10,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 10:20:11,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 10:20:11,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:20:12,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 10:20:15,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 10:20:15,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:20:15,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=845053.3333333334, ans=0.125 2023-10-02 10:20:18,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:18,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:20:18,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:20:19,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:20:19,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:20:21,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 10:20:23,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:20:23,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 10:20:23,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 10:20:23,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:20:23,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 10:20:27,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:20:27,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:20:30,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:20:30,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:20:30,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:20:32,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:20:34,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:20:36,070 INFO [train.py:1046] (1/4) Epoch 24, batch 4600, loss[loss=0.1702, simple_loss=0.2591, pruned_loss=0.04063, over 24344.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2451, pruned_loss=0.04611, over 4724008.03 frames. ], batch size: 77, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:20:37,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:38,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:20:41,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:20:41,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:20:42,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:20:44,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 10:20:44,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 10:20:47,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:20:49,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:20:53,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:53,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=845253.3333333334, ans=0.1 2023-10-02 10:20:59,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 10:20:59,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:02,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:03,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:21:03,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:21:11,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 10:21:11,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:21:11,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:21:14,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=845320.0, ans=0.125 2023-10-02 10:21:14,384 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:21:15,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=845320.0, ans=0.0 2023-10-02 10:21:16,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:18,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:21:19,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:21:22,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 10:21:23,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:21:25,026 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.808e+02 1.985e+02 2.322e+02 3.064e+02, threshold=3.969e+02, percent-clipped=0.0 2023-10-02 10:21:29,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:29,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:21:32,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=845386.6666666666, ans=0.125 2023-10-02 10:21:33,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:33,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 10:21:33,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:35,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 10:21:35,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:36,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:36,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:37,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:21:38,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:39,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 10:21:39,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. 
Duration: 0.81 2023-10-02 10:21:40,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 10:21:40,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:42,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:21:43,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:49,428 INFO [train.py:1046] (1/4) Epoch 24, batch 4650, loss[loss=0.1609, simple_loss=0.2501, pruned_loss=0.03586, over 24616.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2452, pruned_loss=0.04568, over 4740508.10 frames. ], batch size: 68, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:21:52,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:21:54,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=845520.0, ans=0.125 2023-10-02 10:21:55,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:21:55,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:55,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:21:56,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:56,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:21:58,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:22:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 10:22:06,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:22:06,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=845586.6666666666, ans=0.125 2023-10-02 10:22:08,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 10:22:09,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:22:10,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 10:22:10,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:22:11,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 10:22:11,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. 
Duration: 0.98 2023-10-02 10:22:11,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:11,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:22:15,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:22:15,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=845586.6666666666, ans=0.125 2023-10-02 10:22:17,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:17,845 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 10:22:21,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:21,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 10:22:26,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:22:26,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 10:22:27,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:22:30,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:22:31,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:22:38,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:41,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:41,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:42,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:22:44,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 10:22:45,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 10:22:45,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 10:22:45,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 10:22:48,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:22:55,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 10:22:55,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:22:55,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 10:22:55,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:22:56,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:22:56,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:22:59,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:23:01,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:23:01,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:23:02,339 INFO [train.py:1046] (1/4) Epoch 24, batch 4700, loss[loss=0.1739, simple_loss=0.2438, pruned_loss=0.05202, over 23789.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.246, pruned_loss=0.04624, over 4733959.22 frames. ], batch size: 179, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:23:02,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:23:07,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=845853.3333333334, ans=0.1 2023-10-02 10:23:09,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:23:09,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:23:09,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:23:10,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 10:23:10,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:23:12,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 10:23:20,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:22,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:23:22,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:23:22,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=845920.0, ans=0.125 2023-10-02 10:23:23,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:23:25,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:23:28,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=845920.0, ans=0.0 2023-10-02 10:23:29,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 10:23:29,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 10:23:30,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:32,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:23:33,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:23:36,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:23:36,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=845986.6666666666, ans=0.125 2023-10-02 10:23:42,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:23:42,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:23:44,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:23:50,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 10:23:51,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:23:53,103 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.791e+02 2.008e+02 2.250e+02 3.556e+02, threshold=4.016e+02, percent-clipped=0.0 2023-10-02 10:23:53,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:23:57,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 10:23:58,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:02,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:24:02,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. 
Duration: 0.76 2023-10-02 10:24:04,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:04,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:07,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:24:08,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:24:08,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 10:24:09,010 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 10:24:12,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:13,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:13,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:13,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 10:24:16,538 INFO [train.py:1046] (1/4) Epoch 24, batch 4750, loss[loss=0.1716, simple_loss=0.2446, pruned_loss=0.04925, over 23307.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2467, pruned_loss=0.04631, over 4734509.26 frames. 
], batch size: 119, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:24:16,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:20,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 10:24:22,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:24:24,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:25,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=846186.6666666666, ans=0.0 2023-10-02 10:24:28,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:29,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:24:29,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 10:24:30,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:24:31,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=846253.3333333334, ans=0.125 2023-10-02 10:24:35,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. 
Duration: 0.81 2023-10-02 10:24:36,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:24:36,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:36,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:24:39,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.99 vs. limit=22.5 2023-10-02 10:24:42,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 10:24:47,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:24:50,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 10:24:50,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:24:52,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:52,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:52,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:53,809 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-10-02 10:24:53,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 10:24:59,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 10:25:00,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:03,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:03,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:25:03,636 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 10:25:04,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:08,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:25:09,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:25:10,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 10:25:12,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 10:25:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:25:14,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:25:14,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:14,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:25:15,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 10:25:17,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 10:25:20,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:25:22,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:25:22,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 10:25:22,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=846453.3333333334, ans=0.125 2023-10-02 10:25:23,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:25:25,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:25:25,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=846453.3333333334, ans=0.0 2023-10-02 10:25:25,875 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.71 vs. limit=15.0 2023-10-02 10:25:26,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:25:28,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:28,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:25:30,623 INFO [train.py:1046] (1/4) Epoch 24, batch 4800, loss[loss=0.1513, simple_loss=0.2269, pruned_loss=0.03784, over 24247.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.248, pruned_loss=0.04676, over 4724781.53 frames. ], batch size: 56, lr: 4.24e-03, grad_scale: 16.0 2023-10-02 10:25:32,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:25:32,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 10:25:32,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 10:25:34,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 10:25:36,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 10:25:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:25:37,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 10:25:42,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:42,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:25:48,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:25:48,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=846586.6666666666, ans=0.1 2023-10-02 10:25:50,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:50,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 10:25:50,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=846586.6666666666, ans=0.025 2023-10-02 10:25:52,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:25:52,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:25:53,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:25:55,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=846586.6666666666, ans=0.125 2023-10-02 10:25:57,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:25:58,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:59,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:26:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:00,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 10:26:00,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:01,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:26:06,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:07,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:26:10,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:26:10,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:12,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 10:26:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 10:26:14,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:14,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:26:14,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:26:14,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:26:14,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 10:26:17,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:26:17,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:26:21,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:26:23,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.885e+02 2.115e+02 2.514e+02 3.907e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-02 10:26:24,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:26,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=846720.0, ans=0.1 2023-10-02 10:26:27,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:26:30,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 10:26:30,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:31,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:31,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 10:26:33,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:35,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:26:37,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:26:37,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:39,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:26:39,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:26:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:26:44,817 INFO [train.py:1046] (1/4) Epoch 24, batch 4850, loss[loss=0.1557, simple_loss=0.2379, pruned_loss=0.03682, over 24604.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2485, pruned_loss=0.04696, over 4724210.29 frames. ], batch size: 60, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:26:44,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:26:44,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:44,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 10:26:45,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=846853.3333333334, ans=0.2 2023-10-02 10:26:45,518 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.73 vs. limit=6.0 2023-10-02 10:26:46,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 10:26:46,882 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.61 vs. limit=22.5 2023-10-02 10:26:48,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 10:26:48,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:48,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:50,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:26:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:53,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:58,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. 
Duration: 0.51 2023-10-02 10:26:59,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:27:03,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.11 vs. limit=22.5 2023-10-02 10:27:04,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:27:04,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:27:05,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:27:08,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:27:08,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=846920.0, ans=0.125 2023-10-02 10:27:09,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:27:12,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:27:12,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 10:27:13,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:27:17,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:27:17,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:27:17,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:27:17,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 10:27:20,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:27:20,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:25,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:25,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 10:27:25,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 10:27:28,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:27:36,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:27:36,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-10-02 10:27:36,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:27:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:27:37,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:27:40,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 10:27:40,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:40,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 10:27:40,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:27:43,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:27:44,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 10:27:51,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:56,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:27:56,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:27:58,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=847186.6666666666, ans=0.125 2023-10-02 10:27:59,048 INFO [train.py:1046] (1/4) Epoch 24, batch 4900, loss[loss=0.1972, simple_loss=0.2643, pruned_loss=0.0651, over 23873.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2479, pruned_loss=0.04675, over 4719215.15 frames. ], batch size: 195, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:28:00,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 10:28:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:28:03,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=847186.6666666666, ans=0.125 2023-10-02 10:28:04,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:06,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:28:06,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:28:07,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=847186.6666666666, ans=0.125 2023-10-02 10:28:08,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 10:28:13,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-02 10:28:17,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 10:28:18,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 10:28:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:28:18,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:28:18,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:28:18,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:28:18,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:28:20,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 10:28:24,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 10:28:24,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:28:25,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:28:27,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 10:28:29,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:28:30,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:31,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:28:31,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 10:28:33,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:28:33,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:28:33,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 10:28:34,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.85 vs. limit=12.0 2023-10-02 10:28:35,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 10:28:36,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 10:28:36,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=847320.0, ans=0.125 2023-10-02 10:28:39,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 10:28:40,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:28:40,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:28:42,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:42,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:28:42,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:28:44,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 10:28:47,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:28:48,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:28:51,078 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.854e+02 2.024e+02 2.236e+02 3.450e+02, threshold=4.049e+02, percent-clipped=0.0 2023-10-02 10:28:51,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:28:55,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 10:28:56,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 10:28:56,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 10:28:57,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 10:29:05,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:29:05,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:29:07,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 10:29:07,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:29:07,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:29:08,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:11,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:29:11,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:29:12,842 INFO [train.py:1046] (1/4) Epoch 24, batch 4950, loss[loss=0.163, simple_loss=0.228, pruned_loss=0.04896, over 22670.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2466, pruned_loss=0.0465, over 4714359.07 frames. 
], batch size: 322, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:29:12,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:29:12,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 10:29:13,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:29:16,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:29:16,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:29:19,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 10:29:20,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 10:29:20,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:29:22,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 10:29:22,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:23,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:29:23,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 10:29:23,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:23,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=847520.0, ans=0.125 2023-10-02 10:29:26,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:26,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:29:27,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:29:27,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:29:29,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=847586.6666666666, ans=0.125 2023-10-02 10:29:31,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:31,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:29:33,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:29:38,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 10:29:39,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=847586.6666666666, ans=0.125 2023-10-02 10:29:41,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=847653.3333333334, ans=0.0 2023-10-02 10:29:42,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:42,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:45,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:29:47,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 10:29:47,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 10:29:50,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:53,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:29:53,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:29:53,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:29:53,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:29:54,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:29:57,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:59,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:30:02,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:30:04,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:04,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:06,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 10:30:06,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:30:07,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:30:11,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:30:11,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:30:12,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 10:30:12,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:30:13,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:30:16,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:30:16,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:30:17,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:30:18,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 10:30:24,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:26,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=847853.3333333334, ans=0.125 2023-10-02 10:30:27,214 INFO [train.py:1046] (1/4) Epoch 24, batch 5000, loss[loss=0.1746, simple_loss=0.2606, pruned_loss=0.04431, over 24637.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2462, pruned_loss=0.04649, over 4704559.66 frames. ], batch size: 73, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:30:27,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. 
Duration: 0.78 2023-10-02 10:30:27,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:30:28,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=847853.3333333334, ans=0.125 2023-10-02 10:30:31,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=847853.3333333334, ans=0.125 2023-10-02 10:30:35,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:35,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:30:35,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 10:30:37,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 10:30:40,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:30:40,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 10:30:40,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:30:40,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:30:42,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. 
Duration: 0.94 2023-10-02 10:30:42,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-10-02 10:30:43,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:43,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:30:44,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 10:30:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:44,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:30:46,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 10:30:46,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 10:30:47,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-10-02 10:30:48,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:30:48,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 10:30:48,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 10:30:48,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:49,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:30:49,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 10:30:49,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 10:30:51,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 10:30:51,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:51,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:53,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 10:30:53,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:30:55,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:55,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:57,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 10:31:00,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 10:31:02,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:31:03,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:31:06,220 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 10:31:09,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:31:10,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:31:10,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:13,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 10:31:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:31:14,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:31:14,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:31:16,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. 
Duration: 0.336375 2023-10-02 10:31:18,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:31:19,763 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.837e+02 2.022e+02 2.310e+02 4.101e+02, threshold=4.045e+02, percent-clipped=1.0 2023-10-02 10:31:19,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:31:19,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:31:22,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=848053.3333333334, ans=10.0 2023-10-02 10:31:24,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 10:31:28,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:37,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:31:40,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:40,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:31:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:31:41,903 INFO [train.py:1046] (1/4) Epoch 24, batch 5050, loss[loss=0.1831, simple_loss=0.2503, pruned_loss=0.05792, over 23760.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2461, pruned_loss=0.04636, over 4710632.11 frames. ], batch size: 164, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:31:41,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:31:41,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:31:42,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:46,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:46,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 10:31:46,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=848186.6666666666, ans=0.125 2023-10-02 10:31:48,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:31:49,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:31:50,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:31:51,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. 
Duration: 0.94 2023-10-02 10:31:53,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:31:53,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:31:55,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:31:56,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:31:58,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:32:07,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 10:32:07,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:32:08,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:32:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 10:32:08,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:32:10,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:10,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:32:11,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:32:11,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 10:32:13,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 10:32:14,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:17,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:32:17,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=848320.0, ans=0.015 2023-10-02 10:32:20,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:20,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 10:32:21,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:32:23,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 10:32:24,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:32:26,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 10:32:26,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:32:28,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:32:29,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:32:31,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:32:32,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:32,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:32:32,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:32:32,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 10:32:34,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:32:35,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:32:38,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:32:38,175 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. 
Duration: 0.896 2023-10-02 10:32:38,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:32:38,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.46 vs. limit=10.0 2023-10-02 10:32:39,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:32:39,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:41,449 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 10:32:45,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:32:45,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 10:32:45,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:49,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:32:49,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:49,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 10:32:51,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. 
Duration: 0.54 2023-10-02 10:32:53,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:32:53,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:32:54,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:32:54,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=848520.0, ans=0.0 2023-10-02 10:32:56,189 INFO [train.py:1046] (1/4) Epoch 24, batch 5100, loss[loss=0.1725, simple_loss=0.2554, pruned_loss=0.04477, over 23958.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2459, pruned_loss=0.0459, over 4721150.75 frames. ], batch size: 80, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:32:56,330 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 10:32:58,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:33:03,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 10:33:03,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 10:33:04,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:33:06,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 10:33:08,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:33:08,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 10:33:09,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 10:33:09,542 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=6.0 2023-10-02 10:33:10,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=848586.6666666666, ans=0.2 2023-10-02 10:33:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:33:15,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:33:20,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:33:22,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 10:33:22,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:33:25,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:33:25,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. 
Duration: 0.588875 2023-10-02 10:33:28,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:28,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 10:33:31,548 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 10:33:32,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:32,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 10:33:32,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 10:33:33,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=848653.3333333334, ans=0.125 2023-10-02 10:33:35,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=848653.3333333334, ans=0.1 2023-10-02 10:33:36,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:33:41,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=848720.0, ans=0.0 2023-10-02 10:33:45,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:33:46,938 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.871e+02 2.071e+02 2.316e+02 3.219e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-02 10:33:48,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 10:33:49,021 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 10:33:49,029 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 10:33:49,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.80 vs. limit=15.0 2023-10-02 10:33:51,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 10:33:51,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:54,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 10:33:58,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 10:33:59,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:34:00,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:34:03,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. 
Duration: 0.87 2023-10-02 10:34:05,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:34:05,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 10:34:06,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=848786.6666666666, ans=0.2 2023-10-02 10:34:09,211 INFO [train.py:1046] (1/4) Epoch 24, batch 5150, loss[loss=0.1722, simple_loss=0.2614, pruned_loss=0.04151, over 24335.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2468, pruned_loss=0.04685, over 4714738.00 frames. ], batch size: 74, lr: 4.23e-03, grad_scale: 4.0 2023-10-02 10:34:11,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:34:11,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:34:11,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:34:11,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:34:12,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:34:12,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:34:14,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. 
Duration: 0.89 2023-10-02 10:34:14,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 10:34:14,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 10:34:15,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:34:15,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 10:34:18,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:18,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 10:34:20,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:34:21,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:34:25,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:34:25,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 10:34:27,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:27,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 10:34:29,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:34:29,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:34:29,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:34:30,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:34:30,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:34:30,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 10:34:33,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:34:34,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:34:36,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:34:38,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 10:34:40,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:34:44,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 10:34:47,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 10:34:47,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=848986.6666666666, ans=0.125 2023-10-02 10:34:51,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:34:57,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:34:58,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:58,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=849053.3333333334, ans=0.125 2023-10-02 10:34:58,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=849053.3333333334, ans=0.0 2023-10-02 10:35:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:02,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:35:04,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 10:35:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:35:07,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 10:35:07,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:35:07,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=849120.0, ans=0.125 2023-10-02 10:35:08,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=849120.0, ans=0.125 2023-10-02 10:35:10,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:11,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:35:12,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 10:35:17,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:19,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:35:20,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:35:20,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:35:22,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:35:22,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 10:35:23,938 INFO [train.py:1046] (1/4) Epoch 24, batch 5200, loss[loss=0.1369, simple_loss=0.2183, pruned_loss=0.02778, over 24315.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2472, pruned_loss=0.04735, over 4712561.24 frames. ], batch size: 56, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:35:23,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:35:24,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:35:28,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:35:29,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:35:30,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:35:34,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 10:35:34,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:35:35,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=849186.6666666666, ans=0.0 2023-10-02 10:35:37,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:35:37,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=849253.3333333334, ans=0.0 2023-10-02 10:35:38,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:35:39,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:35:39,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:42,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 10:35:45,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:35:45,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:47,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 10:35:50,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:35:50,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:35:51,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 10:35:53,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. 
Duration: 0.91 2023-10-02 10:35:56,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 10:35:56,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=849320.0, ans=0.2 2023-10-02 10:35:58,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:58,087 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 10:35:58,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:59,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:59,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:35:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 10:35:59,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:36:02,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:36:03,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=849320.0, ans=0.1 2023-10-02 10:36:05,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. 
Duration: 0.8 2023-10-02 10:36:05,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 10:36:05,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 10:36:09,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 10:36:09,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=849386.6666666666, ans=0.0 2023-10-02 10:36:10,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:36:16,807 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.871e+02 2.065e+02 2.333e+02 3.353e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 10:36:16,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:36:16,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:17,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=849386.6666666666, ans=0.125 2023-10-02 10:36:18,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 10:36:19,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:36:19,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 10:36:19,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:19,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:36:20,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=849386.6666666666, ans=0.0 2023-10-02 10:36:24,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:36:24,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:36:27,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:36:27,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:36:27,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:32,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=849453.3333333334, ans=0.125 2023-10-02 10:36:33,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:35,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-10-02 10:36:35,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:36:36,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:36:36,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:37,887 INFO [train.py:1046] (1/4) Epoch 24, batch 5250, loss[loss=0.1555, simple_loss=0.2407, pruned_loss=0.03516, over 24486.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.247, pruned_loss=0.04692, over 4705644.18 frames. ], batch size: 63, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:36:37,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:36:39,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:36:40,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:36:43,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:36:44,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:36:46,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:36:53,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:36:54,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:36:55,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:36:57,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:36:59,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 10:37:00,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:37:02,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:37:10,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=849653.3333333334, ans=0.125 2023-10-02 10:37:18,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=849653.3333333334, ans=0.1 2023-10-02 10:37:23,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=849720.0, ans=0.0 2023-10-02 10:37:28,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.93 vs. 
limit=15.0 2023-10-02 10:37:32,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=849786.6666666666, ans=0.0 2023-10-02 10:37:39,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.32 vs. limit=15.0 2023-10-02 10:37:46,416 INFO [train.py:1046] (1/4) Epoch 24, batch 5300, loss[loss=0.1702, simple_loss=0.2559, pruned_loss=0.04223, over 24647.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.245, pruned_loss=0.0468, over 4698235.74 frames. ], batch size: 68, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:38:01,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:38:01,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 10:38:01,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 10:38:01,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:01,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:01,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:01,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:01,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 10:38:01,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:01,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:01,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:38:02,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:38:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 10:38:02,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 10:38:02,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 10:38:02,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 10:38:02,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 10:38:02,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 10:38:02,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:03,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:38:03,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:38:03,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:38:03,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:38:03,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:38:03,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:03,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:38:03,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:38:03,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:38:04,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-10-02 10:38:04,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:38:04,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:04,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 10:38:04,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 10:38:05,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:38:05,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:05,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 10:38:05,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 10:38:05,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:38:06,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:38:06,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:38:06,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-10-02 10:38:06,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 10:38:06,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:38:06,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:06,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 10:38:06,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 10:38:06,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 10:38:06,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:38:13,259 INFO [train.py:1046] (1/4) Epoch 25, batch 0, loss[loss=0.172, simple_loss=0.2462, pruned_loss=0.04889, over 23271.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2462, pruned_loss=0.04889, over 23271.00 frames. ], batch size: 105, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:38:13,259 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 10:38:25,664 INFO [train.py:1078] (1/4) Epoch 25, validation: loss=0.3293, simple_loss=0.2723, pruned_loss=0.1931, over 1125622.00 frames. 2023-10-02 10:38:25,664 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 10:38:29,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-10-02 10:38:29,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:38:30,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:38:36,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:36,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:38:36,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:37,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 10:38:38,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 10:38:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:41,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:43,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.40 vs. limit=15.0 2023-10-02 10:38:44,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:45,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:38:45,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:38:47,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:38:48,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 10:38:49,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:38:59,699 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.885e+02 2.204e+02 2.603e+02 4.904e+02, threshold=4.408e+02, percent-clipped=3.0 2023-10-02 10:38:59,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:38:59,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:39:01,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 10:39:05,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:39:05,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:39:08,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:39:10,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=850140.0, ans=0.1 2023-10-02 10:39:11,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:39:15,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:39:21,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 10:39:21,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=850140.0, ans=0.125 2023-10-02 10:39:22,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 10:39:22,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:39:22,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:23,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:39:23,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:39:25,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 10:39:28,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:39:28,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=850206.6666666666, ans=0.2 2023-10-02 10:39:29,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:34,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:39:37,608 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 10:39:38,889 INFO [train.py:1046] (1/4) Epoch 25, batch 50, loss[loss=0.1586, simple_loss=0.2347, pruned_loss=0.04127, over 24577.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2465, pruned_loss=0.04487, over 1076475.51 frames. ], batch size: 60, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:39:39,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:39:43,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:39:44,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:39:44,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 10:39:45,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:39:47,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:39:48,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:39:49,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:39:51,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:39:54,788 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.30 vs. limit=6.0 2023-10-02 10:39:55,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 10:39:55,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:00,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=850340.0, ans=10.0 2023-10-02 10:40:03,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:40:05,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 10:40:07,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 10:40:08,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 10:40:10,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:40:10,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:11,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:40:12,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:40:14,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:40:14,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:14,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.47 vs. limit=15.0 2023-10-02 10:40:19,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:40:21,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:40:21,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:40:22,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. 
Duration: 0.96 2023-10-02 10:40:25,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:40:26,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:40:26,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 10:40:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:40:29,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 10:40:37,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:40:37,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:40:39,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:40:40,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:40:40,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:40:42,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 10:40:43,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. 
Duration: 0.56 2023-10-02 10:40:44,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:40:45,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:40:46,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:40:47,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:40:47,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 10:40:49,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 10:40:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 10:40:50,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:40:50,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:40:51,998 INFO [train.py:1046] (1/4) Epoch 25, batch 100, loss[loss=0.1605, simple_loss=0.2423, pruned_loss=0.03938, over 24487.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2491, pruned_loss=0.04623, over 1893427.16 frames. ], batch size: 63, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:40:52,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. 
Duration: 0.59 2023-10-02 10:40:52,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 10:40:53,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:40:53,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:40:54,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:40:54,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:40:57,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:41:00,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:41:04,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:41:05,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 10:41:05,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:41:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:41:10,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 10:41:11,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:41:11,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:41:11,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:41:11,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 10:41:14,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:41:14,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:14,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:41:14,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:41:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 10:41:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:21,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:41:23,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 10:41:24,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:41:25,649 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.842e+02 2.089e+02 2.326e+02 3.490e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 10:41:28,586 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 10:41:28,603 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 10:41:29,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:41:29,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:41:35,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:41:36,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:37,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:43,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:43,870 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 10:41:47,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 10:41:50,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:41:51,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:41:54,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:41:58,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:41:59,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:42:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:02,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:02,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:02,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:42:02,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:05,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. 
Duration: 0.91 2023-10-02 10:42:05,290 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 10:42:05,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:06,361 INFO [train.py:1046] (1/4) Epoch 25, batch 150, loss[loss=0.2278, simple_loss=0.2867, pruned_loss=0.08443, over 19415.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2501, pruned_loss=0.04686, over 2524620.65 frames. ], batch size: 388, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:42:07,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:42:07,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:07,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:07,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 10:42:08,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:42:09,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:42:09,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:09,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:42:10,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:11,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:42:11,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:42:15,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:18,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:42:18,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:18,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:21,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:21,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:23,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:42:23,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:26,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. 
Duration: 0.65 2023-10-02 10:42:26,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 10:42:26,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 10:42:28,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:42:28,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:42:29,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:42:31,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:31,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:31,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:33,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:33,784 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 10:42:34,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:39,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.37 vs. 
limit=15.0 2023-10-02 10:42:40,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:40,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=851073.3333333334, ans=0.1 2023-10-02 10:42:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:42:47,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 10:42:50,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:42:50,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:52,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:42:53,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:42:54,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:56,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:42:57,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:58,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-10-02 10:43:02,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:03,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:03,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:43:03,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:43:05,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:08,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 10:43:09,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:43:12,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:43:13,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:43:15,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:43:16,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 10:43:16,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 10:43:16,889 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 10:43:20,114 INFO [train.py:1046] (1/4) Epoch 25, batch 200, loss[loss=0.1724, simple_loss=0.256, pruned_loss=0.04438, over 24458.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2505, pruned_loss=0.04746, over 3024043.94 frames. ], batch size: 63, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:43:20,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:43:22,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:43:22,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:43:25,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 10:43:25,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:43:27,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:29,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 10:43:31,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:43:33,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:43:34,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:39,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:43:39,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:43:39,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:50,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=851406.6666666666, ans=0.1 2023-10-02 10:43:53,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=851406.6666666666, ans=0.2 2023-10-02 10:43:54,484 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.978e+02 2.260e+02 2.565e+02 3.626e+02, threshold=4.520e+02, percent-clipped=0.0 2023-10-02 10:43:56,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:43:57,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:43:57,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:43:58,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:43:58,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 10:43:58,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:44:02,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:02,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:44:02,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=851406.6666666666, ans=0.05 2023-10-02 10:44:03,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:44:04,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:44:04,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 10:44:04,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:44:04,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:44:06,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=851473.3333333334, ans=0.02 2023-10-02 10:44:06,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=851473.3333333334, ans=0.125 2023-10-02 10:44:09,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=851473.3333333334, ans=0.125 2023-10-02 10:44:09,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=851473.3333333334, ans=0.2 2023-10-02 10:44:10,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:44:11,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=851473.3333333334, ans=0.125 2023-10-02 10:44:13,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=851473.3333333334, ans=0.125 2023-10-02 10:44:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:44:19,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=851540.0, ans=0.125 2023-10-02 10:44:24,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:25,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 10:44:31,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:31,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=851540.0, ans=0.125 2023-10-02 10:44:33,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 10:44:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:33,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:44:33,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:44:34,608 INFO [train.py:1046] (1/4) Epoch 25, batch 250, loss[loss=0.169, simple_loss=0.256, pruned_loss=0.041, over 24441.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.25, pruned_loss=0.04728, over 3392966.62 frames. ], batch size: 69, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:44:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:44:36,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 10:44:37,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:44:37,513 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-02 10:44:39,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:40,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:44:40,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:42,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:43,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:44:44,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:46,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:44:46,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=851606.6666666666, ans=0.0 2023-10-02 10:44:47,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=851673.3333333334, ans=0.0 2023-10-02 10:44:48,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 10:44:59,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=851673.3333333334, ans=0.1 2023-10-02 10:45:01,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:45:04,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:45:04,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:45:09,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:45:09,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:45:11,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:45:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:45:13,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:45:14,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:45:15,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:45:17,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 10:45:19,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 10:45:19,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:45:21,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:45:21,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:45:21,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:45:22,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:45:22,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:45:22,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:45:26,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:27,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:45:29,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:45:32,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 10:45:35,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=851873.3333333334, ans=0.125 2023-10-02 10:45:35,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=851873.3333333334, ans=0.125 2023-10-02 10:45:36,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:38,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:45:41,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:45:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:45:45,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 10:45:47,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:45:47,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:45:48,262 INFO [train.py:1046] (1/4) Epoch 25, batch 300, loss[loss=0.1556, simple_loss=0.2092, pruned_loss=0.05102, over 19486.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2478, pruned_loss=0.04716, over 3686827.51 frames. ], batch size: 388, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:45:49,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. 
Duration: 0.99 2023-10-02 10:45:49,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:45:51,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:45:51,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 10:45:56,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:56,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:00,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:46:02,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 10:46:03,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:46:04,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:46:04,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 10:46:04,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:46:06,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=852006.6666666666, ans=0.0 2023-10-02 10:46:08,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:46:12,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:46:13,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 10:46:15,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 10:46:17,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:19,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:22,344 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.821e+02 1.986e+02 2.165e+02 3.006e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 10:46:22,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:22,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 10:46:22,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 10:46:22,709 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:46:25,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:46:27,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:46:29,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:46:32,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:46:32,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 10:46:33,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:46:34,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:36,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 10:46:36,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=852140.0, ans=0.125 2023-10-02 10:46:37,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:41,984 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.89 vs. 
limit=15.0 2023-10-02 10:46:42,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:46:42,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=852140.0, ans=0.125 2023-10-02 10:46:45,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:46:45,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 10:46:49,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:49,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:46:50,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:52,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:46:52,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=852206.6666666666, ans=0.2 2023-10-02 10:46:54,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 10:46:54,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:46:54,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:46:56,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 10:46:57,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:58,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:46:59,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:59,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:59,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:00,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=852206.6666666666, ans=0.1 2023-10-02 10:47:03,001 INFO [train.py:1046] (1/4) Epoch 25, batch 350, loss[loss=0.1677, simple_loss=0.2492, pruned_loss=0.04314, over 24665.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.246, pruned_loss=0.04636, over 3904343.04 frames. ], batch size: 65, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:47:05,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:05,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 10:47:07,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:47:13,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:47:15,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:16,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:19,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 10:47:20,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:20,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 10:47:23,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:25,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 10:47:26,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:47:28,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 10:47:29,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 10:47:29,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=852340.0, ans=0.0 2023-10-02 10:47:32,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:47:32,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:47:34,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:47:34,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:47:34,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:47:34,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:35,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:47:36,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:47:36,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:44,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:47:44,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 10:47:46,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:47:46,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:46,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.66 vs. limit=22.5 2023-10-02 10:47:50,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 10:47:50,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:53,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:53,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:47:54,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:56,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 10:47:57,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:47:58,840 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 10:48:00,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. 
Duration: 0.81 2023-10-02 10:48:00,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:00,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=852540.0, ans=0.125 2023-10-02 10:48:03,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.79 vs. limit=10.0 2023-10-02 10:48:03,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:48:03,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 10:48:06,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:09,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:48:10,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:12,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:12,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:48:12,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=852540.0, ans=0.125 2023-10-02 10:48:13,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 10:48:16,573 INFO [train.py:1046] (1/4) Epoch 25, batch 400, loss[loss=0.1365, simple_loss=0.2124, pruned_loss=0.03027, over 24459.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2448, pruned_loss=0.04598, over 4081446.84 frames. ], batch size: 58, lr: 4.14e-03, grad_scale: 32.0 2023-10-02 10:48:16,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:48:18,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:48:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 10:48:19,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:22,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:48:22,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:25,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:25,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=852606.6666666666, ans=0.1 2023-10-02 10:48:28,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:48:28,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 10:48:31,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 10:48:31,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:32,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 10:48:33,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:36,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:48:36,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:48:36,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 10:48:36,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:48:36,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:48:38,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 10:48:41,995 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 10:48:42,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 10:48:42,529 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.96 vs. limit=12.0 2023-10-02 10:48:47,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:47,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:48,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 10:48:49,992 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.835e+02 2.039e+02 2.384e+02 3.954e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-02 10:48:50,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 10:48:52,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=852740.0, ans=15.0 2023-10-02 10:48:53,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:48:56,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 10:48:57,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=852740.0, ans=0.2 2023-10-02 10:49:01,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 10:49:04,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:49:04,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 10:49:08,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:49:10,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:49:10,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 10:49:12,978 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-10-02 10:49:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:49:14,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.98 vs. limit=15.0 2023-10-02 10:49:16,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:49:17,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:49:18,062 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:49:19,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:19,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 10:49:21,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:49:22,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 10:49:26,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:49:26,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:49:28,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 10:49:29,381 INFO [train.py:1046] (1/4) Epoch 25, batch 450, loss[loss=0.1578, simple_loss=0.2488, pruned_loss=0.03342, over 24644.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2462, pruned_loss=0.04605, over 4225805.88 frames. ], batch size: 68, lr: 4.14e-03, grad_scale: 32.0 2023-10-02 10:49:30,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:49:30,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:49:30,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:49:32,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 10:49:33,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:49:35,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:49:35,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=852940.0, ans=0.2 2023-10-02 10:49:36,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:49:36,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 10:49:36,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:49:38,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:49:40,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:49:47,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:48,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 10:49:50,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 10:49:50,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 10:49:54,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:49:56,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.46 vs. limit=22.5 2023-10-02 10:49:57,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:59,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:02,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:50:04,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:50:06,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 10:50:07,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.02 vs. limit=15.0 2023-10-02 10:50:08,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 10:50:10,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. 
Duration: 0.96 2023-10-02 10:50:10,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:11,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:50:12,937 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 10:50:14,615 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 10:50:14,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:50:15,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:50:16,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:50:20,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:50:21,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:50:22,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 10:50:23,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. 
Duration: 0.96 2023-10-02 10:50:23,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:50:24,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=853140.0, ans=0.0 2023-10-02 10:50:25,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:50:26,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:50:28,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 10:50:31,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:50:31,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 10:50:33,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 10:50:33,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:50:37,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:50:39,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:50:40,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 10:50:40,693 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 10:50:40,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=853273.3333333334, ans=0.0 2023-10-02 10:50:41,957 INFO [train.py:1046] (1/4) Epoch 25, batch 500, loss[loss=0.1628, simple_loss=0.2485, pruned_loss=0.03857, over 24291.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2468, pruned_loss=0.04656, over 4333885.27 frames. ], batch size: 74, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:50:45,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:47,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:50:47,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:47,135 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 10:50:49,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 10:50:49,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:51,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:50:58,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 10:51:00,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:51:01,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:51:01,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:51:03,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:13,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:13,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 10:51:14,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:51:14,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:14,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 10:51:14,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 10:51:16,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=853406.6666666666, ans=0.09899494936611666 2023-10-02 10:51:19,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:51:19,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:51:19,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:51:20,408 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.879e+02 2.028e+02 2.264e+02 3.460e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-02 10:51:20,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:20,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 10:51:25,065 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 10:51:25,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.36 vs. limit=10.0 2023-10-02 10:51:26,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:28,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:29,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:51:29,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:30,321 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=15.0 2023-10-02 10:51:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:51:31,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=853473.3333333334, ans=0.125 2023-10-02 10:51:32,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 10:51:35,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=853473.3333333334, ans=0.2 2023-10-02 10:51:36,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:51:36,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:40,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:51:43,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:48,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:53,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-10-02 10:51:54,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:54,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:56,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 10:51:57,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:51:57,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:58,892 INFO [train.py:1046] (1/4) Epoch 25, batch 550, loss[loss=0.1724, simple_loss=0.2458, pruned_loss=0.0495, over 23794.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2483, pruned_loss=0.04705, over 4424214.86 frames. ], batch size: 164, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:52:03,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 10:52:06,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 10:52:06,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:06,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=853606.6666666666, ans=0.04949747468305833 2023-10-02 10:52:07,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-10-02 10:52:07,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:52:07,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:08,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:09,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:09,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:52:10,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:52:11,105 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.52 vs. limit=15.0 2023-10-02 10:52:13,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:52:13,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 10:52:13,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:52:19,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:52:19,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:19,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=853673.3333333334, ans=0.125 2023-10-02 10:52:21,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:52:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:25,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 10:52:25,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 10:52:28,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:52:34,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:52:34,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:52:34,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:52:37,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:37,480 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. 
Duration: 0.958 2023-10-02 10:52:38,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:40,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 10:52:40,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=853740.0, ans=0.125 2023-10-02 10:52:42,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:52:42,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:52:42,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:52:44,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:45,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 10:52:47,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 10:52:48,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:52:48,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:52:50,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 10:52:50,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:52,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=853806.6666666666, ans=0.2 2023-10-02 10:52:53,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:52:53,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:52:57,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:52:57,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:58,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:52:59,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:53:01,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:01,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:53:03,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:04,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 10:53:04,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:53:08,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=853873.3333333334, ans=0.1 2023-10-02 10:53:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 10:53:12,555 INFO [train.py:1046] (1/4) Epoch 25, batch 600, loss[loss=0.1825, simple_loss=0.2455, pruned_loss=0.05974, over 23826.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2482, pruned_loss=0.04685, over 4496696.49 frames. ], batch size: 195, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:53:13,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 10:53:15,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:53:15,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:53:15,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:21,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:53:23,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:53:25,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-10-02 10:53:28,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:53:29,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:53:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:33,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 10:53:33,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:53:38,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=854006.6666666666, ans=0.1 2023-10-02 10:53:39,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 10:53:42,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:53:42,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:42,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 10:53:48,071 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.854e+02 2.038e+02 2.245e+02 2.978e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-02 10:53:48,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:53:48,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:53:49,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:53,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=854073.3333333334, ans=0.125 2023-10-02 10:53:56,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:53:57,048 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=6.0 2023-10-02 10:53:59,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:59,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:53:59,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:54:02,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=854140.0, ans=0.1 2023-10-02 10:54:06,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 10:54:12,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:54:12,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:54:15,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 10:54:15,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:54:17,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 10:54:19,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:54:19,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:54:25,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:54:27,167 INFO [train.py:1046] (1/4) Epoch 25, batch 650, loss[loss=0.1708, simple_loss=0.2467, pruned_loss=0.04749, over 24447.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.247, pruned_loss=0.04699, over 4531064.57 frames. 
], batch size: 63, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:54:27,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:54:29,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:54:31,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:54:32,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:54:35,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 10:54:36,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:54:42,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:54:42,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:54:42,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=854340.0, ans=0.125 2023-10-02 10:54:45,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:54:47,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-10-02 10:54:49,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=854340.0, ans=0.0 2023-10-02 10:54:51,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:54:51,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:54:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:54:56,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 10:54:59,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:54:59,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:00,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:55:00,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=854406.6666666666, ans=0.125 2023-10-02 10:55:01,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:01,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 10:55:04,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:55:04,680 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 10:55:05,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:55:05,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:55:07,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:07,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=854406.6666666666, ans=0.125 2023-10-02 10:55:08,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.17 vs. limit=22.5 2023-10-02 10:55:08,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:55:10,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:10,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:55:11,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 10:55:11,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 10:55:11,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:55:12,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=854473.3333333334, ans=0.125 2023-10-02 10:55:14,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 10:55:14,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:55:15,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:55:15,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 10:55:16,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=854473.3333333334, ans=0.04949747468305833 2023-10-02 10:55:19,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 10:55:19,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:19,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:55:19,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 10:55:20,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:55:20,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:55:27,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:27,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:55:30,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:55:32,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:32,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 10:55:34,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:38,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:55:38,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:55:39,747 INFO [train.py:1046] (1/4) Epoch 25, batch 700, loss[loss=0.1681, simple_loss=0.2087, pruned_loss=0.06378, over 19050.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2455, pruned_loss=0.04644, over 4571062.08 frames. 
], batch size: 388, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:55:39,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:55:39,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:55:45,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 10:55:46,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 10:55:47,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 10:55:47,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:49,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:55:50,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=854606.6666666666, ans=0.125 2023-10-02 10:55:51,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=854606.6666666666, ans=0.125 2023-10-02 10:55:52,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 10:55:56,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 10:55:58,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=854673.3333333334, ans=0.125 2023-10-02 10:56:01,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:56:01,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:56:02,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:56:03,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:56:05,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:56:07,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 10:56:07,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:56:09,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 10:56:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-02 10:56:14,702 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.821e+02 2.032e+02 2.235e+02 2.949e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 10:56:16,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:56:16,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:56:18,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.19 vs. limit=15.0 2023-10-02 10:56:19,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:56:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:56:23,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 10:56:27,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:56:29,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:56:29,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 10:56:32,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 10:56:33,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:56:35,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:56:37,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.29 vs. limit=12.0 2023-10-02 10:56:40,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:56:40,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 10:56:40,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=854873.3333333334, ans=0.125 2023-10-02 10:56:42,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=854873.3333333334, ans=0.0 2023-10-02 10:56:43,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 10:56:44,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 10:56:45,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.39 vs. limit=10.0 2023-10-02 10:56:46,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 10:56:48,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:56:48,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:56:50,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:56:50,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 10:56:52,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=854940.0, ans=0.125 2023-10-02 10:56:53,754 INFO [train.py:1046] (1/4) Epoch 25, batch 750, loss[loss=0.1789, simple_loss=0.2424, pruned_loss=0.05764, over 23714.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2459, pruned_loss=0.04604, over 4615148.76 frames. ], batch size: 164, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:56:55,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 10:56:55,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 10:56:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 10:56:58,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 10:56:58,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. 
Duration: 0.64 2023-10-02 10:56:59,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:56:59,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 10:57:01,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:57:01,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:57:03,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:05,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:06,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:57:06,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:57:09,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:57:09,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:57:10,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:57:13,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 10:57:13,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:13,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 10:57:14,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:57:16,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:57:17,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:57:20,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:57:22,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 10:57:22,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:57:23,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=855073.3333333334, ans=0.125 2023-10-02 10:57:25,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.82 vs. limit=22.5 2023-10-02 10:57:25,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 10:57:25,905 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. 
Duration: 0.768 2023-10-02 10:57:27,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 10:57:27,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:57:27,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:57:29,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:57:35,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:57:35,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:57:35,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:57:38,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:38,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:57:39,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 10:57:39,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 10:57:39,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=855140.0, ans=0.125 2023-10-02 10:57:42,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 10:57:43,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:57:45,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:57:45,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 10:57:46,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:57:52,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:57:53,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:57:53,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:55,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:57:55,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=855206.6666666666, ans=0.0 2023-10-02 10:58:00,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. 
Duration: 0.75 2023-10-02 10:58:00,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:58:00,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:03,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:04,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:05,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=855206.6666666666, ans=0.0 2023-10-02 10:58:05,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.39 vs. limit=15.0 2023-10-02 10:58:06,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:07,660 INFO [train.py:1046] (1/4) Epoch 25, batch 800, loss[loss=0.149, simple_loss=0.229, pruned_loss=0.03449, over 24294.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2461, pruned_loss=0.04553, over 4646286.34 frames. ], batch size: 61, lr: 4.13e-03, grad_scale: 32.0 2023-10-02 10:58:07,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:58:11,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=855273.3333333334, ans=0.125 2023-10-02 10:58:14,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 10:58:14,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:17,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:58:17,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:18,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:18,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:20,639 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.06 vs. limit=15.0 2023-10-02 10:58:23,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:25,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:58:26,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 10:58:28,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:58:29,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:29,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:58:29,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:58:29,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 10:58:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:32,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 10:58:34,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:36,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:37,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:37,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:58:40,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:41,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 10:58:43,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=855406.6666666666, ans=0.0 2023-10-02 10:58:44,118 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.817e+02 2.016e+02 2.246e+02 3.441e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 10:58:44,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:58:44,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:58:44,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 10:58:45,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=855406.6666666666, ans=0.0 2023-10-02 10:58:46,579 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.97 vs. limit=22.5 2023-10-02 10:58:47,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 10:58:47,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 10:58:47,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:58:47,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:49,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 10:58:49,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:58:53,274 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 10:58:53,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 10:58:54,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:58:56,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:58:58,792 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.97 vs. limit=6.0 2023-10-02 10:59:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:59:01,514 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:59:06,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:59:06,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 10:59:06,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 10:59:08,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=855540.0, ans=0.125 2023-10-02 10:59:08,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=855540.0, ans=0.1 2023-10-02 10:59:10,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 10:59:17,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:59:19,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=855606.6666666666, ans=0.1 2023-10-02 10:59:20,378 INFO [train.py:1046] (1/4) Epoch 25, batch 850, loss[loss=0.185, simple_loss=0.252, pruned_loss=0.059, over 23679.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2466, pruned_loss=0.04575, over 4679164.78 frames. ], batch size: 135, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:59:21,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:59:21,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 10:59:23,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:59:23,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:59:24,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. 
Duration: 0.94 2023-10-02 10:59:24,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:59:28,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:29,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:59:29,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=855606.6666666666, ans=0.125 2023-10-02 10:59:29,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=855606.6666666666, ans=0.2 2023-10-02 10:59:31,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:59:32,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 10:59:32,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 10:59:32,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 10:59:34,055 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.46 vs. limit=12.0 2023-10-02 10:59:34,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 10:59:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:59:36,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:36,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:59:36,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:59:40,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:40,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:59:41,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 10:59:44,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 10:59:45,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=855673.3333333334, ans=0.0 2023-10-02 10:59:45,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=855673.3333333334, ans=0.125 2023-10-02 10:59:48,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:48,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-02 10:59:51,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 10:59:51,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=855740.0, ans=0.125 2023-10-02 10:59:52,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 10:59:54,055 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 10:59:54,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:59:54,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:59:54,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 10:59:57,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:59,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:59,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 11:00:01,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:00:01,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:00:02,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:00:02,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:00:05,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:00:06,311 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=13.19 vs. limit=15.0 2023-10-02 11:00:06,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:00:08,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 11:00:10,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.36 vs. limit=15.0 2023-10-02 11:00:12,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:00:12,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:00:13,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:00:13,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:00:15,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:17,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:00:19,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:00:20,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:00:20,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:20,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:00:22,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=855873.3333333334, ans=0.125 2023-10-02 11:00:28,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:00:28,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:00:31,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 11:00:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:00:32,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:00:33,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 11:00:34,963 INFO [train.py:1046] (1/4) Epoch 25, batch 900, loss[loss=0.2119, simple_loss=0.2718, pruned_loss=0.07601, over 22809.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2488, pruned_loss=0.04687, over 4679012.36 frames. ], batch size: 322, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:00:39,689 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-10-02 11:00:40,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:00:41,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:43,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 11:00:44,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=855940.0, ans=0.0 2023-10-02 11:00:46,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:00:46,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 11:00:47,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:00:47,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 11:00:47,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:00:48,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:00:48,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:00:50,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=856006.6666666666, ans=0.05 2023-10-02 11:00:57,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:57,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:57,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:01:01,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:01:05,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 11:01:07,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:01:08,929 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:01:10,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 11:01:11,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.867e+02 2.045e+02 2.308e+02 5.112e+02, threshold=4.090e+02, percent-clipped=1.0 2023-10-02 11:01:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:01:11,468 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 11:01:12,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 11:01:17,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.18 vs. limit=15.0 2023-10-02 11:01:18,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:01:18,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:01:18,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:01:25,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:25,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:01:25,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=856140.0, ans=0.1 2023-10-02 11:01:27,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-10-02 11:01:27,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:01:29,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 11:01:29,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=856140.0, ans=0.0 2023-10-02 11:01:33,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:01:33,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:34,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:01:34,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:01:36,843 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.76 vs. limit=6.0 2023-10-02 11:01:39,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 11:01:39,127 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 11:01:40,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 11:01:40,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. 
Duration: 0.64 2023-10-02 11:01:42,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:44,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 11:01:47,443 INFO [train.py:1046] (1/4) Epoch 25, batch 950, loss[loss=0.1885, simple_loss=0.2548, pruned_loss=0.06113, over 23824.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2487, pruned_loss=0.04711, over 4686559.32 frames. ], batch size: 179, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:01:48,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:01:52,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:01:53,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:01:53,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:01:54,128 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.02 vs. limit=15.0 2023-10-02 11:01:55,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=856273.3333333334, ans=0.1 2023-10-02 11:01:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-10-02 11:02:00,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=856273.3333333334, ans=0.125 2023-10-02 11:02:01,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:03,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:02:03,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:02:04,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:02:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 11:02:05,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:02:07,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:08,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 11:02:08,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:02:14,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:14,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 11:02:15,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:02:16,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 11:02:18,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 11:02:18,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=856406.6666666666, ans=0.0 2023-10-02 11:02:19,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:02:21,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:02:21,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=856406.6666666666, ans=0.125 2023-10-02 11:02:26,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:02:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:02:30,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 11:02:31,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 11:02:31,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 11:02:31,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:02:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:33,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:02:33,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=856473.3333333334, ans=0.0 2023-10-02 11:02:37,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 11:02:37,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=856473.3333333334, ans=0.125 2023-10-02 11:02:38,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:02:42,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:02:42,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:42,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 11:02:44,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:44,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 11:02:44,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 11:02:47,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:02:50,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:54,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:02:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 11:02:55,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 11:03:00,018 INFO [train.py:1046] (1/4) Epoch 25, batch 1000, loss[loss=0.1681, simple_loss=0.2373, pruned_loss=0.04943, over 23793.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2474, pruned_loss=0.0469, over 4689100.89 frames. ], batch size: 179, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:03:01,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:03:06,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 11:03:06,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:03:07,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=856606.6666666666, ans=0.125 2023-10-02 11:03:10,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:03:10,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 11:03:10,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 11:03:13,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:13,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:03:16,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:17,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 11:03:19,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=856673.3333333334, ans=0.2 2023-10-02 11:03:20,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=856673.3333333334, ans=0.0 2023-10-02 11:03:22,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 11:03:23,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-10-02 11:03:23,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=856673.3333333334, ans=0.1 2023-10-02 11:03:24,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:03:24,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 11:03:28,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:03:28,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 11:03:29,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:29,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:37,777 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.816e+02 2.002e+02 2.184e+02 3.024e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 11:03:37,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:39,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:03:39,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:40,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:03:40,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 11:03:40,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:03:40,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=856740.0, ans=0.125 2023-10-02 11:03:42,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:03:42,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:43,457 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 11:03:46,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 11:03:46,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 11:03:47,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 11:03:50,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:03:54,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:55,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 11:03:56,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:57,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:04:00,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 11:04:02,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:04:02,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 11:04:03,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 11:04:05,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:04:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:04:06,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:04:06,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=856873.3333333334, ans=0.0 2023-10-02 11:04:06,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=856873.3333333334, ans=0.2 2023-10-02 11:04:10,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 11:04:11,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:04:14,016 INFO [train.py:1046] (1/4) Epoch 25, batch 1050, loss[loss=0.1739, simple_loss=0.2587, pruned_loss=0.04459, over 23680.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2455, pruned_loss=0.04678, over 4674451.75 frames. ], batch size: 85, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:04:15,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:04:17,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:04:19,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:04:19,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:04:21,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:04:23,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:04:25,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:04:27,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=857006.6666666666, ans=0.2 2023-10-02 11:04:28,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:04:28,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:04:28,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:04:29,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:04:30,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 11:04:31,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:04:33,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 11:04:35,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:04:35,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 11:04:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:04:40,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:04:42,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:04:42,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:04:45,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 11:04:45,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 11:04:46,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:04:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 11:04:48,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=857073.3333333334, ans=0.1 2023-10-02 11:04:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 11:04:51,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:04:52,747 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:04:53,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 11:04:55,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:04:55,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:04:56,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 11:04:57,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=857140.0, ans=0.0 2023-10-02 11:05:02,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:05:06,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 11:05:07,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 11:05:09,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 11:05:09,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:05:09,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:05:11,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 11:05:15,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:05:16,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:05:16,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:05:16,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 11:05:18,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:05:20,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:05:20,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 11:05:22,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:05:22,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 11:05:22,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 11:05:23,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:05:26,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:05:28,140 INFO [train.py:1046] (1/4) Epoch 25, batch 1100, loss[loss=0.178, simple_loss=0.2498, pruned_loss=0.05311, over 23842.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2453, pruned_loss=0.0465, over 4687434.63 frames. ], batch size: 212, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:05:32,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:05:34,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=857273.3333333334, ans=0.2 2023-10-02 11:05:36,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:05:38,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:05:38,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:05:39,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 11:05:41,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:05:44,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 11:05:47,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:05:50,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:05:50,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 11:05:51,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:05:53,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:05:53,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:05:55,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:05:57,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:06:01,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:06:04,995 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.795e+02 1.916e+02 2.111e+02 3.206e+02, threshold=3.831e+02, percent-clipped=0.0 2023-10-02 11:06:05,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 11:06:05,189 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 11:06:07,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:08,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:09,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:06:10,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:06:11,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. 
Duration: 0.74 2023-10-02 11:06:11,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:06:11,699 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:06:12,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:06:12,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:06:12,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:12,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 11:06:20,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:06:21,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 11:06:22,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:06:24,948 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.66 vs. limit=10.0 2023-10-02 11:06:28,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:06:31,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. 
Duration: 0.99 2023-10-02 11:06:31,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:06:33,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:35,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:06:35,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=857540.0, ans=0.125 2023-10-02 11:06:37,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:06:38,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 11:06:38,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:06:38,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:06:39,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=857540.0, ans=0.125 2023-10-02 11:06:39,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=857540.0, ans=0.125 2023-10-02 11:06:40,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 11:06:40,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 11:06:41,865 INFO [train.py:1046] (1/4) Epoch 25, batch 1150, loss[loss=0.16, simple_loss=0.2363, pruned_loss=0.04187, over 24459.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2461, pruned_loss=0.0465, over 4700114.73 frames. ], batch size: 58, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:06:41,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 11:06:42,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:06:43,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:06:44,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:06:48,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:06:50,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:06:53,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:06:53,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:06:53,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 11:06:54,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:06:56,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 11:06:57,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:06:57,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:07:03,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 11:07:06,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:07:09,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:07:09,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:09,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 11:07:09,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:07:09,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:07:15,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 11:07:16,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 11:07:17,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:07:22,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=857740.0, ans=0.125 2023-10-02 11:07:25,027 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:07:26,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=857806.6666666666, ans=0.0 2023-10-02 11:07:27,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:27,919 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:07:35,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:35,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 11:07:35,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:36,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:38,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=857806.6666666666, ans=0.0 2023-10-02 11:07:41,094 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. 
Duration: 0.768 2023-10-02 11:07:41,575 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-10-02 11:07:44,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:49,927 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 11:07:54,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:07:55,389 INFO [train.py:1046] (1/4) Epoch 25, batch 1200, loss[loss=0.1766, simple_loss=0.2518, pruned_loss=0.05075, over 23214.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2471, pruned_loss=0.04686, over 4693167.31 frames. ], batch size: 119, lr: 4.12e-03, grad_scale: 32.0 2023-10-02 11:07:55,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:07:55,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:07:56,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:07:57,640 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.17 vs. limit=6.0 2023-10-02 11:08:00,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:04,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 11:08:04,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:08:06,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:06,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:06,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:08:07,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:08:10,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:08:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:12,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:08:15,389 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 11:08:16,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 11:08:19,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:08:21,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 11:08:21,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=858006.6666666666, ans=0.1 2023-10-02 11:08:22,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:25,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=858073.3333333334, ans=0.0 2023-10-02 11:08:26,026 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-10-02 11:08:26,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:08:26,434 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 11:08:27,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:30,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=858073.3333333334, ans=0.125 2023-10-02 11:08:32,519 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.868e+02 2.056e+02 2.360e+02 3.745e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-02 11:08:36,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:08:36,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:08:36,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 11:08:37,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:08:40,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 11:08:45,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 11:08:45,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:45,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=858140.0, ans=0.125 2023-10-02 11:08:46,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:08:46,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:08:48,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:08:48,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:49,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:08:49,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 11:08:50,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 11:08:52,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:08:52,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:08:52,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:08:52,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=858140.0, ans=0.0 2023-10-02 11:08:54,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:54,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:08:57,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:08:58,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=858206.6666666666, ans=0.0 2023-10-02 11:08:59,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:09:03,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. 
Duration: 0.97 2023-10-02 11:09:05,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=858206.6666666666, ans=0.125 2023-10-02 11:09:06,843 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 11:09:08,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:09:09,462 INFO [train.py:1046] (1/4) Epoch 25, batch 1250, loss[loss=0.1519, simple_loss=0.2351, pruned_loss=0.03436, over 24451.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2476, pruned_loss=0.04695, over 4706819.12 frames. ], batch size: 63, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:09:10,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:09:11,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:09:12,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:09:16,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 11:09:19,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=858273.3333333334, ans=0.2 2023-10-02 11:09:20,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:09:21,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 11:09:21,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 11:09:24,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:09:25,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:09:28,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:09:28,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=858340.0, ans=0.1 2023-10-02 11:09:30,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:31,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:09:31,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:09:34,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:09:38,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:09:39,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:09:39,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 11:09:39,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=858406.6666666666, ans=0.0 2023-10-02 11:09:40,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:09:40,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:44,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:09:44,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=858406.6666666666, ans=0.05 2023-10-02 11:09:45,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:09:48,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 11:09:50,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:09:53,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:09:54,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 11:09:55,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:55,736 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. 
Duration: 0.541 2023-10-02 11:09:55,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:55,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:58,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:10:01,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:10:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:10:04,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 11:10:04,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 11:10:05,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 11:10:09,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:10,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 11:10:10,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:10:12,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-10-02 11:10:13,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:10:15,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 11:10:15,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:10:16,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:10:16,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:10:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:10:17,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 11:10:21,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:10:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:10:23,950 INFO [train.py:1046] (1/4) Epoch 25, batch 1300, loss[loss=0.1744, simple_loss=0.2606, pruned_loss=0.04406, over 24293.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2479, pruned_loss=0.04695, over 4705473.20 frames. ], batch size: 74, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:10:24,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 11:10:25,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:10:28,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:10:29,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 11:10:32,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:35,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:10:37,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:10:39,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:10:40,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:10:40,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 11:10:44,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:10:44,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:10:46,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. 
Duration: 0.88 2023-10-02 11:10:51,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:10:54,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:10:55,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:10:56,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:58,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:10:58,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:10:59,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:10:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 11:11:03,928 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.851e+02 2.097e+02 2.493e+02 3.634e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 11:11:05,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:11:05,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:11:07,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. 
Duration: 0.62 2023-10-02 11:11:07,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:11:07,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=858806.6666666666, ans=0.1 2023-10-02 11:11:09,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:11:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:11:13,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 11:11:14,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:11:14,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 11:11:15,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:11:19,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:11:20,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:11:21,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 11:11:22,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-10-02 11:11:23,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 11:11:27,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:11:30,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 11:11:31,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:11:37,873 INFO [train.py:1046] (1/4) Epoch 25, batch 1350, loss[loss=0.1627, simple_loss=0.2169, pruned_loss=0.05424, over 19136.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2471, pruned_loss=0.04681, over 4699589.10 frames. ], batch size: 388, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:11:40,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 11:11:42,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:11:43,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:11:48,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:11:48,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:11:51,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 11:11:51,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:11:55,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:11:56,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 11:11:58,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:11:59,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:12:01,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.96 vs. limit=12.0 2023-10-02 11:12:02,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 11:12:02,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:12:03,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:12:03,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 11:12:03,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=859006.6666666666, ans=0.0 2023-10-02 11:12:04,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. 
Duration: 0.86 2023-10-02 11:12:08,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 11:12:08,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:09,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=859073.3333333334, ans=0.125 2023-10-02 11:12:10,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 11:12:19,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:26,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:26,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:26,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 11:12:30,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:30,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 11:12:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:12:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 11:12:33,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:12:37,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 11:12:37,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=859206.6666666666, ans=0.2 2023-10-02 11:12:38,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:12:43,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 11:12:44,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 11:12:51,749 INFO [train.py:1046] (1/4) Epoch 25, batch 1400, loss[loss=0.1853, simple_loss=0.2508, pruned_loss=0.05994, over 23788.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2457, pruned_loss=0.04649, over 4698630.07 frames. ], batch size: 179, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:12:53,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 11:12:54,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:57,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:12:57,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:12:59,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=859273.3333333334, ans=0.0 2023-10-02 11:13:00,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 11:13:01,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 11:13:05,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=859340.0, ans=0.125 2023-10-02 11:13:05,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=859340.0, ans=0.05 2023-10-02 11:13:06,668 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:13:10,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:13:11,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:13:13,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:13:15,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:13:19,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:13:19,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-02 11:13:22,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=859406.6666666666, ans=0.1 2023-10-02 11:13:29,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:30,870 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.837e+02 2.107e+02 2.511e+02 3.281e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-02 11:13:30,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:34,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=859473.3333333334, ans=0.125 2023-10-02 11:13:35,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 11:13:37,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:13:38,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:13:38,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:13:40,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:13:40,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 11:13:41,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:13:41,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:13:43,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 11:13:43,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:13:47,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:50,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:13:50,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=859540.0, ans=0.04949747468305833 2023-10-02 11:13:58,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 11:13:59,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:14:00,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:14:02,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 11:14:02,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:14:03,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=859606.6666666666, ans=0.125 2023-10-02 11:14:04,956 INFO [train.py:1046] (1/4) Epoch 25, batch 1450, loss[loss=0.194, simple_loss=0.2688, pruned_loss=0.05963, over 23404.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2447, pruned_loss=0.0463, over 4691060.91 frames. ], batch size: 93, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:14:05,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:14:07,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-10-02 11:14:08,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:14:11,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:14:11,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:11,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 11:14:14,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:14,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:14:17,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:14:17,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 11:14:17,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:14:19,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 11:14:19,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:20,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:20,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 11:14:22,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:14:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:14:24,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 11:14:25,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:25,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:14:26,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:14:30,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:32,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:14:32,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:14:35,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:36,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:37,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:38,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:14:39,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:39,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:14:43,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 11:14:45,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:14:49,369 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. 
Duration: 0.896 2023-10-02 11:14:50,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:14:52,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:14:52,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:14:54,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 11:14:55,675 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.70 vs. limit=6.0 2023-10-02 11:14:58,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:00,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 11:15:00,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.87 vs. limit=6.0 2023-10-02 11:15:01,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 11:15:01,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=859806.6666666666, ans=0.0 2023-10-02 11:15:03,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 11:15:03,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=859873.3333333334, ans=0.125 2023-10-02 11:15:07,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:15:07,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:15:10,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 11:15:12,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 11:15:12,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 11:15:12,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.73 vs. limit=22.5 2023-10-02 11:15:13,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:15,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:15:19,145 INFO [train.py:1046] (1/4) Epoch 25, batch 1500, loss[loss=0.1768, simple_loss=0.2477, pruned_loss=0.05299, over 23710.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2446, pruned_loss=0.04628, over 4689494.80 frames. ], batch size: 164, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:15:25,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. 
Duration: 0.77 2023-10-02 11:15:25,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:15:25,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:15:25,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:27,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:15:28,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:15:30,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 11:15:32,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:15:32,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:15:32,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:15:33,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:15:35,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:15:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:15:41,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:15:41,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 11:15:41,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:15:43,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:15:44,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:47,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 11:15:50,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 11:15:51,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:53,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 11:15:56,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:15:58,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 11:15:59,382 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.797e+02 1.991e+02 2.155e+02 3.181e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-02 11:15:59,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:59,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 11:16:00,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:16:02,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:16:02,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 11:16:03,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:16:07,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:16:07,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 11:16:11,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=860140.0, ans=0.0 2023-10-02 11:16:14,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 11:16:16,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:16:20,872 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 11:16:20,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:20,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 11:16:23,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:23,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:16:23,758 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 11:16:25,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:16:29,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 11:16:30,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:30,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=860206.6666666666, ans=0.125 2023-10-02 11:16:32,938 INFO [train.py:1046] (1/4) Epoch 25, batch 1550, loss[loss=0.1849, simple_loss=0.2575, pruned_loss=0.0562, over 23811.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2455, pruned_loss=0.0461, over 4699739.97 frames. 
], batch size: 164, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:16:34,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:16:34,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:34,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:16:34,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:35,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:16:37,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=860273.3333333334, ans=0.0 2023-10-02 11:16:38,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 11:16:38,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 11:16:38,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:16:40,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 11:16:40,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-10-02 11:16:40,899 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.67 vs. limit=15.0 2023-10-02 11:16:42,024 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-10-02 11:16:42,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:44,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:44,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:16:45,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:16:46,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:47,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:50,777 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 11:16:50,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:50,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:16:52,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 11:16:53,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:16:53,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 11:16:55,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 11:16:57,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 11:16:57,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=860340.0, ans=0.5 2023-10-02 11:16:58,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 11:16:58,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:58,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:01,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:17:03,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 11:17:03,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-02 11:17:08,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=860406.6666666666, ans=0.125 2023-10-02 11:17:11,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:11,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=860406.6666666666, ans=0.125 2023-10-02 11:17:15,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:17:15,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:17:15,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:17:16,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 11:17:22,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:17:22,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:17:22,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=860473.3333333334, ans=0.0 2023-10-02 11:17:22,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=860473.3333333334, ans=0.125 2023-10-02 11:17:24,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=860473.3333333334, ans=0.0 2023-10-02 11:17:26,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:17:29,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:17:29,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:29,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 11:17:30,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:17:32,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:17:32,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:32,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 11:17:32,329 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-10-02 11:17:35,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:17:40,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 11:17:43,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:17:45,899 INFO [train.py:1046] (1/4) Epoch 25, batch 1600, loss[loss=0.1429, simple_loss=0.2205, pruned_loss=0.0326, over 24296.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2467, pruned_loss=0.04614, over 4704855.86 frames. ], batch size: 56, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:17:45,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:46,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 11:17:47,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:17:49,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:17:49,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:17:49,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:17:51,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 11:17:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:17:55,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 11:17:55,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 11:17:58,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 11:18:00,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:18:01,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 11:18:03,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:18:05,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:18:08,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:18:11,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 11:18:13,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=860673.3333333334, ans=0.1 2023-10-02 11:18:14,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:18:14,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 11:18:15,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:15,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 11:18:20,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 11:18:20,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=860740.0, ans=0.125 2023-10-02 11:18:26,763 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.917e+02 2.071e+02 2.381e+02 2.970e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 11:18:28,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:18:28,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 11:18:28,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:18:28,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:18:28,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:18:33,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-02 11:18:37,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:18:37,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=860806.6666666666, ans=0.125 2023-10-02 11:18:39,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:18:39,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:39,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:41,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:18:42,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:18:43,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.87 vs. limit=15.0 2023-10-02 11:18:44,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:18:44,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 11:18:47,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=860873.3333333334, ans=0.0 2023-10-02 11:18:50,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:51,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:18:54,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 11:18:54,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:18:57,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 11:19:00,270 INFO [train.py:1046] (1/4) Epoch 25, batch 1650, loss[loss=0.1612, simple_loss=0.2405, pruned_loss=0.04094, over 24499.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2473, pruned_loss=0.04666, over 4712225.61 frames. ], batch size: 63, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:19:00,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:02,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:19:03,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:19:03,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-10-02 11:19:03,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 11:19:03,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 11:19:03,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 11:19:07,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:19:07,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:19:07,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:19:07,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:19:10,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:13,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 11:19:14,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:19:14,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:19:14,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 11:19:16,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:19:16,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 11:19:17,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 11:19:25,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:19:27,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:19:33,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 11:19:33,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:36,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 11:19:38,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:19:40,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=861073.3333333334, ans=0.125 2023-10-02 11:19:40,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=861073.3333333334, ans=0.05 2023-10-02 11:19:41,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:19:42,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:19:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:19:45,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:19:45,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:45,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=861140.0, ans=0.2 2023-10-02 11:19:48,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:49,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:49,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:19:50,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:19:51,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:19:51,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:19:55,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 11:19:57,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 11:19:58,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:19:59,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 11:20:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 11:20:00,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 11:20:01,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:01,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:20:01,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:20:01,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:20:01,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 11:20:06,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:20:07,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 11:20:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:20:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 11:20:14,111 INFO [train.py:1046] (1/4) Epoch 25, batch 1700, loss[loss=0.1876, simple_loss=0.2749, pruned_loss=0.05014, over 24330.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2467, pruned_loss=0.04664, over 4712759.12 frames. ], batch size: 74, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:20:15,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:20:15,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:20:15,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 11:20:16,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:20:16,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:20:16,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:20:18,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:20:18,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 11:20:18,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 11:20:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:20:23,559 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:20:24,999 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-10-02 11:20:28,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=861340.0, ans=0.0 2023-10-02 11:20:29,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:20:32,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:20:34,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:37,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=861340.0, ans=0.0 2023-10-02 11:20:37,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=861340.0, ans=0.0 2023-10-02 11:20:38,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:20:38,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 11:20:39,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:20:39,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:20:39,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:39,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=861340.0, ans=0.2 2023-10-02 11:20:41,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 11:20:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:20:43,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:43,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=861406.6666666666, ans=0.0 2023-10-02 11:20:43,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=861406.6666666666, ans=0.125 2023-10-02 11:20:45,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:20:46,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:20:47,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. 
Duration: 0.79 2023-10-02 11:20:47,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 11:20:49,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:51,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 11:20:51,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:20:55,051 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.911e+02 2.075e+02 2.352e+02 2.964e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-02 11:20:59,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:01,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:01,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:21:02,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:21:02,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 11:21:02,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:21:05,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:21:05,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 11:21:06,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:21:06,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:06,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:06,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:09,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:09,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:21:10,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:11,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:21:11,833 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=22.5 2023-10-02 11:21:12,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:21:15,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:21:15,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 11:21:18,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:19,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:21:22,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 11:21:28,785 INFO [train.py:1046] (1/4) Epoch 25, batch 1750, loss[loss=0.162, simple_loss=0.2481, pruned_loss=0.0379, over 24033.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2454, pruned_loss=0.04621, over 4721308.09 frames. ], batch size: 86, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:21:28,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:30,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:32,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:21:32,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 11:21:32,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:21:35,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:21:35,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:37,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=861606.6666666666, ans=0.125 2023-10-02 11:21:39,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 11:21:40,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:43,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 11:21:43,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:44,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:21:47,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:21:49,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 11:21:50,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:21:50,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-10-02 11:21:50,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=861673.3333333334, ans=0.125 2023-10-02 11:21:54,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=861673.3333333334, ans=0.0 2023-10-02 11:21:56,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.12 vs. limit=15.0 2023-10-02 11:21:59,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:22:02,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:02,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:22:06,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:08,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:22:09,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:22:09,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=861740.0, ans=0.125 2023-10-02 11:22:10,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:13,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 11:22:13,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:22:14,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 11:22:16,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:22:19,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 11:22:21,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:22:23,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:22:23,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:22:25,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:22:25,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 11:22:27,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:29,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:22:30,048 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.20 vs. 
limit=15.0 2023-10-02 11:22:33,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:22:35,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:22:36,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:22:39,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 11:22:39,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:40,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:22:40,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:22:40,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:22:40,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:22:41,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:22:42,937 INFO [train.py:1046] (1/4) Epoch 25, batch 1800, loss[loss=0.1559, simple_loss=0.2316, pruned_loss=0.04006, over 23529.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2448, pruned_loss=0.04597, over 4724671.98 frames. 
], batch size: 134, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:22:44,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:22:44,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:46,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:22:46,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=861940.0, ans=0.125 2023-10-02 11:22:50,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:51,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=861940.0, ans=0.125 2023-10-02 11:22:53,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:22:53,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:22:58,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:01,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:01,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:23:02,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:23:04,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:23:04,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 11:23:04,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:06,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=862006.6666666666, ans=0.125 2023-10-02 11:23:08,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:09,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=862006.6666666666, ans=0.0 2023-10-02 11:23:11,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 11:23:14,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 11:23:14,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 11:23:14,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:16,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:23:16,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:23:18,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:23:22,273 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.934e+02 2.294e+02 2.756e+02 4.950e+02, threshold=4.588e+02, percent-clipped=2.0 2023-10-02 11:23:23,774 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 11:23:25,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:23:27,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:30,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 11:23:30,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 11:23:30,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:23:30,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=862140.0, ans=0.1 2023-10-02 11:23:31,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:23:33,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 11:23:35,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=862140.0, ans=0.125 2023-10-02 11:23:38,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 11:23:40,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.61 vs. limit=5.0 2023-10-02 11:23:44,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:23:44,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 11:23:45,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:23:45,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:46,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:23:46,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 11:23:49,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:23:49,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:23:51,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-02 11:23:51,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:54,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:23:54,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:23:54,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:56,325 INFO [train.py:1046] (1/4) Epoch 25, batch 1850, loss[loss=0.1772, simple_loss=0.2482, pruned_loss=0.05308, over 23305.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2454, pruned_loss=0.04568, over 4730021.47 frames. ], batch size: 119, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:23:56,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:56,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:23:59,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:23:59,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:24:01,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:24:02,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 11:24:10,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:24:10,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 11:24:14,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 11:24:14,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=15.0 2023-10-02 11:24:15,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 11:24:18,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:24:18,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 11:24:18,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 11:24:30,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:24:32,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 11:24:34,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:24:34,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 11:24:40,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 11:24:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:24:40,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:24:42,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:24:44,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:24:45,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:24:48,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:24:49,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:24:49,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:24:51,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:24:51,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.46 vs. 
limit=15.0 2023-10-02 11:24:52,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:24:54,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:24:54,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.30 vs. limit=10.0 2023-10-02 11:24:56,339 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-10-02 11:24:57,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 11:24:57,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:25:01,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:25:03,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:25:03,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 11:25:03,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 11:25:04,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 11:25:05,940 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-02 11:25:07,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:25:07,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:25:07,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:25:09,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:09,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 11:25:09,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:25:09,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:10,616 INFO [train.py:1046] (1/4) Epoch 25, batch 1900, loss[loss=0.1722, simple_loss=0.2455, pruned_loss=0.04946, over 23365.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2461, pruned_loss=0.04599, over 4717629.94 frames. ], batch size: 119, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:25:10,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:25:12,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:25:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:25:14,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 11:25:16,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:16,767 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 11:25:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:25:18,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:25:23,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:25:23,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:25:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 11:25:26,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 11:25:28,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:25:30,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:25:30,060 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 11:25:30,083 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. 
Duration: 0.896 2023-10-02 11:25:33,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 11:25:35,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:25:39,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 11:25:40,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=862740.0, ans=0.025 2023-10-02 11:25:41,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 11:25:49,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 11:25:50,897 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.019e+02 2.414e+02 2.839e+02 5.766e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-02 11:25:52,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 11:25:52,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:53,847 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 11:25:53,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 11:25:53,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. 
Duration: 0.98 2023-10-02 11:25:53,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 11:25:53,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:25:54,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=862806.6666666666, ans=0.125 2023-10-02 11:25:56,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 11:25:59,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:26:04,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:26:04,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 11:26:05,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:26:08,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 11:26:08,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:26:12,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=862873.3333333334, ans=0.2 2023-10-02 11:26:14,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 11:26:14,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:26:14,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:26:17,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:26:18,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:26:19,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:26:20,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:26:23,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:26:23,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:26:24,534 INFO [train.py:1046] (1/4) Epoch 25, batch 1950, loss[loss=0.1561, simple_loss=0.2398, pruned_loss=0.03616, over 24647.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2466, pruned_loss=0.04609, over 4720689.92 frames. ], batch size: 65, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:26:25,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:26:25,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:26:26,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:26:27,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:26:30,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:26:32,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:26:33,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:33,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:26:35,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 11:26:35,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=862940.0, ans=0.125 2023-10-02 11:26:36,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 11:26:36,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:38,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:39,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 11:26:39,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=863006.6666666666, ans=0.2 2023-10-02 11:26:40,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:26:42,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:44,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:26:45,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:26:47,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:26:47,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:26:47,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:50,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:54,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:26:54,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:26:54,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 11:26:54,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 11:26:54,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:26:54,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:26:55,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:57,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:27:01,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:27:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:27:09,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:27:09,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:27:09,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 11:27:10,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:27:13,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 11:27:15,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:27:15,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:27:22,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:24,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:26,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:28,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:27:31,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:27:31,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:27:32,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 11:27:32,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:27:33,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:27:34,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. 
Duration: 0.75 2023-10-02 11:27:36,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:27:38,582 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.97 vs. limit=6.0 2023-10-02 11:27:39,093 INFO [train.py:1046] (1/4) Epoch 25, batch 2000, loss[loss=0.1729, simple_loss=0.2484, pruned_loss=0.04868, over 23711.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2477, pruned_loss=0.04666, over 4714277.20 frames. ], batch size: 135, lr: 4.11e-03, grad_scale: 32.0 2023-10-02 11:27:40,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:27:41,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:27:41,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:27:43,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:27:45,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:45,553 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:27:48,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 11:27:49,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 11:27:52,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:27:53,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 11:27:55,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:27:55,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:27:58,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:27:58,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 11:27:59,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:01,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:02,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:05,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 11:28:05,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:28:07,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. 
Duration: 0.93 2023-10-02 11:28:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:28:10,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:10,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:28:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:11,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:11,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:28:12,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 11:28:13,540 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.02 vs. limit=12.0 2023-10-02 11:28:14,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 11:28:15,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:28:15,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:19,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:28:20,915 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.859e+02 2.034e+02 2.311e+02 3.192e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 11:28:21,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:28:21,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:28:21,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:28:21,778 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.77 vs. limit=15.0 2023-10-02 11:28:25,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:25,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:25,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:28:25,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:27,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 11:28:32,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 11:28:35,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:28:36,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:38,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:38,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:28:40,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=863540.0, ans=0.5 2023-10-02 11:28:44,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:45,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:45,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:47,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:28:47,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:28:50,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 11:28:50,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:52,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.63 vs. limit=15.0 2023-10-02 11:28:53,313 INFO [train.py:1046] (1/4) Epoch 25, batch 2050, loss[loss=0.1521, simple_loss=0.2376, pruned_loss=0.03334, over 24635.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2472, pruned_loss=0.04645, over 4713904.34 frames. ], batch size: 68, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:28:53,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:55,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:57,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=863606.6666666666, ans=0.1 2023-10-02 11:28:58,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:59,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:29:01,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:29:01,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:29:04,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-10-02 11:29:04,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:29:05,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:29:05,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:29:16,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:29:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:29:16,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=863673.3333333334, ans=0.1 2023-10-02 11:29:17,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 11:29:20,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:29:21,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 11:29:21,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:29:22,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=863740.0, ans=0.125 2023-10-02 11:29:24,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 11:29:29,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:29:29,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:29:30,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:29:32,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:29:33,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:29:33,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:29:35,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=863740.0, ans=0.0 2023-10-02 11:29:38,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:29:38,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:29:39,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:29:41,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:29:43,518 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:29:45,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:29:51,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:29:51,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 11:29:57,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:29:57,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:29:58,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:30:00,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 11:30:05,774 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 11:30:05,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:05,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:30:07,106 INFO [train.py:1046] (1/4) Epoch 25, batch 2100, loss[loss=0.1674, simple_loss=0.2609, pruned_loss=0.0369, over 24607.00 frames. 
], tot_loss[loss=0.1691, simple_loss=0.2462, pruned_loss=0.04604, over 4720354.32 frames. ], batch size: 71, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:30:07,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:30:07,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:30:09,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 11:30:09,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 11:30:11,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:30:14,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:30:14,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:30:16,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:17,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:30:17,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 11:30:19,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 11:30:20,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 11:30:20,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 11:30:22,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:22,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:30:22,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 11:30:24,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 11:30:28,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 11:30:28,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:30:31,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:30:32,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:30:35,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:30:36,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. 
Duration: 0.69 2023-10-02 11:30:36,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:37,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 11:30:39,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 11:30:40,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:41,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 11:30:41,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 11:30:43,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 11:30:44,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:30:45,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:30:48,952 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.897e+02 2.155e+02 2.514e+02 3.500e+02, threshold=4.310e+02, percent-clipped=0.0 2023-10-02 11:30:49,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:30:51,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 11:30:53,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:53,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:53,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 11:30:53,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:53,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:55,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:55,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 11:30:55,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 11:30:56,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 11:30:59,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:31:02,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:31:03,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-10-02 11:31:08,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:10,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:31:11,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:31:11,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:31:11,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 11:31:11,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:31:14,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:14,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:31:14,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:31:14,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:17,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 11:31:18,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. 
Duration: 0.67 2023-10-02 11:31:18,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:21,322 INFO [train.py:1046] (1/4) Epoch 25, batch 2150, loss[loss=0.1622, simple_loss=0.2464, pruned_loss=0.03899, over 24448.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2454, pruned_loss=0.04584, over 4714779.64 frames. ], batch size: 63, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:31:21,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:31:21,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:31:23,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:31:23,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:31:28,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 11:31:30,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:31,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:31,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:31:31,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:31:31,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:31:31,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=864273.3333333334, ans=0.125 2023-10-02 11:31:34,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:36,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:31:36,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:31:40,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:40,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 11:31:45,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:31:46,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:31:46,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:48,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 11:31:48,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:48,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:31:49,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:49,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:31:50,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:52,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 11:31:52,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=864406.6666666666, ans=0.125 2023-10-02 11:31:55,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:31:55,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:55,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:31:58,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:31:59,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:32:01,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:32:02,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:32:02,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:32:02,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 11:32:04,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:32:04,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=864473.3333333334, ans=0.0 2023-10-02 11:32:05,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:32:06,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:08,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:32:09,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:32:09,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:12,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:32:12,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 11:32:14,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 11:32:14,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:32:14,696 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 11:32:14,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:16,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:32:16,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 11:32:16,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:32:16,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 11:32:18,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 11:32:18,107 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 11:32:18,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 11:32:19,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:32:19,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:32:19,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:32:19,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=864540.0, ans=0.0 2023-10-02 11:32:20,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:22,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:32:23,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:23,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:33,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:32:33,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 11:32:34,874 INFO [train.py:1046] (1/4) Epoch 25, batch 2200, loss[loss=0.1763, simple_loss=0.2497, pruned_loss=0.05148, over 23721.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2458, pruned_loss=0.04584, over 4727748.59 frames. ], batch size: 212, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:32:36,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:32:40,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:42,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:32:42,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:32:44,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:32:44,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=864606.6666666666, ans=0.125 2023-10-02 11:32:44,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=864606.6666666666, ans=0.07 2023-10-02 11:32:45,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:45,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:32:47,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 11:32:52,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 11:32:52,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:32:58,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. 
Duration: 0.77 2023-10-02 11:33:02,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:02,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:33:04,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:33:06,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:33:06,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 11:33:07,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=864740.0, ans=0.2 2023-10-02 11:33:11,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:33:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:13,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=864740.0, ans=0.125 2023-10-02 11:33:14,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 11:33:16,289 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.821e+02 1.981e+02 2.213e+02 3.022e+02, threshold=3.961e+02, percent-clipped=0.0 2023-10-02 11:33:16,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 11:33:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:33:19,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:33:20,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:22,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 11:33:23,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:25,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 11:33:26,138 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.51 vs. limit=12.0 2023-10-02 11:33:27,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:27,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:33:28,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:29,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:33:29,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:33:29,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:29,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:32,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:33:32,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:33:34,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:33:36,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=864873.3333333334, ans=0.0 2023-10-02 11:33:37,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:33:37,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:33:41,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:33:41,706 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 11:33:42,978 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.40 vs. limit=6.0 2023-10-02 11:33:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 11:33:45,612 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 11:33:45,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:33:47,105 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 11:33:48,993 INFO [train.py:1046] (1/4) Epoch 25, batch 2250, loss[loss=0.1847, simple_loss=0.254, pruned_loss=0.05766, over 23798.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2458, pruned_loss=0.04583, over 4728039.78 frames. ], batch size: 164, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:33:49,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:49,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:33:50,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:52,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 11:33:53,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:33:54,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=864940.0, ans=0.1 2023-10-02 11:33:56,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 11:34:02,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:34:04,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:34:06,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:06,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:34:07,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:34:10,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 11:34:10,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:34:10,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:34:13,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 11:34:13,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:34:15,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 11:34:15,843 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.08 vs. limit=15.0 2023-10-02 11:34:17,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:34:23,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:34:24,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:34:24,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:34:24,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=865073.3333333334, ans=0.125 2023-10-02 11:34:24,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=865073.3333333334, ans=0.2 2023-10-02 11:34:26,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 11:34:26,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:28,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:34:33,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:34:34,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:34:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:34:36,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:34:37,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:34:40,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:34:44,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:34:46,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:34:47,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=865206.6666666666, ans=0.125 2023-10-02 11:34:51,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:34:51,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:34:51,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:34:57,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 11:35:00,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:35:00,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 11:35:00,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:01,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:35:02,456 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=22.5 2023-10-02 11:35:03,270 INFO [train.py:1046] (1/4) Epoch 25, batch 2300, loss[loss=0.1554, simple_loss=0.2374, pruned_loss=0.03668, over 24450.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2468, pruned_loss=0.04612, over 4734604.25 frames. ], batch size: 66, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:35:05,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 11:35:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:35:07,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.56 vs. limit=15.0 2023-10-02 11:35:08,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:35:09,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.01 vs. limit=15.0 2023-10-02 11:35:12,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:13,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:35:16,032 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 11:35:18,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:19,750 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:35:22,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:35:24,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:35:24,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:24,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 11:35:26,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 11:35:27,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:35:27,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:35:30,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:35:31,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.30 vs. limit=22.5 2023-10-02 11:35:33,946 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.84 vs. limit=15.0 2023-10-02 11:35:34,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:35:35,363 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.02 vs. limit=15.0 2023-10-02 11:35:36,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:35:40,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:35:40,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:43,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 11:35:44,371 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.801e+02 2.122e+02 2.550e+02 3.360e+02, threshold=4.244e+02, percent-clipped=0.0 2023-10-02 11:35:45,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:47,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.53 vs. limit=22.5 2023-10-02 11:35:49,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:35:49,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:35:49,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:35:49,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 11:35:49,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=865473.3333333334, ans=0.125 2023-10-02 11:35:54,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:35:54,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:55,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:35:55,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:35:55,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:35:57,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 11:35:59,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:35:59,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 11:35:59,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:35:59,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:36:00,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 11:36:00,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=865473.3333333334, ans=0.0 2023-10-02 11:36:03,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:36:05,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=865540.0, ans=0.0 2023-10-02 11:36:07,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:36:10,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=865540.0, ans=0.125 2023-10-02 11:36:11,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:36:12,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:36:12,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:36:14,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:36:14,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:36:14,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:36:16,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 11:36:17,545 INFO [train.py:1046] (1/4) Epoch 25, batch 2350, loss[loss=0.1969, simple_loss=0.278, pruned_loss=0.05785, over 24335.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2481, pruned_loss=0.04678, over 4726424.38 frames. ], batch size: 77, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:36:19,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=865606.6666666666, ans=0.2 2023-10-02 11:36:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 11:36:21,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 11:36:26,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.13 vs. limit=10.0 2023-10-02 11:36:28,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 11:36:31,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:36:32,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:36:32,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:36:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:36:34,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:36:35,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 11:36:36,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=865673.3333333334, ans=0.125 2023-10-02 11:36:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 11:36:40,778 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.24 vs. limit=22.5 2023-10-02 11:36:41,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=865673.3333333334, ans=0.025 2023-10-02 11:36:44,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=865673.3333333334, ans=0.125 2023-10-02 11:36:45,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 11:36:45,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=865740.0, ans=0.09899494936611666 2023-10-02 11:36:46,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:36:46,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=865740.0, ans=0.2 2023-10-02 11:36:49,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:36:49,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:36:51,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=865740.0, ans=0.125 2023-10-02 11:36:52,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 11:36:54,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 11:36:55,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:36:57,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:36:57,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:36:57,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:37:02,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:37:04,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 11:37:04,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:37:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:37:06,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:37:09,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 11:37:09,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 11:37:13,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 11:37:13,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:37:13,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=865806.6666666666, ans=0.125 2023-10-02 11:37:14,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=865873.3333333334, ans=0.125 2023-10-02 11:37:14,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=865873.3333333334, ans=0.0 2023-10-02 11:37:17,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 11:37:20,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 11:37:20,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:37:22,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 11:37:22,195 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 11:37:22,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. 
Duration: 0.51 2023-10-02 11:37:22,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=865873.3333333334, ans=0.0 2023-10-02 11:37:24,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 11:37:29,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:37:31,494 INFO [train.py:1046] (1/4) Epoch 25, batch 2400, loss[loss=0.1868, simple_loss=0.2658, pruned_loss=0.05389, over 23192.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2474, pruned_loss=0.04651, over 4723909.84 frames. ], batch size: 105, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:37:34,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:37:36,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.07 vs. limit=12.0 2023-10-02 11:37:38,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:37:38,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:37:38,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 11:37:39,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 11:37:45,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 11:37:45,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:37:46,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 11:37:46,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:37:48,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:37:49,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 11:37:50,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=866006.6666666666, ans=0.0 2023-10-02 11:37:51,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=866006.6666666666, ans=0.0 2023-10-02 11:37:54,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:37:55,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 11:38:02,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:38:05,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 11:38:07,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:38:09,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:12,533 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.315e+02 1.806e+02 1.972e+02 2.221e+02 3.865e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-02 11:38:12,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=866073.3333333334, ans=0.125 2023-10-02 11:38:13,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:38:15,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 11:38:15,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:38:21,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:24,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:38:25,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:38:27,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:38:27,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 11:38:27,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:38:27,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:29,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:38:29,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:38:33,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:38:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:38:35,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 11:38:36,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 11:38:39,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:38:39,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:39,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 11:38:40,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. 
Duration: 0.91 2023-10-02 11:38:40,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 11:38:40,800 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 11:38:42,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 11:38:43,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:38:44,986 INFO [train.py:1046] (1/4) Epoch 25, batch 2450, loss[loss=0.1525, simple_loss=0.2279, pruned_loss=0.03853, over 24598.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2462, pruned_loss=0.04655, over 4709906.86 frames. ], batch size: 60, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:38:45,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:45,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:38:46,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 11:38:46,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:47,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:38:49,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 11:38:51,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:38:54,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:38:54,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:38:54,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 11:39:00,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:39:00,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:03,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:39:05,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:39:05,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:39:05,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 11:39:09,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:12,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 11:39:12,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:39:16,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:39:16,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:17,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:17,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:39:20,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 11:39:21,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:39:27,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:29,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:30,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:39:30,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:39:30,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:39:32,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:39:32,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 11:39:36,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:36,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:39:39,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:39:39,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:39:43,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:39:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 11:39:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:39:46,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:39:46,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 11:39:46,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:39:48,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:39:51,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:39:53,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:53,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:39:58,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 11:39:58,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:40:00,117 INFO [train.py:1046] (1/4) Epoch 25, batch 2500, loss[loss=0.1658, simple_loss=0.2231, pruned_loss=0.05424, over 18979.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2459, pruned_loss=0.04629, over 4709163.97 frames. ], batch size: 388, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:40:05,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:40:13,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=866673.3333333334, ans=0.1 2023-10-02 11:40:14,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:40:15,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:40:16,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:40:16,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 11:40:23,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:40:23,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:40:25,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:40:25,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 11:40:25,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 11:40:27,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:28,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:40:28,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 11:40:28,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:29,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-10-02 11:40:29,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:32,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.19 vs. limit=15.0 2023-10-02 11:40:35,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:40:36,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:40:39,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:40:39,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 11:40:39,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:40:39,434 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:40:39,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=866740.0, ans=0.125 2023-10-02 11:40:41,815 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.456e+02 1.832e+02 2.049e+02 2.331e+02 3.606e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 11:40:41,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:40:44,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:47,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:50,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:40:56,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:40:57,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 11:40:59,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:40:59,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:41:00,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:41:00,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:41:02,387 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 11:41:02,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 11:41:02,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. 
Duration: 0.7 2023-10-02 11:41:04,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:41:05,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 11:41:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 11:41:07,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:41:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 11:41:11,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 11:41:12,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:41:14,053 INFO [train.py:1046] (1/4) Epoch 25, batch 2550, loss[loss=0.1905, simple_loss=0.265, pruned_loss=0.05801, over 23835.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2458, pruned_loss=0.04628, over 4707391.21 frames. ], batch size: 195, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:41:15,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:41:15,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:41:16,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:41:18,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 11:41:20,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:41:22,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 11:41:24,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:41:26,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:28,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:41:28,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 11:41:29,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:41:29,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:41:29,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:41:33,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:41:33,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. 
Duration: 0.94 2023-10-02 11:41:33,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:41:33,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:33,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 11:41:39,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=867006.6666666666, ans=0.125 2023-10-02 11:41:44,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=867073.3333333334, ans=0.0 2023-10-02 11:41:48,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:41:54,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:41:54,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:54,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:41:54,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=867073.3333333334, ans=0.125 2023-10-02 11:41:54,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=867073.3333333334, ans=0.125 2023-10-02 11:41:55,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 11:41:59,681 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=22.5 2023-10-02 11:42:03,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:42:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:42:06,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:42:06,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:42:06,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:42:06,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=867140.0, ans=0.0 2023-10-02 11:42:08,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:42:11,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:42:11,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:42:11,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=867206.6666666666, ans=0.0 2023-10-02 11:42:16,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 11:42:16,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 11:42:16,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:42:16,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:42:17,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:42:18,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=867206.6666666666, ans=0.125 2023-10-02 11:42:19,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:42:20,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:26,622 INFO [train.py:1046] (1/4) Epoch 25, batch 2600, loss[loss=0.1798, simple_loss=0.2564, pruned_loss=0.05162, over 23311.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2467, pruned_loss=0.0465, over 4706922.97 frames. ], batch size: 93, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:42:26,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:42:28,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:30,983 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. 
Duration: 0.896 2023-10-02 11:42:32,455 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 11:42:32,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:42:33,770 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 11:42:33,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 11:42:33,864 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 11:42:37,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:42:38,981 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 11:42:40,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 11:42:42,232 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 11:42:43,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:42:45,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 11:42:45,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=867340.0, ans=0.125 2023-10-02 11:42:46,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. 
Duration: 0.77 2023-10-02 11:42:47,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:42:47,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 11:42:50,520 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 11:42:50,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 11:42:56,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:42:56,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:57,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:42:57,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 11:43:00,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:43:05,631 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 11:43:10,119 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.853e+02 2.071e+02 2.375e+02 3.577e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 11:43:12,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:43:12,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:12,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 11:43:13,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:43:13,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:43:14,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 11:43:17,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:43:17,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:43:20,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:23,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 11:43:24,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:24,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:43:29,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:43:30,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:43:30,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 11:43:31,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:43:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:43:33,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:43:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 11:43:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:39,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:43:41,632 INFO [train.py:1046] (1/4) Epoch 25, batch 2650, loss[loss=0.1828, simple_loss=0.2714, pruned_loss=0.04716, over 24307.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2467, pruned_loss=0.04625, over 4715890.16 frames. ], batch size: 74, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:43:44,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 11:43:44,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:43:46,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:43:46,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=867606.6666666666, ans=0.1 2023-10-02 11:43:47,252 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 11:43:47,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:43:51,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:51,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:43:53,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:43:55,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:55,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 11:43:55,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:43:55,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:43:57,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.01 vs. 
limit=15.0 2023-10-02 11:43:59,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 11:43:59,196 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 11:44:02,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:03,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 11:44:03,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:04,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=867673.3333333334, ans=0.2 2023-10-02 11:44:05,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 11:44:11,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:11,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:44:11,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:12,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:15,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. 
Duration: 0.44 2023-10-02 11:44:16,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 11:44:19,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:44:23,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 11:44:23,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:25,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:25,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:44:25,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:44:26,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:26,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:44:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:44:30,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:44:30,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 11:44:30,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=867806.6666666666, ans=0.2 2023-10-02 11:44:31,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:44:31,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:31,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=867806.6666666666, ans=0.125 2023-10-02 11:44:33,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:44:35,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:44:36,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:44:39,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:41,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:44:41,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:41,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-10-02 11:44:46,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:47,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:48,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:48,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:44:50,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:44:51,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:44:54,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:44:54,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 11:44:55,442 INFO [train.py:1046] (1/4) Epoch 25, batch 2700, loss[loss=0.1614, simple_loss=0.2435, pruned_loss=0.03963, over 24453.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2472, pruned_loss=0.04623, over 4719848.07 frames. ], batch size: 63, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:44:56,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 11:44:57,131 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:44:58,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 11:45:00,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:45:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:01,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:02,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:45:02,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:45:02,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:45:02,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:45:02,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 11:45:04,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:45:05,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 11:45:05,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=867940.0, ans=0.1 2023-10-02 11:45:07,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:45:07,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:45:10,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=868006.6666666666, ans=0.125 2023-10-02 11:45:12,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:45:14,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 11:45:14,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:45:17,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=868006.6666666666, ans=0.2 2023-10-02 11:45:18,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:45:18,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 11:45:19,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=868006.6666666666, ans=10.0 2023-10-02 11:45:22,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:45:22,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:45:22,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=868006.6666666666, ans=0.2 2023-10-02 11:45:23,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:45:23,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:45:24,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=868073.3333333334, ans=0.0 2023-10-02 11:45:25,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=868073.3333333334, ans=0.0 2023-10-02 11:45:26,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:45:29,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:45:29,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 11:45:29,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:45:34,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:45:39,108 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.430e+02 1.840e+02 2.089e+02 2.422e+02 3.725e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 11:45:43,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=868140.0, ans=0.125 2023-10-02 11:45:44,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:45:44,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:45:47,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:45:47,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:45:51,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:52,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:45:53,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=868206.6666666666, ans=0.1 2023-10-02 11:45:54,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:45:54,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:45:56,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:56,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:45:58,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:46:00,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:46:00,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:46:03,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 11:46:04,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:06,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 11:46:06,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 11:46:09,175 INFO [train.py:1046] (1/4) Epoch 25, batch 2750, loss[loss=0.1729, simple_loss=0.2476, pruned_loss=0.04908, over 24581.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2475, pruned_loss=0.04613, over 4726671.16 frames. ], batch size: 60, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:46:09,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 11:46:09,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:12,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:13,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:46:15,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:15,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:46:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:19,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:46:21,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 11:46:21,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:46:21,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:21,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 11:46:21,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:46:21,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:26,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 11:46:28,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:46:28,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:29,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:46:29,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:46:31,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:46:33,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 11:46:34,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:35,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:38,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:46:40,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:46:40,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:46:42,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:43,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:46:50,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:52,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:46:53,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:46:56,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:46:56,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:46:56,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:47:02,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:47:02,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:47:02,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 11:47:06,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:08,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 11:47:13,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=868540.0, ans=0.0 2023-10-02 11:47:15,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:47:17,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:47:17,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 11:47:17,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 11:47:19,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:47:19,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 11:47:19,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:47:22,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 11:47:22,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:22,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:47:24,081 INFO [train.py:1046] (1/4) Epoch 25, batch 2800, loss[loss=0.1594, simple_loss=0.2501, pruned_loss=0.03439, over 24487.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2456, pruned_loss=0.04577, over 4719279.06 frames. ], batch size: 66, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:47:24,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 11:47:24,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:47:24,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:25,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:47:25,664 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 11:47:25,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 11:47:28,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=868606.6666666666, ans=0.09899494936611666 2023-10-02 11:47:29,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:32,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:47:32,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:47:34,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:47:37,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 11:47:40,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 11:47:41,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 11:47:41,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:43,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 11:47:43,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:47:47,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:47:47,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:47,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:47:48,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:47:56,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:47:57,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:59,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:00,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:48:01,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=868740.0, ans=0.125 2023-10-02 11:48:02,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:48:04,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:48:04,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 11:48:05,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=868740.0, ans=0.0 2023-10-02 11:48:06,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:08,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:48:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:48:09,359 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.857e+02 2.100e+02 2.533e+02 3.672e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-02 11:48:12,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:12,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:15,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:48:16,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:48:18,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:48:18,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:48:19,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:48:19,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:48:21,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:48:21,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 11:48:21,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:22,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:48:22,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:23,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 11:48:25,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:25,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:48:25,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 11:48:27,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 11:48:33,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:48:33,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:48:35,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:48:36,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:48:38,047 INFO [train.py:1046] (1/4) Epoch 25, batch 2850, loss[loss=0.1577, simple_loss=0.2329, pruned_loss=0.0413, over 24409.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2442, pruned_loss=0.04539, over 4711370.71 frames. ], batch size: 58, lr: 4.10e-03, grad_scale: 8.0 2023-10-02 11:48:41,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:48:41,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:48:41,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=868940.0, ans=0.125 2023-10-02 11:48:41,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=868940.0, ans=0.0 2023-10-02 11:48:42,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:48:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:45,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:47,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:48:47,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 11:48:52,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 11:48:52,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:48:54,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 11:48:55,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:58,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 11:48:58,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 11:49:01,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 11:49:04,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=869006.6666666666, ans=0.125 2023-10-02 11:49:14,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:49:15,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:49:15,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:49:17,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:49:17,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:49:17,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:49:20,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:49:20,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 11:49:21,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:49:21,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:49:22,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:49:22,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:25,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:49:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:49:25,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:27,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:49:29,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:49:30,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:30,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:32,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:49:38,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:49:39,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 11:49:39,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. 
Duration: 0.84 2023-10-02 11:49:41,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:49:43,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:49:43,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 11:49:44,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:49:44,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:49:44,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:49:44,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:49:44,596 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 11:49:45,957 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 11:49:45,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:49:46,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:51,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 11:49:51,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:49:51,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.83 vs. limit=10.0 2023-10-02 11:49:52,683 INFO [train.py:1046] (1/4) Epoch 25, batch 2900, loss[loss=0.1494, simple_loss=0.2306, pruned_loss=0.03412, over 24592.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2449, pruned_loss=0.04578, over 4705865.13 frames. ], batch size: 60, lr: 4.10e-03, grad_scale: 8.0 2023-10-02 11:49:52,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:49:52,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 11:49:57,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:57,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 11:49:58,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 11:49:58,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=869273.3333333334, ans=0.125 2023-10-02 11:49:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:49:59,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 11:50:01,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:50:04,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:50:08,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:50:08,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:50:11,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:50:12,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 11:50:12,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:50:14,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:16,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 11:50:17,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 11:50:18,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:50:18,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. 
Duration: 0.48 2023-10-02 11:50:20,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:50:21,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:50:21,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:50:23,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:50:23,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:25,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:50:27,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:50:28,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 11:50:30,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 11:50:30,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:50:31,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.65 vs. limit=6.0 2023-10-02 11:50:33,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 11:50:34,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=869406.6666666666, ans=0.0 2023-10-02 11:50:36,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 11:50:36,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=869473.3333333334, ans=0.125 2023-10-02 11:50:37,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:50:40,584 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.913e+02 2.084e+02 2.277e+02 3.213e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-02 11:50:43,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:43,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=869473.3333333334, ans=0.125 2023-10-02 11:50:50,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:50:50,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:50:53,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 11:50:56,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:50:56,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 11:50:56,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:50:56,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:50:56,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=869540.0, ans=0.125 2023-10-02 11:51:03,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:51:04,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=869540.0, ans=0.125 2023-10-02 11:51:05,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 11:51:07,028 INFO [train.py:1046] (1/4) Epoch 25, batch 2950, loss[loss=0.1714, simple_loss=0.2586, pruned_loss=0.04206, over 24560.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2466, pruned_loss=0.04633, over 4709907.92 frames. ], batch size: 71, lr: 4.10e-03, grad_scale: 4.0 2023-10-02 11:51:07,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:51:07,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:08,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:51:08,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:51:11,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 11:51:11,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 11:51:12,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:51:12,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:51:14,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=869606.6666666666, ans=0.035 2023-10-02 11:51:18,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:51:21,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:51:22,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:51:22,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:51:25,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:51:25,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 11:51:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:28,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:28,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:51:28,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=869673.3333333334, ans=0.125 2023-10-02 11:51:31,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 11:51:34,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.54 vs. limit=15.0 2023-10-02 11:51:37,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 11:51:37,315 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 11:51:37,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:51:39,335 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 11:51:40,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 11:51:40,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:51:42,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:51:42,082 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 11:51:42,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:51:45,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 11:51:46,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:51:46,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:51:49,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:49,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:51:50,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:51:50,859 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 11:51:52,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:52,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-10-02 11:51:52,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=869806.6666666666, ans=15.0 2023-10-02 11:51:57,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:51:58,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:52:00,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 11:52:00,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:52:01,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 11:52:05,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:52:07,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:52:07,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:52:09,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:52:09,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-02 11:52:12,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:52:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:12,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:52:14,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:52:15,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:52:15,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:52:17,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:17,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 11:52:19,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:20,703 INFO [train.py:1046] (1/4) Epoch 25, batch 3000, loss[loss=0.1618, simple_loss=0.2398, pruned_loss=0.04193, over 24633.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2476, pruned_loss=0.04663, over 4713270.42 frames. 
], batch size: 68, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:52:20,704 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 11:52:33,497 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.0505, 5.5056, 5.1866, 5.2288], device='cuda:1') 2023-10-02 11:52:34,338 INFO [train.py:1078] (1/4) Epoch 25, validation: loss=0.328, simple_loss=0.2751, pruned_loss=0.1905, over 1125622.00 frames. 2023-10-02 11:52:34,339 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 11:52:34,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:52:35,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:52:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 11:52:38,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 11:52:40,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:52:42,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:52:42,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 11:52:42,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:52:49,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 11:52:57,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:53:02,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 11:53:04,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:53:07,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:53:08,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:53:08,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:53:09,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=870073.3333333334, ans=0.0 2023-10-02 11:53:11,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:53:11,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 11:53:13,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 11:53:14,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:53:14,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-02 11:53:15,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=870073.3333333334, ans=0.125 2023-10-02 11:53:16,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:53:18,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:53:18,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:18,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:53:21,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:53:21,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:53:21,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:53:23,878 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.799e+02 2.052e+02 2.353e+02 3.864e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 11:53:24,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 11:53:25,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=870140.0, ans=0.125 2023-10-02 11:53:27,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 11:53:29,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:53:29,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:29,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:53:33,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:33,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:33,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 11:53:33,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 11:53:33,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=870206.6666666666, ans=0.125 2023-10-02 11:53:35,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:53:35,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-10-02 11:53:35,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:53:35,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=870206.6666666666, ans=0.1 2023-10-02 11:53:37,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 11:53:39,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:53:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 11:53:41,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 11:53:43,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 11:53:43,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:53:44,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:53:46,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:46,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:53:46,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 11:53:47,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:53:49,644 INFO [train.py:1046] (1/4) Epoch 25, batch 3050, loss[loss=0.1837, simple_loss=0.2564, pruned_loss=0.05548, over 23714.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2486, pruned_loss=0.04756, over 4700459.19 frames. ], batch size: 135, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:53:49,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 11:53:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:53:53,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:53:53,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:53:56,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:56,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=870273.3333333334, ans=0.2 2023-10-02 11:53:59,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 11:54:04,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 11:54:05,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. 
Duration: 0.94 2023-10-02 11:54:07,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:09,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:54:14,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:14,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:54:15,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:19,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:54:19,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:54:19,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:54:19,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:20,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:23,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 11:54:26,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:26,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 11:54:26,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=870406.6666666666, ans=0.125 2023-10-02 11:54:28,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:28,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:54:30,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:54:31,705 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.73 vs. limit=12.0 2023-10-02 11:54:32,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:54:32,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:54:33,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 11:54:35,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=870473.3333333334, ans=0.125 2023-10-02 11:54:38,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:38,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=870473.3333333334, ans=0.2 2023-10-02 11:54:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:42,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:43,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:54:43,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:45,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:54:46,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:54:47,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=870540.0, ans=0.0 2023-10-02 11:54:48,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:54:48,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. 
Duration: 0.67 2023-10-02 11:54:50,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:54:50,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:51,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 11:54:52,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:55,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=870540.0, ans=0.2 2023-10-02 11:54:57,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:59,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:55:00,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:55:01,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 11:55:03,274 INFO [train.py:1046] (1/4) Epoch 25, batch 3100, loss[loss=0.1586, simple_loss=0.2267, pruned_loss=0.04527, over 23670.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2481, pruned_loss=0.04743, over 4698823.82 frames. ], batch size: 149, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:55:06,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-02 11:55:07,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 11:55:09,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:55:12,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:55:12,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:15,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:55:20,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:24,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 11:55:29,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 11:55:29,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:30,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:55:30,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:55:31,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. 
Duration: 0.5 2023-10-02 11:55:33,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:55:33,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=870740.0, ans=0.125 2023-10-02 11:55:34,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 11:55:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:55:35,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:37,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 11:55:39,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:55:39,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.32 vs. limit=15.0 2023-10-02 11:55:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:55:41,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 11:55:43,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 11:55:44,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 11:55:45,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:49,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:55:49,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:49,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:55:50,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:55:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:55:52,043 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.832e+02 1.979e+02 2.196e+02 3.160e+02, threshold=3.959e+02, percent-clipped=0.0 2023-10-02 11:55:52,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=870806.6666666666, ans=0.125 2023-10-02 11:55:53,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:55:53,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:55:53,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:53,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 11:55:58,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:55:58,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=870806.6666666666, ans=0.125 2023-10-02 11:55:59,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 11:56:01,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:56:02,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 11:56:02,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:02,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:03,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 11:56:16,767 INFO [train.py:1046] (1/4) Epoch 25, batch 3150, loss[loss=0.1786, simple_loss=0.2628, pruned_loss=0.04719, over 24353.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2464, pruned_loss=0.04711, over 4700793.16 frames. ], batch size: 77, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:56:16,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 11:56:18,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:56:18,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:19,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:56:19,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:56:19,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=870940.0, ans=0.125 2023-10-02 11:56:21,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 11:56:23,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:23,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:56:24,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 11:56:26,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:28,067 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 11:56:29,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 11:56:31,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:56:32,517 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 11:56:32,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 11:56:35,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 11:56:36,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 11:56:36,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 11:56:36,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:36,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:56:37,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:39,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 11:56:41,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:41,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 11:56:41,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=871006.6666666666, ans=0.125 2023-10-02 11:56:42,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:56:44,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:56:48,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 11:56:49,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:56:51,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:56:52,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:56:52,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 11:56:55,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 11:56:56,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:56:57,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:56:57,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-02 11:56:58,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:58,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:56:58,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:56:58,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:57:00,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 11:57:00,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:57:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:01,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=871140.0, ans=0.1 2023-10-02 11:57:03,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:57:03,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:57:03,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 11:57:04,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 11:57:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 11:57:06,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:06,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=871140.0, ans=0.0 2023-10-02 11:57:07,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 11:57:09,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 11:57:10,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:57:10,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:12,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 11:57:13,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 11:57:13,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:57:16,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 11:57:17,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=871206.6666666666, ans=0.0 2023-10-02 11:57:18,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:18,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:57:20,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=871206.6666666666, ans=0.2 2023-10-02 11:57:24,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:57:24,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:26,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 11:57:30,971 INFO [train.py:1046] (1/4) Epoch 25, batch 3200, loss[loss=0.1475, simple_loss=0.2259, pruned_loss=0.03456, over 24442.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2452, pruned_loss=0.04642, over 4708874.96 frames. ], batch size: 58, lr: 4.09e-03, grad_scale: 16.0 2023-10-02 11:57:33,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:57:33,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:57:37,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 11:57:37,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:57:37,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 11:57:40,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:43,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:57:46,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:54,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:58:03,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 11:58:03,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:58:03,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=871406.6666666666, ans=0.0 2023-10-02 11:58:06,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 11:58:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:58:12,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 11:58:12,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:58:13,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:58:17,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 11:58:18,966 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.831e+02 2.000e+02 2.221e+02 3.044e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-02 11:58:19,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:58:20,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 11:58:23,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 11:58:25,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:58:30,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:58:30,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:58:30,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:58:31,660 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. 
Duration: 0.896 2023-10-02 11:58:31,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 11:58:34,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:58:35,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 11:58:36,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 11:58:36,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 11:58:38,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 11:58:39,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:58:42,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:58:42,381 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 11:58:42,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:58:42,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:58:42,510 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. 
Duration: 0.64 2023-10-02 11:58:45,078 INFO [train.py:1046] (1/4) Epoch 25, batch 3250, loss[loss=0.1789, simple_loss=0.2491, pruned_loss=0.05434, over 23593.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2448, pruned_loss=0.04632, over 4701219.78 frames. ], batch size: 256, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:58:48,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:58:49,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:58:58,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:58:58,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 11:59:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:00,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:59:00,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:59:00,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:59:01,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 11:59:01,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=871673.3333333334, ans=0.2 2023-10-02 11:59:04,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:04,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:59:04,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:05,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:06,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:06,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:59:07,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=871673.3333333334, ans=0.125 2023-10-02 11:59:08,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:10,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:59:11,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 11:59:13,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:14,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:14,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:59:14,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:59:18,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 11:59:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:59:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:59:19,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.12 vs. limit=15.0 2023-10-02 11:59:21,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:23,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 11:59:26,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=871740.0, ans=0.125 2023-10-02 11:59:27,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:59:33,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:59:34,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:34,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 11:59:34,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:59:34,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:59:35,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:37,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=871806.6666666666, ans=0.1 2023-10-02 11:59:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 11:59:39,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. 
Duration: 0.94 2023-10-02 11:59:40,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=871806.6666666666, ans=0.2 2023-10-02 11:59:41,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:59:42,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:42,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:59:42,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:59:43,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:59:46,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:59:46,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:59:50,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 11:59:50,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:59:51,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:59:51,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. 
Duration: 0.96 2023-10-02 11:59:52,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=871873.3333333334, ans=15.0 2023-10-02 11:59:54,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:59:55,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 11:59:58,142 INFO [train.py:1046] (1/4) Epoch 25, batch 3300, loss[loss=0.16, simple_loss=0.2487, pruned_loss=0.03567, over 24617.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2466, pruned_loss=0.04709, over 4701269.15 frames. ], batch size: 65, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:59:58,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 11:59:58,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 11:59:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:02,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:00:03,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:00:03,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 12:00:06,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:00:07,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:10,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:00:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 12:00:15,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:00:15,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:16,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:16,575 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 12:00:16,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=872006.6666666666, ans=0.07 2023-10-02 12:00:18,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.15 vs. limit=6.0 2023-10-02 12:00:19,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:00:19,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 12:00:19,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:00:19,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:00:21,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 12:00:21,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=872006.6666666666, ans=0.0 2023-10-02 12:00:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:25,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:00:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:25,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 12:00:25,568 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:00:27,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 12:00:27,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:29,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 12:00:29,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=872073.3333333334, ans=0.125 2023-10-02 12:00:32,469 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 12:00:35,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 12:00:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:00:35,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff3.min_abs, batch_count=872073.3333333334, ans=0.2 2023-10-02 12:00:38,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 12:00:38,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=872073.3333333334, ans=0.125 2023-10-02 12:00:40,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:00:43,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:00:43,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:00:46,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:00:46,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 12:00:46,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:46,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:00:48,873 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.878e+02 2.117e+02 2.413e+02 3.290e+02, threshold=4.234e+02, percent-clipped=0.0 2023-10-02 12:00:49,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:00:49,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:50,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:00:50,447 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 12:00:51,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 12:00:54,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:00:54,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:00:54,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:00:58,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 12:00:58,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:00:59,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:00:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:00:59,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 12:01:00,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:01:02,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:01:05,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 12:01:06,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:07,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:08,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.47 vs. limit=15.0 2023-10-02 12:01:08,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 12:01:08,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:01:10,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:12,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:01:12,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:13,359 INFO [train.py:1046] (1/4) Epoch 25, batch 3350, loss[loss=0.1858, simple_loss=0.256, pruned_loss=0.05782, over 23786.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.247, pruned_loss=0.0467, over 4711365.83 frames. ], batch size: 212, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:01:16,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:01:17,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:17,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:01:19,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:21,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:01:23,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 12:01:24,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:01:26,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 12:01:27,699 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 12:01:27,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:31,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 12:01:31,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 12:01:33,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:01:33,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:01:33,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:34,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 12:01:34,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:34,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 12:01:35,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=872340.0, ans=0.2 2023-10-02 12:01:36,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:38,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:39,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:39,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:01:42,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:01:45,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:45,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:01:47,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=872406.6666666666, ans=0.125 2023-10-02 12:01:49,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:01:50,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:52,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:01:52,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:55,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:58,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 12:01:58,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:01:58,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 12:01:58,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:01:59,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 12:02:02,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:02,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:02:09,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:02:09,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 12:02:11,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 12:02:12,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:02:14,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:02:19,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:02:20,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 12:02:20,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:02:21,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:02:24,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:24,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 12:02:25,924 INFO [train.py:1046] (1/4) Epoch 25, batch 3400, loss[loss=0.2318, simple_loss=0.2889, pruned_loss=0.08736, over 19358.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2473, pruned_loss=0.04679, over 4712520.23 frames. ], batch size: 390, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:02:26,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:02:26,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-10-02 12:02:27,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:02:29,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:02:30,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:02:30,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:02:32,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 12:02:36,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 12:02:36,243 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 12:02:36,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:02:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:02:41,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:02:41,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:02:42,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 12:02:47,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:02:48,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 12:02:48,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=872673.3333333334, ans=0.05 2023-10-02 12:02:53,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:02:55,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:02:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:57,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:03:04,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:03:08,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 12:03:12,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:03:12,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:03:12,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-02 12:03:14,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:03:15,534 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.863e+02 2.090e+02 2.369e+02 3.462e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 12:03:15,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:03:17,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:03:17,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:03:20,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:03:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:03:23,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:03:27,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:03:29,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 12:03:33,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:03:38,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. 
Duration: 0.68 2023-10-02 12:03:39,790 INFO [train.py:1046] (1/4) Epoch 25, batch 3450, loss[loss=0.155, simple_loss=0.2329, pruned_loss=0.03856, over 24338.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2471, pruned_loss=0.04644, over 4710840.85 frames. ], batch size: 56, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:03:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 12:03:43,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:03:45,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:03:45,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 12:03:45,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:03:45,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=872940.0, ans=0.1 2023-10-02 12:03:50,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:03:53,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=873006.6666666666, ans=0.0 2023-10-02 12:03:55,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:03:55,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:03:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:03:57,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:00,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:04,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 12:04:07,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=873073.3333333334, ans=0.125 2023-10-02 12:04:09,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 12:04:09,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:04:11,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:04:11,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:16,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 12:04:18,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 12:04:18,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=873073.3333333334, ans=0.0 2023-10-02 12:04:21,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:04:21,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:04:23,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:04:25,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:04:26,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 12:04:26,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:04:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:29,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:04:31,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 12:04:34,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 12:04:36,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=873140.0, ans=0.1 2023-10-02 12:04:40,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:04:40,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:43,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:04:48,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:49,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:04:49,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:04:51,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:04:53,917 INFO [train.py:1046] (1/4) Epoch 25, batch 3500, loss[loss=0.1739, simple_loss=0.2552, pruned_loss=0.04633, over 23407.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2469, pruned_loss=0.04613, over 4718072.11 frames. ], batch size: 93, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:04:53,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:04:57,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 12:04:58,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 12:05:01,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:05:04,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:05:07,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:05:07,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 12:05:11,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:05:12,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:05:14,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:05:14,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:05:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:05:14,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:14,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 12:05:16,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 12:05:16,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=873340.0, ans=0.035 2023-10-02 12:05:19,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:19,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:05:21,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:05:24,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:25,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 12:05:25,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:05:25,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=873406.6666666666, ans=0.2 2023-10-02 12:05:30,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:05:30,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:05:31,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:05:31,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=873406.6666666666, ans=0.0 2023-10-02 12:05:32,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.00 vs. limit=15.0 2023-10-02 12:05:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:05:34,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:05:36,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 12:05:36,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 12:05:38,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 12:05:38,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:05:39,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:39,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:05:41,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:05:42,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 12:05:43,824 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.790e+02 1.964e+02 2.126e+02 3.238e+02, threshold=3.929e+02, percent-clipped=0.0 2023-10-02 12:05:43,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:05:50,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:05:51,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 12:05:51,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 12:05:51,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:05:53,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:05:53,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:05:54,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 12:05:58,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:05:59,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 12:06:00,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 12:06:02,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 12:06:03,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:04,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.58 vs. limit=22.5 2023-10-02 12:06:05,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:06:06,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:06,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=873606.6666666666, ans=0.1 2023-10-02 12:06:07,625 INFO [train.py:1046] (1/4) Epoch 25, batch 3550, loss[loss=0.1597, simple_loss=0.246, pruned_loss=0.03672, over 24363.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.245, pruned_loss=0.04583, over 4713435.63 frames. ], batch size: 74, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:06:10,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:06:18,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:06:19,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 12:06:23,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:06:23,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:06:24,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:06:25,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:06:27,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:06:28,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:06:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:28,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:06:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 12:06:30,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=873673.3333333334, ans=0.015 2023-10-02 12:06:34,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:06:34,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:06:37,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:06:37,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:37,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:06:37,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 12:06:37,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:39,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.51 vs. limit=10.0 2023-10-02 12:06:39,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:41,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 12:06:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:06:48,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:06:49,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:50,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 12:06:52,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:06:54,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 12:06:54,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:06:55,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:06:55,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:06:58,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 12:07:00,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 12:07:04,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=873806.6666666666, ans=0.125 2023-10-02 12:07:05,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 12:07:07,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:10,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=873873.3333333334, ans=0.1 2023-10-02 12:07:10,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=873873.3333333334, ans=0.125 2023-10-02 12:07:11,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:07:11,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 12:07:12,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=873873.3333333334, ans=0.1 2023-10-02 12:07:18,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 12:07:18,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:07:18,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:07:21,700 INFO [train.py:1046] (1/4) Epoch 25, batch 3600, loss[loss=0.1759, simple_loss=0.2498, pruned_loss=0.05094, over 23698.00 frames. 
], tot_loss[loss=0.1681, simple_loss=0.2446, pruned_loss=0.04577, over 4704231.09 frames. ], batch size: 232, lr: 4.09e-03, grad_scale: 16.0 2023-10-02 12:07:21,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:21,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:23,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:07:28,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:07:30,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:31,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:07:31,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:07:32,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:32,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 12:07:36,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:07:36,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:07:39,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:07:42,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:07:43,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:07:43,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:07:43,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 12:07:43,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:07:46,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:48,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:07:50,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:52,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:07:52,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:07:54,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-10-02 12:08:02,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:03,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:08:03,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 12:08:07,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:08:11,621 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.785e+02 1.913e+02 2.150e+02 3.207e+02, threshold=3.826e+02, percent-clipped=0.0 2023-10-02 12:08:11,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:13,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:20,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:08:20,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:08:20,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 12:08:22,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 12:08:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. 
Duration: 0.97 2023-10-02 12:08:27,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:08:27,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:08:27,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 12:08:29,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:08:29,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:08:29,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:30,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 12:08:30,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 12:08:34,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:35,747 INFO [train.py:1046] (1/4) Epoch 25, batch 3650, loss[loss=0.1701, simple_loss=0.2475, pruned_loss=0.04638, over 24312.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2453, pruned_loss=0.04574, over 4710990.06 frames. ], batch size: 56, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:08:35,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. 
Duration: 0.69 2023-10-02 12:08:36,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=874273.3333333334, ans=0.0 2023-10-02 12:08:39,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 12:08:42,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:08:45,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 12:08:46,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 12:08:52,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:08:52,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:08:52,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:08:55,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 12:08:55,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:56,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 12:08:57,883 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.14 vs. 
limit=6.0 2023-10-02 12:08:58,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:08:58,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:08:58,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 12:09:00,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:09:00,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:00,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:03,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:09:05,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=874406.6666666666, ans=0.125 2023-10-02 12:09:06,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 12:09:07,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 12:09:09,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:09:10,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. 
Duration: 0.97 2023-10-02 12:09:11,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:09:13,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:09:17,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:09:18,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:18,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:09:20,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:09:21,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:09:24,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:09:26,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:09:28,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:28,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:09:29,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 12:09:31,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:31,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:09:35,639 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.03 vs. limit=15.0 2023-10-02 12:09:37,775 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 12:09:39,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:39,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:09:40,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:09:40,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:42,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:09:43,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:44,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 12:09:44,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:09:46,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:09:49,162 INFO [train.py:1046] (1/4) Epoch 25, batch 3700, loss[loss=0.1797, simple_loss=0.2486, pruned_loss=0.0554, over 23619.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2458, pruned_loss=0.04641, over 4707549.62 frames. ], batch size: 256, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:09:49,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:09:49,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:09:52,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:52,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 12:09:52,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:52,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:09:52,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:09:55,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 12:09:55,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=874606.6666666666, ans=0.1 2023-10-02 12:09:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:59,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:00,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:10:01,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:10:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:10:05,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:06,569 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 12:10:06,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=874673.3333333334, ans=0.1 2023-10-02 12:10:14,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:10:14,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:10:16,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 12:10:16,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 12:10:16,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:10:16,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=874673.3333333334, ans=0.02 2023-10-02 12:10:19,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:20,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 12:10:22,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:24,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:10:25,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=874740.0, ans=0.125 2023-10-02 12:10:26,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:26,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:10:27,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=874740.0, ans=0.0 2023-10-02 12:10:29,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 12:10:32,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:10:32,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 12:10:34,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:34,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 12:10:38,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:10:38,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:10:40,222 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.959e+02 2.210e+02 2.685e+02 4.346e+02, threshold=4.420e+02, percent-clipped=1.0 2023-10-02 12:10:41,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:10:43,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 12:10:45,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:10:45,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:10:45,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 12:10:45,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:10:50,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:10:50,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 12:10:51,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 12:10:53,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:10:53,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:10:54,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:10:54,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:10:57,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:59,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:11:00,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:04,091 INFO [train.py:1046] (1/4) Epoch 25, batch 3750, loss[loss=0.1576, simple_loss=0.2416, pruned_loss=0.03677, over 24333.00 frames. 
], tot_loss[loss=0.1709, simple_loss=0.2472, pruned_loss=0.04725, over 4700494.98 frames. ], batch size: 61, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:11:04,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 12:11:05,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 12:11:07,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:11:08,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 12:11:08,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:11:10,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:11:11,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:11:12,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:11:15,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:11:18,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:11:20,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 12:11:21,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:11:24,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:11:24,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 12:11:24,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=875006.6666666666, ans=0.125 2023-10-02 12:11:25,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:11:25,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:11:27,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:11:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 12:11:33,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 12:11:34,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:11:35,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 12:11:37,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.93 vs. limit=15.0 2023-10-02 12:11:38,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:11:38,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=875073.3333333334, ans=0.125 2023-10-02 12:11:44,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:44,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 12:11:49,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 12:11:52,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:54,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:11:56,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:11:59,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:12:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 12:12:06,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:12:08,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:12:09,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:12:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:12:17,201 INFO [train.py:1046] (1/4) Epoch 25, batch 3800, loss[loss=0.1826, simple_loss=0.2423, pruned_loss=0.06148, over 19886.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2465, pruned_loss=0.04694, over 4706712.99 frames. ], batch size: 388, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:12:19,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:12:23,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:24,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:12:25,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 12:12:25,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:12:29,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:12:29,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:12:31,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 12:12:31,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:12:35,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:12:35,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:12:35,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:36,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 12:12:39,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 12:12:41,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:12:42,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:12:43,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 12:12:43,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:12:47,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:12:47,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:50,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:51,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:12:56,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 12:12:58,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:13:04,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.29 vs. limit=6.0 2023-10-02 12:13:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:13:08,023 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.870e+02 2.050e+02 2.289e+02 3.047e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 12:13:10,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:13:12,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 12:13:13,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 12:13:13,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:13:16,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:13:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:20,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 12:13:21,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.01 vs. limit=15.0 2023-10-02 12:13:23,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 12:13:23,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 12:13:23,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:13:24,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:13:28,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:13:30,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:13:31,979 INFO [train.py:1046] (1/4) Epoch 25, batch 3850, loss[loss=0.1691, simple_loss=0.2271, pruned_loss=0.05556, over 23339.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2451, pruned_loss=0.04684, over 4693313.78 frames. ], batch size: 285, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:13:34,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:13:36,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 12:13:38,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:13:38,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:41,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:13:42,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:13:45,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 12:13:45,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=875673.3333333334, ans=0.0 2023-10-02 12:13:47,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 12:13:51,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=875673.3333333334, ans=0.125 2023-10-02 12:13:52,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:13:52,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:55,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:13:56,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.18 vs. limit=15.0 2023-10-02 12:13:56,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:13:59,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:13:59,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:14:01,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:14:02,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:14:03,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:05,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:07,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:07,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:14:08,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 12:14:08,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 12:14:09,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:14:09,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:11,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:11,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:11,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-10-02 12:14:14,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 12:14:14,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:19,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 12:14:20,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:14:23,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:24,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=875806.6666666666, ans=0.2 2023-10-02 12:14:25,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:26,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=875806.6666666666, ans=0.0 2023-10-02 12:14:29,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:29,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 12:14:32,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 12:14:34,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:14:34,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:38,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:14:38,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:14:38,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:39,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:39,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:14:39,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 12:14:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:14:42,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 12:14:43,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:43,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:44,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 12:14:45,828 INFO [train.py:1046] (1/4) Epoch 25, batch 3900, loss[loss=0.1537, simple_loss=0.2285, pruned_loss=0.03938, over 23236.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2436, pruned_loss=0.04661, over 4695355.11 frames. ], batch size: 105, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:14:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:47,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:14:47,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:48,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:48,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:14:49,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 12:14:50,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:53,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:14:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:14:56,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 12:14:57,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:15:00,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:15:00,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:15:00,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:15:02,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 12:15:02,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:15:03,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 12:15:04,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:15:05,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 12:15:07,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 12:15:10,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:15:11,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 12:15:11,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:15:11,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:16,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=876073.3333333334, ans=0.1 2023-10-02 12:15:17,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:15:19,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:15:21,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:15:22,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:15:22,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:15:30,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:15:30,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 12:15:37,489 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.823e+02 1.992e+02 2.120e+02 3.702e+02, threshold=3.983e+02, percent-clipped=0.0 2023-10-02 12:15:37,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:15:38,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.14 vs. limit=15.0 2023-10-02 12:15:38,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:15:39,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.67 vs. limit=15.0 2023-10-02 12:15:46,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:15:51,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:51,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 12:15:51,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 12:15:51,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:52,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. 
Duration: 0.84 2023-10-02 12:15:54,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:15:55,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 12:15:59,846 INFO [train.py:1046] (1/4) Epoch 25, batch 3950, loss[loss=0.1765, simple_loss=0.2585, pruned_loss=0.04729, over 23377.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.244, pruned_loss=0.04611, over 4712830.97 frames. ], batch size: 93, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:16:02,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:16:04,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 12:16:04,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:16:07,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:16:08,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:16:14,652 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 12:16:16,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:16:16,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. 
Duration: 0.72 2023-10-02 12:16:16,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 12:16:16,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:16:18,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:16:18,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:16:18,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:16:23,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 12:16:26,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:16:26,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:16:26,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:16:28,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:16:28,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:16:37,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:16:37,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:16:41,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 12:16:47,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 12:16:47,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 12:16:47,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:16:49,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:16:53,260 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.48 vs. limit=22.5 2023-10-02 12:16:55,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:16:57,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:16:57,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:16:57,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:16:57,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-10-02 12:17:02,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-10-02 12:17:02,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:17:04,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:17:06,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=876540.0, ans=0.2 2023-10-02 12:17:08,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 12:17:13,732 INFO [train.py:1046] (1/4) Epoch 25, batch 4000, loss[loss=0.1675, simple_loss=0.256, pruned_loss=0.03952, over 24649.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2451, pruned_loss=0.04628, over 4721429.37 frames. ], batch size: 73, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:17:19,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:27,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:30,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:17:30,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:17:32,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:17:32,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 12:17:33,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:17:33,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 12:17:34,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:17:34,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 12:17:34,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:17:37,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:17:37,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:17:37,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:17:39,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:17:39,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:17:40,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 12:17:42,088 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 12:17:42,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:17:43,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:17:44,974 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 12:17:45,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:17:46,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:17:48,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=876740.0, ans=0.2 2023-10-02 12:17:52,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 12:17:52,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:17:55,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:17:57,322 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 12:17:57,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 12:17:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 12:17:58,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:18:00,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:18:01,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:18:02,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:18:02,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:18:04,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:18:05,460 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.899e+02 2.057e+02 2.287e+02 3.112e+02, threshold=4.114e+02, percent-clipped=0.0 2023-10-02 12:18:07,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 12:18:07,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:18:08,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=876806.6666666666, ans=0.2 2023-10-02 12:18:09,975 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. 
Duration: 0.896 2023-10-02 12:18:14,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:18:16,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 12:18:19,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:18:19,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:18:20,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:18:21,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:18:25,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:18:26,801 INFO [train.py:1046] (1/4) Epoch 25, batch 4050, loss[loss=0.1616, simple_loss=0.246, pruned_loss=0.03856, over 24502.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2457, pruned_loss=0.04573, over 4718814.38 frames. ], batch size: 66, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:18:28,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:18:30,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 12:18:31,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 12:18:31,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:18:32,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:18:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:18:35,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:18:37,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=876940.0, ans=0.2 2023-10-02 12:18:40,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:18:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:18:43,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:18:45,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:18:47,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:18:51,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:18:53,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-10-02 12:18:54,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 12:18:57,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 12:18:57,603 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 12:19:00,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:19:08,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 12:19:08,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:19:09,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=877140.0, ans=0.125 2023-10-02 12:19:10,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.40 vs. limit=15.0 2023-10-02 12:19:10,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:19:14,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:19:15,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:19:15,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 12:19:17,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:19:20,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 12:19:20,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:19:21,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:19:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 12:19:27,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:19:34,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 12:19:36,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:19:36,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:19:39,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 12:19:39,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 12:19:39,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:19:40,396 INFO [train.py:1046] (1/4) Epoch 25, batch 4100, loss[loss=0.1699, simple_loss=0.2599, pruned_loss=0.03996, over 24649.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2468, pruned_loss=0.04621, over 4720957.99 frames. ], batch size: 73, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:19:40,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:19:42,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:42,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:19:48,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 12:19:49,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 12:19:51,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 12:19:54,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 12:19:54,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:55,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:55,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:19:55,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:19:55,680 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 12:19:58,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:19:59,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:19:59,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:59,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:20:03,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=877340.0, ans=0.0 2023-10-02 12:20:04,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:20:05,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:20:05,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:20:05,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 12:20:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:20:07,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:20:07,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:20:07,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:20:09,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 12:20:13,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:14,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 12:20:16,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:20:17,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:20:17,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 12:20:18,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:20:19,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:20:20,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 12:20:22,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 12:20:22,410 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:20:23,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:20:23,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:20:25,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 12:20:26,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:20:26,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:20:26,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=877473.3333333334, ans=0.04949747468305833 2023-10-02 12:20:30,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:30,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=877473.3333333334, ans=0.0 2023-10-02 12:20:33,241 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.821e+02 2.033e+02 2.340e+02 3.969e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 12:20:36,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:20:38,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:20:38,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:20:45,103 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.85 vs. limit=10.0 2023-10-02 12:20:45,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:20:45,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:48,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=877540.0, ans=0.125 2023-10-02 12:20:50,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:20:53,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:20:54,546 INFO [train.py:1046] (1/4) Epoch 25, batch 4150, loss[loss=0.1799, simple_loss=0.254, pruned_loss=0.05291, over 23371.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2481, pruned_loss=0.04706, over 4708082.06 frames. ], batch size: 106, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:20:57,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:20:57,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 12:20:58,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:20:58,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:21:02,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 12:21:02,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:21:03,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 12:21:04,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 12:21:04,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 12:21:06,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:21:10,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:21:10,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:21:14,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:14,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:21:15,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:21:15,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=877673.3333333334, ans=0.125 2023-10-02 12:21:18,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:21:18,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:21:19,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:21:24,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:21:28,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:21:28,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 12:21:29,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 12:21:29,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:21:31,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 12:21:31,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 12:21:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:21:34,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:35,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:38,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 12:21:42,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:21:43,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:21:43,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 12:21:45,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:21:45,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 12:21:47,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=877806.6666666666, ans=0.0 2023-10-02 12:21:48,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:21:49,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:21:51,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:52,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 12:21:52,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:21:52,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:21:53,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:21:55,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 12:21:55,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:55,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:21:56,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:21:56,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 12:21:57,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:57,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 12:21:59,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:22:00,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:22:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 12:22:01,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:22:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:22:08,737 INFO [train.py:1046] (1/4) Epoch 25, batch 4200, loss[loss=0.1812, simple_loss=0.2653, pruned_loss=0.04853, over 24371.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2473, pruned_loss=0.04645, over 4705726.98 frames. ], batch size: 77, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:22:08,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 12:22:10,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:22:12,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:22:12,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=877940.0, ans=0.1 2023-10-02 12:22:13,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 12:22:15,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:22:15,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:22:18,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 12:22:21,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 12:22:22,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:24,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:22:25,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=878006.6666666666, ans=0.2 2023-10-02 12:22:28,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:22:28,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=878006.6666666666, ans=0.0 2023-10-02 12:22:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:22:33,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:22:34,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:22:34,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 12:22:34,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:22:35,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:22:37,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:22:38,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:22:39,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 12:22:40,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:43,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=878073.3333333334, ans=0.125 2023-10-02 12:22:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:22:44,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:22:46,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 12:22:49,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:22:51,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:22:51,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 12:22:52,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:22:52,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:22:57,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:22:58,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.12 vs. limit=6.0 2023-10-02 12:22:59,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:23:01,934 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.878e+02 2.060e+02 2.306e+02 3.122e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-02 12:23:05,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:23:07,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-02 12:23:09,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:23:13,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.95 vs. limit=6.0 2023-10-02 12:23:15,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:23:15,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:18,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 12:23:23,004 INFO [train.py:1046] (1/4) Epoch 25, batch 4250, loss[loss=0.1468, simple_loss=0.2265, pruned_loss=0.03357, over 24442.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2453, pruned_loss=0.04591, over 4701539.07 frames. ], batch size: 58, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:23:23,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:23:27,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:23:27,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:23:30,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 12:23:31,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.60 vs. limit=6.0 2023-10-02 12:23:34,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:23:34,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 12:23:34,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:23:36,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:38,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=878340.0, ans=0.0 2023-10-02 12:23:39,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:23:43,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:43,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:45,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:23:45,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:23:47,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 12:23:47,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:48,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:51,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:23:53,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:23:54,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 12:23:57,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 12:23:58,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:58,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:23:58,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:58,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:23:58,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:00,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:24:02,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=22.5 2023-10-02 12:24:04,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:24:04,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:24:08,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:24:10,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:11,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 12:24:11,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:24:13,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 12:24:13,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:24:15,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.19 vs. limit=6.0 2023-10-02 12:24:15,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 12:24:16,458 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.51 vs. limit=15.0 2023-10-02 12:24:17,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:17,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:24:21,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 12:24:23,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:24:24,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:24:24,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=878540.0, ans=0.125 2023-10-02 12:24:28,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:31,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:32,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:24:34,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:24:35,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:24:36,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=878606.6666666666, ans=0.125 2023-10-02 12:24:37,124 INFO [train.py:1046] (1/4) Epoch 25, batch 4300, loss[loss=0.1661, simple_loss=0.2549, pruned_loss=0.03861, over 24641.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2455, pruned_loss=0.04609, over 4704408.31 frames. ], batch size: 68, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:24:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:24:38,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:24:38,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 12:24:40,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:24:42,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.95 vs. limit=22.5 2023-10-02 12:24:45,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:24:46,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:24:48,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 12:24:57,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:57,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 12:24:58,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:24:59,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:25:01,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:25:01,286 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 12:25:03,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:25:03,805 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.93 vs. limit=15.0 2023-10-02 12:25:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:25:07,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 12:25:07,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:25:07,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. 
Duration: 0.92 2023-10-02 12:25:07,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=878740.0, ans=0.05 2023-10-02 12:25:10,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:25:11,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:25:14,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:25:14,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:25:14,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:25:16,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:25:16,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:25:17,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 12:25:17,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 12:25:19,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:25:21,797 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.76 vs. limit=10.0 2023-10-02 12:25:22,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:22,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:25:22,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:23,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:25:23,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 12:25:23,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 12:25:25,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 12:25:25,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:25:27,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 12:25:27,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. 
Duration: 0.94 2023-10-02 12:25:31,176 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.823e+02 1.995e+02 2.293e+02 3.439e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-02 12:25:31,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:25:33,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 12:25:34,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:25:36,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:25:36,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:25:37,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=878873.3333333334, ans=0.125 2023-10-02 12:25:38,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 12:25:38,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:25:38,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:38,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=878873.3333333334, ans=0.125 2023-10-02 12:25:40,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 12:25:40,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:25:40,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:25:42,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:25:44,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=878873.3333333334, ans=0.0 2023-10-02 12:25:46,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:25:47,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:47,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:25:51,630 INFO [train.py:1046] (1/4) Epoch 25, batch 4350, loss[loss=0.1644, simple_loss=0.2436, pruned_loss=0.04258, over 24270.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2461, pruned_loss=0.0461, over 4704368.26 frames. ], batch size: 56, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:25:53,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 12:25:53,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:25:57,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:26:01,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:04,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:26:04,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:26:07,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:26:09,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=879006.6666666666, ans=0.2 2023-10-02 12:26:11,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:14,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:26:14,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:26:17,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:26:19,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:26:20,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:26:26,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-10-02 12:26:27,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:29,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:33,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:34,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=879073.3333333334, ans=0.04949747468305833 2023-10-02 12:26:35,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 12:26:38,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:26:39,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:26:41,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=879140.0, ans=0.125 2023-10-02 12:26:42,879 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 12:26:45,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:26:46,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:26:46,950 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. 
Duration: 0.896 2023-10-02 12:26:48,335 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 12:26:48,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:26:48,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:48,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:26:50,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:26:51,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:26:51,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:26:54,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 12:26:54,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:54,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:26:54,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:54,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. 
Duration: 0.99 2023-10-02 12:26:55,988 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 12:26:55,992 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 12:26:56,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 12:26:56,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=879206.6666666666, ans=0.125 2023-10-02 12:26:58,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:59,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:26:59,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:00,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:27:02,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 12:27:02,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=879206.6666666666, ans=0.0 2023-10-02 12:27:05,407 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 12:27:05,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:27:06,764 INFO [train.py:1046] (1/4) Epoch 25, batch 4400, loss[loss=0.1704, simple_loss=0.2571, pruned_loss=0.04185, over 24616.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2476, pruned_loss=0.0471, over 4696694.67 frames. ], batch size: 68, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:27:07,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=879273.3333333334, ans=0.125 2023-10-02 12:27:09,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:27:09,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:11,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:27:13,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 12:27:13,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 12:27:15,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 12:27:15,117 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 12:27:15,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:27:15,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:27:17,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 12:27:18,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:19,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:19,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=879340.0, ans=0.125 2023-10-02 12:27:20,722 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 12:27:22,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:22,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 12:27:22,686 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 12:27:25,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 12:27:26,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 12:27:26,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 12:27:29,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:27:30,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:27:30,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:27:31,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:27:33,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 12:27:33,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=879340.0, ans=0.2 2023-10-02 12:27:34,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 12:27:34,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:36,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=879406.6666666666, ans=0.0 2023-10-02 12:27:37,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:27:37,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:38,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:27:39,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:39,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 12:27:40,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=879406.6666666666, ans=0.125 2023-10-02 12:27:41,272 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 12:27:44,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:44,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=879406.6666666666, ans=0.1 2023-10-02 12:27:51,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:27:52,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 12:27:56,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:27:57,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:28:00,648 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.859e+02 2.058e+02 2.291e+02 3.192e+02, threshold=4.115e+02, percent-clipped=0.0 2023-10-02 12:28:02,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 12:28:04,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 12:28:04,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:28:04,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:28:04,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:28:04,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:28:08,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 12:28:11,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 12:28:13,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 12:28:13,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:13,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 12:28:13,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:28:15,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 12:28:18,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 12:28:18,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=879606.6666666666, ans=0.95 2023-10-02 12:28:19,789 INFO [train.py:1046] (1/4) Epoch 25, batch 4450, loss[loss=0.1944, simple_loss=0.2624, pruned_loss=0.06323, over 22679.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2485, pruned_loss=0.04742, over 4683948.15 frames. ], batch size: 322, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:28:22,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:28:26,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:26,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:28:30,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.18 vs. limit=15.0 2023-10-02 12:28:34,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:28:34,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:28:38,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:40,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 12:28:40,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=879673.3333333334, ans=0.125 2023-10-02 12:28:42,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:28:42,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:42,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 12:28:42,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:28:43,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=879673.3333333334, ans=0.1 2023-10-02 12:28:44,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:44,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:28:44,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:28:47,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:28:49,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.93 vs. 
limit=6.0 2023-10-02 12:28:51,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:28:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:28:52,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:28:53,378 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.21 vs. limit=15.0 2023-10-02 12:28:54,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:56,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:29:01,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:29:02,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 12:29:02,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 12:29:02,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:29:06,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:29:06,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. 
Duration: 0.66 2023-10-02 12:29:09,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:29:12,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:29:13,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 12:29:13,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:13,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:29:14,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:29:14,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:29:16,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:29:19,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:29:19,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 12:29:20,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:29:23,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:29:25,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:29:25,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:25,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:29:25,372 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:29:25,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=879873.3333333334, ans=0.0 2023-10-02 12:29:25,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=879873.3333333334, ans=0.125 2023-10-02 12:29:27,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:29:30,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 12:29:32,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=879940.0, ans=0.2 2023-10-02 12:29:33,411 INFO [train.py:1046] (1/4) Epoch 25, batch 4500, loss[loss=0.1617, simple_loss=0.2364, pruned_loss=0.04349, over 17530.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2489, pruned_loss=0.04721, over 4695473.67 frames. ], batch size: 38, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:29:33,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 12:29:36,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:29:39,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 12:29:39,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 12:29:39,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:29:41,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=879940.0, ans=0.1 2023-10-02 12:29:44,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:45,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:29:49,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:29:50,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:29:50,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:29:50,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:29:57,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=880006.6666666666, ans=0.1 2023-10-02 12:30:00,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:01,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:30:02,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:30:04,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:30:06,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:30:12,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:30:16,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:30:21,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:30:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:30:22,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. 
Duration: 0.98 2023-10-02 12:30:24,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:25,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:30:28,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:30:28,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:30:30,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:30:31,328 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.866e+02 2.072e+02 2.376e+02 3.335e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-02 12:30:31,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 12:30:31,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:30:31,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:36,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:30:36,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 12:30:37,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:40,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:30:42,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:30:42,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 12:30:45,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 12:30:45,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 12:30:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 12:30:50,988 INFO [train.py:1046] (1/4) Epoch 25, batch 4550, loss[loss=0.1609, simple_loss=0.2202, pruned_loss=0.05079, over 22745.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2469, pruned_loss=0.04642, over 4698640.25 frames. ], batch size: 322, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:30:51,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 12:30:52,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:30:55,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:30:55,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=880273.3333333334, ans=0.1 2023-10-02 12:30:56,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:59,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:31:06,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:31:06,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:06,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:31:06,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:10,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:12,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:31:15,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:31:16,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 12:31:18,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 12:31:18,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:31:18,601 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:31:19,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 12:31:22,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 12:31:22,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:31:22,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=880406.6666666666, ans=0.0 2023-10-02 12:31:27,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 12:31:27,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=880406.6666666666, ans=0.0 2023-10-02 12:31:29,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:31:31,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:31:32,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:32,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:31:34,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 12:31:37,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:31:38,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:38,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:31:41,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:42,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 12:31:43,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 12:31:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:31:45,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. 
Duration: 0.94 2023-10-02 12:31:45,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=880473.3333333334, ans=10.0 2023-10-02 12:31:48,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 12:31:48,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:49,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:49,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:31:51,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:51,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:31:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:31:53,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 12:31:55,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=880540.0, ans=0.125 2023-10-02 12:31:56,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:31:56,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-02 12:31:58,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 12:31:58,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:31:58,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 12:31:59,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:32:00,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.82 vs. limit=15.0 2023-10-02 12:32:01,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:32:02,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:32:02,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:32:04,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:32:05,469 INFO [train.py:1046] (1/4) Epoch 25, batch 4600, loss[loss=0.1762, simple_loss=0.2446, pruned_loss=0.05392, over 23751.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.246, pruned_loss=0.04608, over 4708769.63 frames. ], batch size: 179, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:32:05,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:32:06,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:32:08,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:09,101 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.65 vs. limit=15.0 2023-10-02 12:32:10,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:32:11,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:32:11,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:32:13,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:14,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=880606.6666666666, ans=0.125 2023-10-02 12:32:15,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 12:32:16,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:32:19,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 12:32:21,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:23,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:26,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=880673.3333333334, ans=0.0 2023-10-02 12:32:30,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 12:32:31,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:32,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:35,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=880740.0, ans=0.0 2023-10-02 12:32:37,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:32:37,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:40,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 12:32:40,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:32:41,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 12:32:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:48,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:32:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:32:52,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 12:32:53,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:32:56,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=880806.6666666666, ans=0.125 2023-10-02 12:32:57,357 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.41 vs. limit=15.0 2023-10-02 12:32:57,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:32:58,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:00,731 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.837e+02 2.008e+02 2.202e+02 3.381e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 12:33:00,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:33:00,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 12:33:00,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:02,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 12:33:02,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:03,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:03,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:33:05,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:05,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 12:33:05,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.32 vs. limit=15.0 2023-10-02 12:33:06,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 12:33:06,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. 
Duration: 0.82 2023-10-02 12:33:06,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:08,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:33:08,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:09,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:19,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:33:20,354 INFO [train.py:1046] (1/4) Epoch 25, batch 4650, loss[loss=0.1625, simple_loss=0.2292, pruned_loss=0.04793, over 22815.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2441, pruned_loss=0.04579, over 4693739.04 frames. ], batch size: 322, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:33:21,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:33:23,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:23,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:33:23,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:23,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:33:24,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:27,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 12:33:30,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:33:30,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=880940.0, ans=0.125 2023-10-02 12:33:31,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 12:33:31,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:33:33,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 12:33:33,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:33:33,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 12:33:35,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 12:33:35,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:35,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 12:33:38,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:33:39,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:39,468 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 12:33:40,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:42,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 12:33:47,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:47,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:33:48,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 12:33:48,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:33:52,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:33:57,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:34:01,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=881073.3333333334, ans=0.2 2023-10-02 12:34:03,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:05,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:34:06,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:06,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:34:07,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 12:34:09,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 12:34:10,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 12:34:10,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 12:34:10,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=881140.0, ans=0.125 2023-10-02 12:34:11,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:16,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 12:34:16,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:34:16,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 12:34:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:18,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:34:18,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:34:18,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=881206.6666666666, ans=0.1 2023-10-02 12:34:19,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:34:22,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:34:22,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:34:23,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:34:24,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=881206.6666666666, ans=0.125 2023-10-02 12:34:26,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:28,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:34:28,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:34:29,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 12:34:31,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:34:32,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 12:34:33,836 INFO [train.py:1046] (1/4) Epoch 25, batch 4700, loss[loss=0.1768, simple_loss=0.247, pruned_loss=0.05327, over 23818.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2455, pruned_loss=0.04629, over 4685037.07 frames. ], batch size: 179, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:34:39,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:39,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:40,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:34:41,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:34:42,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:34:46,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 12:34:46,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 12:34:49,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:50,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:34:50,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:54,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:59,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:35:00,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:35:03,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 12:35:08,488 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:35:09,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 12:35:11,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:35:13,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:17,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 12:35:18,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:35:21,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=881473.3333333334, ans=0.125 2023-10-02 12:35:22,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:35:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 12:35:25,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=881473.3333333334, ans=0.05 2023-10-02 12:35:25,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=881473.3333333334, ans=0.0 2023-10-02 12:35:27,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:35:27,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:28,466 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.857e+02 2.002e+02 2.228e+02 2.822e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 12:35:29,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.67 vs. limit=15.0 2023-10-02 12:35:30,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:35:30,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:35:30,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 12:35:32,026 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 12:35:32,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.92 vs. limit=15.0 2023-10-02 12:35:33,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:34,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:34,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:35:34,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 12:35:34,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:38,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 12:35:41,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=881540.0, ans=0.125 2023-10-02 12:35:42,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:35:43,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:35:46,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:35:48,464 INFO [train.py:1046] (1/4) Epoch 25, batch 4750, loss[loss=0.1681, simple_loss=0.2559, pruned_loss=0.04014, over 24665.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2466, pruned_loss=0.04635, over 4700356.33 frames. ], batch size: 73, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:35:48,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:35:50,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 12:35:51,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 12:35:54,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 12:35:57,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:35:57,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:04,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 12:36:08,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:36:10,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 12:36:11,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:15,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:36:15,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:36:15,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:36:17,340 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-10-02 12:36:17,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 12:36:22,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 12:36:25,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:36:26,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.15 vs. limit=15.0 2023-10-02 12:36:26,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:36:29,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:36:29,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 12:36:29,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:36:30,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:36:35,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:36:36,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=881806.6666666666, ans=0.0 2023-10-02 12:36:38,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-10-02 12:36:38,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 12:36:39,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:36:39,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:36:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:36:40,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:36:41,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=881806.6666666666, ans=15.0 2023-10-02 12:36:42,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 12:36:43,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 12:36:45,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=881873.3333333334, ans=0.1 2023-10-02 12:36:46,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:36:49,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:36:49,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-10-02 12:36:51,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:52,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:36:53,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:36:55,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:36:55,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:36:59,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:36:59,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 12:37:01,151 INFO [train.py:1046] (1/4) Epoch 25, batch 4800, loss[loss=0.1949, simple_loss=0.2669, pruned_loss=0.06145, over 22805.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2477, pruned_loss=0.04624, over 4718433.21 frames. ], batch size: 322, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:37:01,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 12:37:02,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 12:37:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 12:37:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:37:05,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 12:37:11,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:11,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:16,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:37:18,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:18,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:18,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 12:37:18,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=882006.6666666666, ans=0.0 2023-10-02 12:37:19,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:37:19,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:37:21,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 12:37:25,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:37:27,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:27,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:37:29,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:29,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 12:37:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:31,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:32,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:35,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:36,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:36,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 12:37:36,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:37:38,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:41,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 12:37:41,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 12:37:42,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:42,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:37:44,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:37:44,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:37:44,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:37:44,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:37:45,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:37:50,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 12:37:53,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:37:54,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:37:55,799 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.900e+02 2.147e+02 2.539e+02 4.141e+02, threshold=4.294e+02, percent-clipped=1.0 2023-10-02 12:37:59,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 12:37:59,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:59,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=882206.6666666666, ans=0.1 2023-10-02 12:38:00,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.50 vs. limit=5.0 2023-10-02 12:38:00,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:00,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:38:00,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:38:05,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 12:38:05,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=882206.6666666666, ans=0.125 2023-10-02 12:38:06,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:38:06,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:07,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:38:08,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:38:09,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:38:12,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:12,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:12,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:38:14,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 12:38:15,492 INFO [train.py:1046] (1/4) Epoch 25, batch 4850, loss[loss=0.171, simple_loss=0.2576, pruned_loss=0.04218, over 24389.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2476, pruned_loss=0.04625, over 4708118.00 frames. 
], batch size: 69, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:38:15,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 12:38:15,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:38:15,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:38:16,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:38:16,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:19,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:38:22,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.88 vs. limit=15.0 2023-10-02 12:38:27,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 12:38:27,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:38:27,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=882273.3333333334, ans=0.0 2023-10-02 12:38:32,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=882340.0, ans=0.125 2023-10-02 12:38:34,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:38:34,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:38:34,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:37,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:37,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=882340.0, ans=0.0 2023-10-02 12:38:38,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:38:38,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:38:38,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 12:38:42,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:38:42,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=882340.0, ans=0.125 2023-10-02 12:38:44,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:38:44,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:38:46,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:38:46,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 12:38:48,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:38:49,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:38:53,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:38:53,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 12:38:53,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 12:38:55,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:39:02,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 12:39:02,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 12:39:03,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:39:03,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:39:05,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:39:06,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 12:39:06,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:39:08,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 12:39:08,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:09,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:39:09,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 12:39:18,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:39:21,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=882540.0, ans=0.0 2023-10-02 12:39:24,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:39:24,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:39:28,610 INFO [train.py:1046] (1/4) Epoch 25, batch 4900, loss[loss=0.1757, simple_loss=0.2507, pruned_loss=0.05031, over 23368.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2465, pruned_loss=0.04662, over 4695194.21 frames. ], batch size: 119, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:39:28,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 12:39:28,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:39:34,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:39:36,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:36,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:39:38,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 12:39:43,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-02 12:39:45,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=882673.3333333334, ans=0.1 2023-10-02 12:39:47,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 12:39:48,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 12:39:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:39:49,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:49,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:39:49,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:39:49,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:39:51,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 12:39:53,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 12:39:55,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:39:56,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 12:39:57,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:40:00,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=882740.0, ans=0.1 2023-10-02 12:40:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:40:01,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:04,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 12:40:04,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:40:04,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=882740.0, ans=0.1 2023-10-02 12:40:05,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:40:05,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 12:40:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-10-02 12:40:08,054 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.10 vs. limit=15.0 2023-10-02 12:40:10,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 12:40:11,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:40:12,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:40:14,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:40:14,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:14,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 12:40:14,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:40:14,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 12:40:14,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=882806.6666666666, ans=0.1 2023-10-02 12:40:17,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:40:17,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=882806.6666666666, ans=0.0 2023-10-02 12:40:19,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:40:22,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:40:23,576 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.865e+02 2.106e+02 2.440e+02 3.328e+02, threshold=4.212e+02, percent-clipped=0.0 2023-10-02 12:40:25,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 12:40:26,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:40:27,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 12:40:27,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 12:40:33,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:40:35,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:40:37,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 12:40:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-02 12:40:37,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:40:38,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:38,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=882873.3333333334, ans=0.07 2023-10-02 12:40:42,658 INFO [train.py:1046] (1/4) Epoch 25, batch 4950, loss[loss=0.1718, simple_loss=0.233, pruned_loss=0.05529, over 23361.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2453, pruned_loss=0.04623, over 4708784.68 frames. ], batch size: 285, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:40:42,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:40:42,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:40:44,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:40:44,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 12:40:45,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:40:47,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:40:47,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-02 12:40:47,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=882940.0, ans=0.125 2023-10-02 12:40:51,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 12:40:51,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 12:40:51,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:40:53,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 12:40:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:53,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:40:53,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:40:53,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:40:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:40:57,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 12:40:59,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:40:59,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:59,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:41:01,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.66 vs. limit=12.0 2023-10-02 12:41:01,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:41:05,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.32 vs. limit=22.5 2023-10-02 12:41:07,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:09,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:41:11,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:11,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:13,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 12:41:15,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 12:41:16,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 12:41:18,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:20,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:41:20,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:41:21,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:41:22,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:41:22,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:41:24,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:41:25,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:41:28,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:41:29,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 12:41:29,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:31,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 12:41:31,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:41:33,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:41:37,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:41:39,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:41:39,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:41:39,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:40,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:41:41,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:41:45,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:41:45,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 12:41:45,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:41:47,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 12:41:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:41:56,721 INFO [train.py:1046] (1/4) Epoch 25, batch 5000, loss[loss=0.1642, simple_loss=0.2358, pruned_loss=0.04624, over 23892.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2451, pruned_loss=0.04586, over 4718362.35 frames. ], batch size: 195, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:41:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 12:41:58,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:42:03,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:42:03,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:42:04,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 12:42:05,711 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.00 vs. limit=15.0 2023-10-02 12:42:06,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. 
Duration: 0.95 2023-10-02 12:42:06,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:42:07,495 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.22 vs. limit=10.0 2023-10-02 12:42:10,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 12:42:10,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:42:10,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:42:11,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 12:42:11,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:11,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:42:12,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 12:42:12,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:42:12,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=883340.0, ans=0.0 2023-10-02 12:42:14,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:42:15,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 12:42:17,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 12:42:17,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:42:18,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 12:42:18,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:42:18,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:20,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:42:20,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 12:42:20,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 12:42:20,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 12:42:21,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:21,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:42:24,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 12:42:24,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:42:26,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:26,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:42:28,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 12:42:30,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 12:42:30,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:42:31,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:42:32,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-10-02 12:42:35,731 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 12:42:38,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=883406.6666666666, ans=15.0 2023-10-02 12:42:39,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 12:42:39,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:39,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:42:42,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=883473.3333333334, ans=0.125 2023-10-02 12:42:43,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 12:42:43,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:43,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:42:44,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:42:47,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 12:42:47,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:42:50,603 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.815e+02 1.960e+02 2.170e+02 3.682e+02, threshold=3.920e+02, percent-clipped=0.0 2023-10-02 12:42:50,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 12:42:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:42:51,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.65 vs. limit=12.0 2023-10-02 12:42:56,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 12:42:59,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=883540.0, ans=0.2 2023-10-02 12:43:02,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:09,569 INFO [train.py:1046] (1/4) Epoch 25, batch 5050, loss[loss=0.1607, simple_loss=0.2519, pruned_loss=0.03472, over 24308.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2458, pruned_loss=0.0459, over 4705489.71 frames. ], batch size: 74, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:43:11,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:43:11,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:11,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:43:13,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:43:13,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 12:43:14,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:43:14,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:18,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:18,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 12:43:18,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:43:21,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:43:23,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:43:23,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 12:43:24,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:43:24,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:43:27,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:43:29,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 12:43:29,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:43:32,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=883673.3333333334, ans=0.2 2023-10-02 12:43:32,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=883673.3333333334, ans=0.2 2023-10-02 12:43:34,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=883673.3333333334, ans=0.125 2023-10-02 12:43:36,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=883673.3333333334, ans=0.0 2023-10-02 12:43:38,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 12:43:38,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:43:39,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.09 vs. limit=15.0 2023-10-02 12:43:40,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:43:40,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 12:43:42,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:43:43,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:43:43,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:43:45,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:43:45,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 12:43:45,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 12:43:46,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:49,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:43:51,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:51,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 12:43:52,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=883806.6666666666, ans=0.0 2023-10-02 12:43:53,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:43:55,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 12:43:56,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 12:43:56,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=883806.6666666666, ans=0.125 2023-10-02 12:43:57,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=15.0 2023-10-02 12:43:57,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:43:58,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:43:58,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:44:01,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:44:02,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:44:04,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:04,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:44:04,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:44:04,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-02 12:44:05,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:44:06,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:44:07,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=883873.3333333334, ans=0.125 2023-10-02 12:44:10,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:44:11,619 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 12:44:11,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:44:13,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:44:14,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:15,017 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 12:44:15,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=883873.3333333334, ans=0.1 2023-10-02 12:44:16,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=883873.3333333334, ans=0.125 2023-10-02 12:44:17,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 12:44:17,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 12:44:17,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:22,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:44:22,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:22,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 12:44:23,702 INFO [train.py:1046] (1/4) Epoch 25, batch 5100, loss[loss=0.1927, simple_loss=0.268, pruned_loss=0.05873, over 23722.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2461, pruned_loss=0.04617, over 4702339.52 frames. ], batch size: 85, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:44:23,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 12:44:25,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:25,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=883940.0, ans=0.05 2023-10-02 12:44:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:44:26,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 12:44:30,161 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 12:44:31,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:44:34,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 12:44:35,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 12:44:35,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:37,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:44:38,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:44:39,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 12:44:41,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 12:44:45,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:44:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:44:49,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:44:51,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 12:44:53,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:44:55,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=884073.3333333334, ans=0.1 2023-10-02 12:44:55,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=884073.3333333334, ans=15.0 2023-10-02 12:44:56,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:56,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:44:59,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:44:59,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:44:59,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 12:45:01,295 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 12:45:02,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:45:02,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. 
Duration: 0.67 2023-10-02 12:45:02,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 12:45:05,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:45:12,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:14,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 12:45:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 12:45:15,042 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 12:45:15,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.34 vs. limit=22.5 2023-10-02 12:45:17,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 12:45:17,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:45:19,064 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.925e+02 2.148e+02 2.500e+02 3.994e+02, threshold=4.296e+02, percent-clipped=1.0 2023-10-02 12:45:19,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 12:45:23,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-10-02 12:45:26,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:45:27,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:45:30,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 12:45:32,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:45:33,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 12:45:37,765 INFO [train.py:1046] (1/4) Epoch 25, batch 5150, loss[loss=0.1671, simple_loss=0.2476, pruned_loss=0.04326, over 23324.00 frames. ], tot_loss[loss=0.17, simple_loss=0.247, pruned_loss=0.04649, over 4706680.28 frames. ], batch size: 93, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:45:37,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:45:37,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:45:37,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:45:39,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:45:39,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 12:45:39,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:45:40,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 12:45:40,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 12:45:41,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 12:45:41,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:45:43,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 12:45:44,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=884273.3333333334, ans=0.0 2023-10-02 12:45:45,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:46,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 12:45:48,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:45:48,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:45:53,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 12:45:53,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 12:45:53,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:55,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:45:58,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:45:58,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:45:58,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:45:58,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:45:58,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:45:59,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 12:46:01,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:46:01,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:46:04,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 12:46:07,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 12:46:07,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=884406.6666666666, ans=0.125 2023-10-02 12:46:08,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:46:11,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:46:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 12:46:18,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:46:24,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:46:25,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:46:29,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:46:29,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:46:31,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 12:46:35,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:46:36,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:46:36,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:46:40,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:46:42,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:46:43,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 12:46:45,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:46:45,773 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.94 vs. limit=15.0 2023-10-02 12:46:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:46:51,468 INFO [train.py:1046] (1/4) Epoch 25, batch 5200, loss[loss=0.1714, simple_loss=0.2346, pruned_loss=0.05412, over 23474.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2472, pruned_loss=0.04657, over 4715897.31 frames. ], batch size: 256, lr: 4.06e-03, grad_scale: 32.0 2023-10-02 12:46:51,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:46:51,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 12:46:52,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:46:52,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:46:52,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:46:54,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:46:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:46:59,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:47:00,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:03,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 12:47:05,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:47:05,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:08,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:09,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 12:47:09,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:11,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 12:47:12,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:47:14,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:17,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 12:47:20,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:47:21,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:47:23,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 12:47:23,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 12:47:24,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 12:47:26,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:26,077 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. 
Duration: 0.997 2023-10-02 12:47:26,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:29,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:29,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:47:29,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 12:47:29,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=884740.0, ans=0.1 2023-10-02 12:47:30,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:47:31,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:33,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=884740.0, ans=0.0 2023-10-02 12:47:34,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 12:47:34,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 12:47:34,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. 
Duration: 0.82 2023-10-02 12:47:36,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=884806.6666666666, ans=0.5 2023-10-02 12:47:40,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 12:47:40,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:47:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:47:45,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=884806.6666666666, ans=0.2 2023-10-02 12:47:45,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=884806.6666666666, ans=0.125 2023-10-02 12:47:46,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:47:48,187 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.909e+02 2.152e+02 2.481e+02 3.751e+02, threshold=4.304e+02, percent-clipped=0.0 2023-10-02 12:47:48,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 12:47:48,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:48,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-02 12:47:48,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:50,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:47:51,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:47:53,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:47:55,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:57,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:47:57,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:00,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:48:02,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 12:48:03,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:48:04,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:48:06,305 INFO [train.py:1046] (1/4) Epoch 25, batch 5250, loss[loss=0.182, simple_loss=0.2687, pruned_loss=0.04762, over 24029.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2469, pruned_loss=0.04644, over 4706033.43 frames. ], batch size: 80, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:48:06,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:06,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:48:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:48:09,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:48:13,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:48:13,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:48:14,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:48:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:48:22,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:48:23,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:48:24,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:48:26,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 12:48:26,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:48:26,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:28,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=885006.6666666666, ans=0.125 2023-10-02 12:49:02,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=885206.6666666666, ans=0.1 2023-10-02 12:49:14,853 INFO [train.py:1046] (1/4) Epoch 25, batch 5300, loss[loss=0.1897, simple_loss=0.2507, pruned_loss=0.06434, over 23794.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2453, pruned_loss=0.04627, over 4707864.97 frames. ], batch size: 164, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:49:20,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.12 vs. 
limit=15.0 2023-10-02 12:49:25,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=885273.3333333334, ans=0.125 2023-10-02 12:49:28,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=885340.0, ans=0.125 2023-10-02 12:49:28,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=885340.0, ans=0.2 2023-10-02 12:49:29,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:49:29,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 12:49:29,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 12:49:29,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:29,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:29,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:29,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:29,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:30,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:49:30,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:30,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:49:30,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:49:30,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 12:49:30,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 12:49:30,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 12:49:30,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:49:30,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 12:49:30,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 12:49:30,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:31,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:31,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:49:31,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:49:31,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:49:31,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:49:31,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:31,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:49:32,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:32,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:49:32,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:32,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:49:32,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 12:49:32,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 12:49:33,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:33,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 12:49:33,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 12:49:33,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:49:33,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:49:33,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 12:49:33,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 12:49:33,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:49:33,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:49:33,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:49:34,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 12:49:34,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. 
Duration: 0.96 2023-10-02 12:49:34,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:49:34,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:34,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 12:49:34,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 12:49:34,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 12:49:35,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:49:40,832 INFO [train.py:1046] (1/4) Epoch 26, batch 0, loss[loss=0.1629, simple_loss=0.243, pruned_loss=0.04139, over 24480.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.243, pruned_loss=0.04139, over 24480.00 frames. ], batch size: 63, lr: 3.98e-03, grad_scale: 32.0 2023-10-02 12:49:40,832 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 12:49:53,975 INFO [train.py:1078] (1/4) Epoch 26, validation: loss=0.3276, simple_loss=0.28, pruned_loss=0.1876, over 1125622.00 frames. 2023-10-02 12:49:53,976 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 12:49:57,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 12:49:58,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 12:50:00,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:50:06,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:06,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:50:07,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:07,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 12:50:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 12:50:10,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:12,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:16,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:16,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:17,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:50:17,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 12:50:18,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 12:50:20,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:50:24,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=885486.6666666666, ans=0.0 2023-10-02 12:50:27,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:50:28,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:31,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 12:50:33,076 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.877e+02 2.109e+02 2.369e+02 3.021e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-02 12:50:36,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:50:36,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:50:38,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:50:42,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:50:45,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:50:50,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 12:50:54,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 12:50:54,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:50:54,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:50:55,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:50:55,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:58,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 12:51:00,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:51:00,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:51:02,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:51:06,257 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 12:51:07,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 12:51:08,980 INFO [train.py:1046] (1/4) Epoch 26, batch 50, loss[loss=0.1743, simple_loss=0.2536, pruned_loss=0.04749, over 23369.00 frames. ], tot_loss[loss=0.168, simple_loss=0.247, pruned_loss=0.04446, over 1070261.93 frames. ], batch size: 106, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:51:12,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:51:13,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:51:13,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 12:51:15,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:51:15,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:51:16,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:51:17,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:51:19,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:51:19,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=885686.6666666666, ans=0.125 2023-10-02 12:51:21,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. 
Duration: 0.68 2023-10-02 12:51:21,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:24,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=885753.3333333334, ans=0.125 2023-10-02 12:51:28,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:51:30,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=885753.3333333334, ans=0.125 2023-10-02 12:51:31,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 12:51:32,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 12:51:34,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:51:36,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:51:36,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:37,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:51:37,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:51:38,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 12:51:38,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:46,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:51:47,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:51:47,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:51:48,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 12:51:51,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:51:53,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:51:53,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 12:51:53,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:51:54,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 12:52:02,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:02,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 12:52:04,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:04,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:52:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:52:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 12:52:07,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 12:52:09,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:09,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:52:09,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:52:10,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:52:10,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 12:52:10,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 12:52:12,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 12:52:15,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:15,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:52:16,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 12:52:16,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 12:52:17,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:18,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:52:19,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:52:19,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=885953.3333333334, ans=0.1 2023-10-02 12:52:20,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:52:22,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:52:23,552 INFO [train.py:1046] (1/4) Epoch 26, batch 100, loss[loss=0.1587, simple_loss=0.2381, pruned_loss=0.03965, over 23479.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.247, pruned_loss=0.04494, over 1885610.93 frames. 
], batch size: 134, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:52:26,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:52:26,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=886020.0, ans=0.125 2023-10-02 12:52:29,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:52:32,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 12:52:32,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:35,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:52:36,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:52:36,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:52:36,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:52:36,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:52:38,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. 
Duration: 0.73 2023-10-02 12:52:39,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:52:39,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:40,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=16.69 vs. limit=15.0 2023-10-02 12:52:41,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:41,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:52:45,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 12:52:46,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:49,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:52:50,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:52:54,669 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 12:52:54,695 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. 
Duration: 0.896 2023-10-02 12:52:54,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=886153.3333333334, ans=0.5 2023-10-02 12:52:56,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:52:56,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:52:59,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:52:59,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=886153.3333333334, ans=0.125 2023-10-02 12:53:02,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:53:04,021 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.882e+02 2.043e+02 2.257e+02 3.571e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 12:53:04,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:08,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:09,997 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 12:53:11,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 12:53:15,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 12:53:16,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:53:19,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:22,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:24,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:53:26,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:53:30,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:30,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:53:31,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:31,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:53:31,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:33,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 12:53:33,067 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. 
Duration: 0.896 2023-10-02 12:53:33,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:53:35,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:35,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:35,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:53:35,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:53:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:53:36,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:36,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:53:37,848 INFO [train.py:1046] (1/4) Epoch 26, batch 150, loss[loss=0.1766, simple_loss=0.2667, pruned_loss=0.0432, over 24655.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2469, pruned_loss=0.04456, over 2523075.84 frames. ], batch size: 73, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:53:39,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 12:53:39,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:53:39,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:53:42,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:45,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:53:45,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:53:46,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:49,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:49,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:52,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:53:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:53,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=886420.0, ans=0.125 2023-10-02 12:53:57,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. 
Duration: 0.65 2023-10-02 12:53:57,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 12:53:57,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 12:54:00,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:54:00,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:54:00,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:54:01,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:54:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:02,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:02,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:03,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 12:54:04,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:12,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 12:54:15,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:54:16,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 12:54:19,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:54:19,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:54:19,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:54:22,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:54:24,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:54:24,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:54:25,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:25,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=886553.3333333334, ans=0.125 2023-10-02 12:54:26,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-10-02 12:54:30,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:30,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:54:30,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:54:31,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=886553.3333333334, ans=0.0 2023-10-02 12:54:32,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:54:33,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.14 vs. limit=22.5 2023-10-02 12:54:34,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:37,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 12:54:38,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:54:40,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:54:40,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:54:42,048 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.00 vs. limit=15.0 2023-10-02 12:54:43,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:54:43,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 12:54:43,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=886620.0, ans=0.125 2023-10-02 12:54:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:54:44,770 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 12:54:49,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:50,663 INFO [train.py:1046] (1/4) Epoch 26, batch 200, loss[loss=0.1536, simple_loss=0.2273, pruned_loss=0.03989, over 24454.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.248, pruned_loss=0.04533, over 3021563.26 frames. ], batch size: 58, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:54:52,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:52,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:54:55,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. 
Duration: 0.92 2023-10-02 12:54:55,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:54:56,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:54:58,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 12:55:00,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:55:02,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=886686.6666666666, ans=0.125 2023-10-02 12:55:03,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:03,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:08,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:55:08,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:55:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:23,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=886820.0, ans=0.0 2023-10-02 12:55:26,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 12:55:26,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:55:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:55:30,115 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.774e+02 1.970e+02 2.261e+02 2.926e+02, threshold=3.941e+02, percent-clipped=0.0 2023-10-02 12:55:30,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:55:31,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:55:31,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:55:32,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:34,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:55:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:55:34,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:55:35,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. 
Duration: 0.92 2023-10-02 12:55:36,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.87 vs. limit=22.5 2023-10-02 12:55:37,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:55:37,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:55:47,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:55:48,298 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.85 vs. limit=15.0 2023-10-02 12:55:55,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:55,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:55:56,433 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.57 vs. limit=15.0 2023-10-02 12:56:01,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:03,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. 
Duration: 0.59 2023-10-02 12:56:04,297 INFO [train.py:1046] (1/4) Epoch 26, batch 250, loss[loss=0.154, simple_loss=0.2227, pruned_loss=0.0427, over 23558.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2469, pruned_loss=0.04513, over 3407335.21 frames. ], batch size: 135, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:56:04,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:56:04,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:56:04,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:56:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:56:05,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 12:56:07,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:56:09,076 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 12:56:10,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:10,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 12:56:10,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=887020.0, ans=0.0 2023-10-02 12:56:11,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:11,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:56:12,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=887020.0, ans=0.0 2023-10-02 12:56:15,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:56:15,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:16,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:56:18,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=887086.6666666666, ans=0.125 2023-10-02 12:56:21,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:56:30,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:56:32,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:56:32,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 12:56:40,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:56:41,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:56:41,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:56:42,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:56:42,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:56:43,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:56:44,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:56:47,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:56:47,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=887220.0, ans=0.125 2023-10-02 12:56:49,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 12:56:49,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.84 vs. limit=10.0 2023-10-02 12:56:50,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:56:51,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:56:52,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:56:52,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:56:53,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:56:53,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:56:55,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:56:56,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:56:57,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.69 vs. limit=15.0 2023-10-02 12:56:57,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 12:56:58,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=887220.0, ans=0.0 2023-10-02 12:56:58,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=887220.0, ans=0.1 2023-10-02 12:56:59,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:57:05,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:57:10,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:57:13,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:14,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:57:15,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 12:57:17,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:57:17,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:57:18,992 INFO [train.py:1046] (1/4) Epoch 26, batch 300, loss[loss=0.1506, simple_loss=0.2083, pruned_loss=0.04642, over 19147.00 frames. 
], tot_loss[loss=0.1675, simple_loss=0.2455, pruned_loss=0.04471, over 3699009.14 frames. ], batch size: 388, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 12:57:19,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 12:57:19,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:57:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:57:22,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 12:57:22,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=887353.3333333334, ans=0.125 2023-10-02 12:57:27,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:57:29,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:57:32,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:57:32,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 12:57:34,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:35,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 12:57:35,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 12:57:35,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:57:40,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:57:44,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:57:44,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 12:57:46,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=12.0 2023-10-02 12:57:49,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 12:57:49,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:57:51,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:57:53,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:57:53,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 12:57:53,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 12:57:55,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:57:56,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:57:57,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:57:59,150 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.798e+02 1.983e+02 2.227e+02 3.244e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-02 12:58:01,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:58:01,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 12:58:02,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:58:05,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:05,999 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.22 vs. limit=22.5 2023-10-02 12:58:06,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 12:58:06,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 12:58:10,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-10-02 12:58:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:58:14,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:58:14,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 12:58:18,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:18,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:58:21,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:21,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:58:23,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 12:58:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:58:25,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:58:26,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-10-02 12:58:26,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:26,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:26,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=887620.0, ans=0.125 2023-10-02 12:58:28,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:58:29,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:29,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=887620.0, ans=0.1 2023-10-02 12:58:30,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:33,280 INFO [train.py:1046] (1/4) Epoch 26, batch 350, loss[loss=0.1808, simple_loss=0.2477, pruned_loss=0.05696, over 23836.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2442, pruned_loss=0.04454, over 3921359.82 frames. ], batch size: 195, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 12:58:35,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:58:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-02 12:58:36,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=887686.6666666666, ans=0.125 2023-10-02 12:58:38,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:39,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=887686.6666666666, ans=0.125 2023-10-02 12:58:44,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:58:47,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:47,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:50,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 12:58:51,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:58:51,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 12:58:54,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:54,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. 
Duration: 0.59 2023-10-02 12:58:54,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:58:57,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 12:58:59,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:59:00,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:59:00,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:59:02,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:03,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:03,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:59:03,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:05,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:59:06,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:59:06,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 12:59:12,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:59:12,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:59:12,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:59:14,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:19,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=887886.6666666666, ans=0.2 2023-10-02 12:59:20,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 12:59:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:59:24,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:24,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:24,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:59:25,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 12:59:28,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 12:59:29,363 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 12:59:30,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 12:59:30,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:33,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:59:33,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 12:59:35,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:40,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:59:41,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:42,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:42,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:44,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:48,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 12:59:49,264 INFO [train.py:1046] (1/4) Epoch 26, batch 400, loss[loss=0.1648, simple_loss=0.242, pruned_loss=0.04381, over 23512.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2442, pruned_loss=0.04448, over 4098937.93 frames. ], batch size: 134, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 12:59:49,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=888020.0, ans=0.125 2023-10-02 12:59:50,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:59:51,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=12.0 2023-10-02 12:59:51,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 12:59:51,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:53,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:59:56,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:59:56,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:59:56,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=888020.0, ans=0.0 2023-10-02 12:59:58,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 13:00:00,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:00,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=888020.0, ans=0.125 2023-10-02 13:00:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 13:00:03,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 13:00:03,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:00:03,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 13:00:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:09,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:00:09,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:09,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 13:00:11,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:00:11,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:00:11,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:12,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:00:15,747 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 13:00:15,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 13:00:20,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:00:21,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:23,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 13:00:23,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 13:00:23,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=888153.3333333334, ans=0.1 2023-10-02 13:00:26,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 13:00:26,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=888153.3333333334, ans=0.2 2023-10-02 13:00:28,742 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.840e+02 2.026e+02 2.260e+02 3.455e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-02 13:00:28,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:00:35,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 13:00:37,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:00:38,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=888220.0, ans=0.125 2023-10-02 13:00:39,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 13:00:42,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:43,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:00:43,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 13:00:46,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:00:48,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 13:00:51,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:51,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=888286.6666666666, ans=0.0 2023-10-02 13:00:54,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:00:54,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 13:00:55,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:00:55,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 13:00:58,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:00:58,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:00:58,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=888286.6666666666, ans=0.2 2023-10-02 13:01:00,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 13:01:00,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:01:02,156 INFO [train.py:1046] (1/4) Epoch 26, batch 450, loss[loss=0.1869, simple_loss=0.2542, pruned_loss=0.0598, over 22700.00 frames. 
], tot_loss[loss=0.1675, simple_loss=0.2451, pruned_loss=0.04497, over 4239121.13 frames. ], batch size: 322, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 13:01:02,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:01:02,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:01:03,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 13:01:05,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:01:05,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:01:07,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:01:07,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 13:01:07,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:01:08,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:01:10,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:01:22,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 13:01:22,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:01:25,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 13:01:25,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 13:01:28,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:01:29,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:30,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:01:33,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:01:35,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:01:38,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 13:01:38,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 13:01:39,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 13:01:39,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 13:01:41,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:01:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:01:43,946 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 13:01:43,955 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 13:01:43,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:45,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:01:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:01:50,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:01:50,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:01:50,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:01:51,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 13:01:54,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 13:01:55,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:01:55,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:01:57,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 13:02:01,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:02:02,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 13:02:03,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.80 vs. limit=22.5 2023-10-02 13:02:04,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 13:02:04,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:02:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:02:10,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:02:10,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:02:10,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-10-02 13:02:10,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=888620.0, ans=0.035 2023-10-02 13:02:13,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:02:15,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:02:15,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=888686.6666666666, ans=0.2 2023-10-02 13:02:16,861 INFO [train.py:1046] (1/4) Epoch 26, batch 500, loss[loss=0.1687, simple_loss=0.2547, pruned_loss=0.04137, over 23986.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2454, pruned_loss=0.04501, over 4338108.91 frames. ], batch size: 80, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 13:02:16,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:02:16,946 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 13:02:18,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 13:02:18,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:02:21,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:02:26,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 13:02:27,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:02:29,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:02:29,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:02:30,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:38,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:38,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:02:38,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:02:38,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:40,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 13:02:40,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:02:40,861 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.70 vs. 
limit=10.0 2023-10-02 13:02:41,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:02:41,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=888753.3333333334, ans=0.0 2023-10-02 13:02:43,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:02:44,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:02:44,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:44,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 13:02:49,390 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 13:02:52,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:02:53,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:55,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:55,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:56,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. 
Duration: 0.636375 2023-10-02 13:02:57,650 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.792e+02 1.964e+02 2.180e+02 2.813e+02, threshold=3.927e+02, percent-clipped=0.0 2023-10-02 13:02:57,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 13:03:00,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:03:00,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:05,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:03:13,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.45 vs. limit=10.0 2023-10-02 13:03:16,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:03:21,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 13:03:21,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:21,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 13:03:23,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 13:03:23,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:03:25,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:30,036 INFO [train.py:1046] (1/4) Epoch 26, batch 550, loss[loss=0.1819, simple_loss=0.2578, pruned_loss=0.05304, over 23390.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.246, pruned_loss=0.0454, over 4430120.23 frames. ], batch size: 285, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:03:30,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 13:03:31,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 13:03:31,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:31,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 13:03:32,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:03:32,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:34,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 13:03:35,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:35,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:03:35,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:03:38,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:38,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 13:03:38,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:03:44,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:03:44,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:47,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:03:47,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:52,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 13:03:52,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. 
Duration: 0.97 2023-10-02 13:03:55,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:03:59,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:03:59,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:04:02,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:04:03,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:03,966 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 13:04:05,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:04:05,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=889153.3333333334, ans=0.5 2023-10-02 13:04:06,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:04:08,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:04:10,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:04:10,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 13:04:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:12,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 13:04:14,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 13:04:15,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:15,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:04:15,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:04:15,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:04:19,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:04:19,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:04:19,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=889220.0, ans=0.125 2023-10-02 13:04:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:04:24,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:04:24,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 13:04:25,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:04:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:28,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:04:28,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:29,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:04:29,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 13:04:34,325 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-10-02 13:04:35,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 13:04:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 13:04:41,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:04:41,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 13:04:41,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:44,872 INFO [train.py:1046] (1/4) Epoch 26, batch 600, loss[loss=0.1499, simple_loss=0.2311, pruned_loss=0.03433, over 24311.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2458, pruned_loss=0.04501, over 4499931.95 frames. ], batch size: 61, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:04:50,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:04:52,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=889353.3333333334, ans=0.125 2023-10-02 13:04:54,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:04:54,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 13:04:55,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:04:57,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:04:58,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 13:05:02,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:05:07,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.07 vs. limit=10.0 2023-10-02 13:05:08,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 13:05:10,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:05:10,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:10,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:05:11,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=889420.0, ans=0.125 2023-10-02 13:05:18,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:05:18,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:05:18,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:05:20,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=889486.6666666666, ans=0.125 2023-10-02 13:05:21,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.20 vs. 
limit=6.0 2023-10-02 13:05:22,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.26 vs. limit=15.0 2023-10-02 13:05:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:05:28,840 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.874e+02 2.049e+02 2.312e+02 3.828e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 13:05:29,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:05:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:05:29,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:35,315 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.82 vs. limit=15.0 2023-10-02 13:05:37,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 13:05:42,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:05:42,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:05:47,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. 
Duration: 0.89 2023-10-02 13:05:47,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:05:50,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 13:05:52,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:05:52,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:05:58,316 INFO [train.py:1046] (1/4) Epoch 26, batch 650, loss[loss=0.1617, simple_loss=0.2193, pruned_loss=0.05204, over 22671.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.245, pruned_loss=0.04513, over 4542968.75 frames. ], batch size: 322, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:05:58,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 13:05:59,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:06:01,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:06:02,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:06:05,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:06,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. 
Duration: 0.72 2023-10-02 13:06:08,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:06:12,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:06:12,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:15,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=889753.3333333334, ans=0.0 2023-10-02 13:06:17,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:20,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 13:06:21,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:06:22,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:25,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:06:25,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=889753.3333333334, ans=0.0 2023-10-02 13:06:26,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:06:28,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:06:28,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:29,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:06:31,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:32,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:06:33,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:06:35,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 13:06:35,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:35,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:06:38,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:39,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:06:39,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:06:39,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 13:06:40,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 13:06:40,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:06:41,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=889886.6666666666, ans=0.07 2023-10-02 13:06:42,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:06:43,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:06:43,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:06:45,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:06:45,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 13:06:47,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 13:06:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:47,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:06:47,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 13:06:47,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:06:49,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=12.0 2023-10-02 13:06:49,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:55,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:55,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:06:57,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:07:00,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:07:00,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:07:00,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:07:07,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:07:08,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:07:08,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:07:08,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:08,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=889953.3333333334, ans=0.125 2023-10-02 13:07:11,131 INFO [train.py:1046] (1/4) Epoch 26, batch 700, loss[loss=0.1519, simple_loss=0.2194, pruned_loss=0.04218, over 18565.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2435, pruned_loss=0.04505, over 4565650.73 frames. ], batch size: 40, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:07:14,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 13:07:17,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 13:07:19,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 13:07:19,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:20,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:07:22,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. 
Duration: 0.99 2023-10-02 13:07:22,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=890020.0, ans=0.0 2023-10-02 13:07:26,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:07:28,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:07:30,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:31,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:07:31,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:07:33,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:37,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 13:07:37,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:07:39,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 13:07:42,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-02 13:07:43,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=890153.3333333334, ans=0.0 2023-10-02 13:07:45,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:07:46,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:07:47,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:07:47,599 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:07:49,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.70 vs. limit=15.0 2023-10-02 13:07:51,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:07:51,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 13:07:56,015 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.890e+02 2.209e+02 2.852e+02 4.841e+02, threshold=4.419e+02, percent-clipped=5.0 2023-10-02 13:07:56,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:57,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:07:57,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-10-02 13:08:03,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:08:04,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:07,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:11,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:08:13,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 13:08:16,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 13:08:17,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 13:08:18,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:19,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:08:20,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:08:22,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:22,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. 
Duration: 0.95 2023-10-02 13:08:25,906 INFO [train.py:1046] (1/4) Epoch 26, batch 750, loss[loss=0.158, simple_loss=0.2364, pruned_loss=0.03983, over 24348.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2429, pruned_loss=0.045, over 4589425.50 frames. ], batch size: 61, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:08:26,814 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.19 vs. limit=22.5 2023-10-02 13:08:27,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 13:08:28,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 13:08:28,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 13:08:28,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 13:08:29,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 13:08:29,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:08:30,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=890353.3333333334, ans=0.09899494936611666 2023-10-02 13:08:31,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 13:08:33,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 13:08:33,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:08:34,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:08:37,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:37,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:08:37,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:08:37,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=890353.3333333334, ans=0.125 2023-10-02 13:08:39,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:08:40,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:08:43,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:08:46,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:08:46,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:46,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-10-02 13:08:47,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=890420.0, ans=0.125 2023-10-02 13:08:47,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=890420.0, ans=0.0 2023-10-02 13:08:47,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=890420.0, ans=0.125 2023-10-02 13:08:49,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:08:49,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:50,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:52,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:08:53,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 13:08:53,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:08:55,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 13:08:55,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 13:08:56,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-10-02 13:08:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:08:56,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:08:59,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:09:07,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:09:07,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:07,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:09:08,007 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-10-02 13:09:09,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:09:10,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:10,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 13:09:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:09:11,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. 
Duration: 0.488875 2023-10-02 13:09:11,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:09:15,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:09:15,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 13:09:16,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:19,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:09:19,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=890553.3333333334, ans=0.125 2023-10-02 13:09:22,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:09:22,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:23,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:09:28,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 13:09:28,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:09:28,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 13:09:30,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:09:30,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:34,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:34,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:09:39,951 INFO [train.py:1046] (1/4) Epoch 26, batch 800, loss[loss=0.157, simple_loss=0.2322, pruned_loss=0.04086, over 20859.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2438, pruned_loss=0.04553, over 4610256.46 frames. ], batch size: 45, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:09:43,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:43,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:45,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:09:45,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:48,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:48,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:09:49,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:53,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:09:53,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:09:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 13:09:57,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:58,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:58,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:10:00,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:00,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 13:10:01,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:01,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 13:10:03,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:10:04,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=890753.3333333334, ans=15.0 2023-10-02 13:10:05,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:06,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:10:06,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:10,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:10,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:12,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.52 vs. limit=6.0 2023-10-02 13:10:12,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:10:14,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:10:14,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 13:10:17,308 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-02 13:10:17,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 13:10:17,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:10:18,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:10:20,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:20,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:10:24,220 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.808e+02 1.916e+02 2.137e+02 2.899e+02, threshold=3.832e+02, percent-clipped=0.0 2023-10-02 13:10:25,737 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 13:10:25,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 13:10:27,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:10:29,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:10:31,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=890886.6666666666, ans=0.125 2023-10-02 13:10:32,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 13:10:37,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:39,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 13:10:39,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:10:41,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 13:10:43,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=890953.3333333334, ans=0.2 2023-10-02 13:10:43,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=15.0 2023-10-02 13:10:47,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:10:50,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:10:50,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 13:10:51,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:10:53,189 INFO [train.py:1046] (1/4) Epoch 26, batch 850, loss[loss=0.1805, simple_loss=0.2513, pruned_loss=0.05487, over 23802.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2452, pruned_loss=0.04597, over 4631795.53 frames. 
], batch size: 212, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:10:53,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:54,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 13:10:54,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:10:55,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:56,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=891020.0, ans=0.125 2023-10-02 13:10:57,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:58,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:11:00,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:11:01,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 13:11:02,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 13:11:02,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 13:11:04,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 13:11:04,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:11:07,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:11:08,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:11:12,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:11:12,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:12,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 13:11:15,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 13:11:18,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:11:19,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 13:11:22,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 13:11:24,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. 
Duration: 0.8 2023-10-02 13:11:25,521 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 13:11:25,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:11:25,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:11:25,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:11:28,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:31,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:32,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 13:11:34,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:11:34,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:37,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:11:37,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:11:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 13:11:40,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:11:40,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 13:11:46,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:11:46,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:11:47,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:11:47,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:11:47,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:49,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:51,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:11:53,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:11:53,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:11:55,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 13:11:58,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=891286.6666666666, ans=0.1 2023-10-02 13:11:59,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:12:00,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:12:02,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 13:12:02,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:12:02,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:12:02,802 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:12:05,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 13:12:07,824 INFO [train.py:1046] (1/4) Epoch 26, batch 900, loss[loss=0.187, simple_loss=0.258, pruned_loss=0.05795, over 23570.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2466, pruned_loss=0.0466, over 4648679.90 frames. ], batch size: 256, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:12:10,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:12:13,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:12:13,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 13:12:16,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:12:16,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 13:12:18,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:12:19,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:12:19,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:12:21,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:12:21,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:12:24,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=891420.0, ans=0.1 2023-10-02 13:12:29,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:12:29,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:12:31,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 13:12:32,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:12:39,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 13:12:41,027 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.07 vs. limit=15.0 2023-10-02 13:12:41,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:12:46,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:12:46,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:12:46,621 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 13:12:47,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 13:12:51,898 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.855e+02 2.022e+02 2.290e+02 3.129e+02, threshold=4.044e+02, percent-clipped=0.0 2023-10-02 13:12:55,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:12:55,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:12:55,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 13:13:00,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:03,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 13:13:03,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:13:03,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 13:13:06,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:13:06,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:08,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:13:08,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:12,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 13:13:12,750 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. 
Duration: 0.91 2023-10-02 13:13:13,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=891620.0, ans=0.125 2023-10-02 13:13:16,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:13:16,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 13:13:18,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:20,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 13:13:21,666 INFO [train.py:1046] (1/4) Epoch 26, batch 950, loss[loss=0.163, simple_loss=0.2376, pruned_loss=0.04426, over 24280.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2467, pruned_loss=0.04679, over 4666386.14 frames. ], batch size: 56, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:13:26,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:29,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:29,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:30,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:13:33,411 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-10-02 13:13:36,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:36,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:13:37,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:37,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:13:37,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 13:13:37,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=891753.3333333334, ans=0.1 2023-10-02 13:13:39,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:13:42,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:42,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 13:13:44,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:45,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:45,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 13:13:45,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:47,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 13:13:49,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 13:13:50,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.58 vs. limit=15.0 2023-10-02 13:13:51,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:13:52,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:13:57,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:13:58,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:58,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=891820.0, ans=0.0 2023-10-02 13:14:00,569 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=12.17 vs. limit=15.0 2023-10-02 13:14:02,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 13:14:04,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-02 13:14:04,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:14:04,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:04,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:04,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:14:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 13:14:10,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:14:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:14,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 13:14:14,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:14:14,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:14:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-02 13:14:19,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:14:20,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:14:26,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:14:29,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 13:14:29,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 13:14:32,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:34,914 INFO [train.py:1046] (1/4) Epoch 26, batch 1000, loss[loss=0.1725, simple_loss=0.262, pruned_loss=0.04146, over 24428.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2455, pruned_loss=0.04635, over 4674513.45 frames. ], batch size: 69, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:14:35,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 13:14:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:14:42,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:14:43,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. 
Duration: 0.86 2023-10-02 13:14:43,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 13:14:47,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=892020.0, ans=0.025 2023-10-02 13:14:49,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:14:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:14:50,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 13:14:58,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 13:14:59,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 13:14:59,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:02,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 13:15:03,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 13:15:03,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. 
Duration: 0.64 2023-10-02 13:15:04,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=12.0 2023-10-02 13:15:05,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:06,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:12,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:15:13,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:15:15,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:15,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:15,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 13:15:15,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:17,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:15:18,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 13:15:19,620 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.933e+02 2.169e+02 2.725e+02 4.611e+02, threshold=4.339e+02, percent-clipped=3.0 2023-10-02 13:15:19,709 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 13:15:23,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 13:15:24,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 13:15:25,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 13:15:26,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:15:34,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:34,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:15:34,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:35,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:15:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 13:15:38,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 13:15:38,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 13:15:40,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 13:15:41,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:15:41,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:43,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:15:46,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:15:46,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:50,044 INFO [train.py:1046] (1/4) Epoch 26, batch 1050, loss[loss=0.1612, simple_loss=0.2394, pruned_loss=0.0415, over 24460.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2446, pruned_loss=0.04625, over 4679080.71 frames. ], batch size: 58, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:15:51,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:15:51,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:15:54,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 13:15:54,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:55,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:15:58,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:16:00,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:16:03,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:16:04,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:16:04,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:16:04,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:16:04,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=892420.0, ans=0.07 2023-10-02 13:16:06,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 13:16:06,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:16:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. 
Duration: 0.84 2023-10-02 13:16:10,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:16:10,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 13:16:10,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:16:16,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:16:16,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:16:16,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:16:20,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 13:16:21,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 13:16:21,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:16:25,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 13:16:28,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 13:16:29,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:16:32,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:16:35,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:16:35,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:16:36,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:16:40,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:16:43,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 13:16:45,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 13:16:45,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 13:16:47,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:16:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:16:48,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. 
Duration: 0.88 2023-10-02 13:16:50,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=892620.0, ans=0.125 2023-10-02 13:16:52,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:16:55,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:16:55,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:16:55,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:16:56,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:00,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:00,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 13:17:00,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:17:02,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 13:17:02,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 13:17:04,037 INFO [train.py:1046] (1/4) Epoch 26, batch 1100, loss[loss=0.1445, simple_loss=0.22, pruned_loss=0.03452, over 22206.00 frames. 
], tot_loss[loss=0.1677, simple_loss=0.2442, pruned_loss=0.04559, over 4692048.90 frames. ], batch size: 49, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:17:04,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:17:06,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:17:11,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:17:15,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:17:16,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:17:16,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:17:16,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 13:17:17,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:17:21,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:17:22,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:17:24,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 13:17:24,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 13:17:25,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:17:26,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:17:27,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:17:29,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:17:31,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:17:35,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:17:39,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 13:17:39,866 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 13:17:41,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:42,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:43,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 13:17:44,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:17:46,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 13:17:48,055 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.826e+02 2.020e+02 2.449e+02 3.878e+02, threshold=4.041e+02, percent-clipped=0.0 2023-10-02 13:17:48,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:17:48,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:17:48,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:17:50,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:50,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 13:17:54,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:17:54,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 13:17:55,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=892886.6666666666, ans=0.125 2023-10-02 13:17:57,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 13:18:00,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=892953.3333333334, ans=0.05 2023-10-02 13:18:01,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:18:03,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=892953.3333333334, ans=0.2 2023-10-02 13:18:05,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 13:18:06,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:18:07,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:07,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=892953.3333333334, ans=0.125 2023-10-02 13:18:09,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:18:10,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:18:10,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 13:18:11,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 13:18:11,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:18:13,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 13:18:13,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:18:13,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 13:18:14,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:18:14,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:18:16,035 INFO [train.py:1046] (1/4) Epoch 26, batch 1150, loss[loss=0.1893, simple_loss=0.2596, pruned_loss=0.05945, over 23879.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2445, pruned_loss=0.04508, over 4712299.27 frames. ], batch size: 195, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:18:16,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:18:19,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=893020.0, ans=0.2 2023-10-02 13:18:20,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:21,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.89 vs. 
limit=15.0 2023-10-02 13:18:24,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:18:25,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:18:26,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:18:26,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 13:18:26,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:18:29,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 13:18:31,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:31,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:18:37,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 13:18:39,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:43,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:44,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.19 vs. 
limit=6.0 2023-10-02 13:18:45,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:18:45,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 13:18:45,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:18:45,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:18:48,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 13:18:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:51,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:19:03,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:19:05,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=893220.0, ans=0.2 2023-10-02 13:19:08,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=893220.0, ans=0.0 2023-10-02 13:19:10,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:19:10,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. 
Duration: 0.86 2023-10-02 13:19:10,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:12,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:16,329 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 13:19:17,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:23,632 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 13:19:23,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=893286.6666666666, ans=0.125 2023-10-02 13:19:28,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:19:28,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:19:29,670 INFO [train.py:1046] (1/4) Epoch 26, batch 1200, loss[loss=0.1731, simple_loss=0.2463, pruned_loss=0.04997, over 23502.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2452, pruned_loss=0.0453, over 4710245.90 frames. ], batch size: 256, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:19:29,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:19:29,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 13:19:34,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:19:39,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:19:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:19:39,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=893353.3333333334, ans=0.2 2023-10-02 13:19:41,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:19:41,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:19:41,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:19:44,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:19:46,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:19:46,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:19:46,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:19:46,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=893420.0, ans=0.125 2023-10-02 13:19:48,944 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 13:19:50,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 13:19:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:19:56,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:19:58,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:20:01,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:20:01,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 13:20:01,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:20:08,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:20:08,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:20:10,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. 
Duration: 0.94 2023-10-02 13:20:10,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:20:10,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=893486.6666666666, ans=0.125 2023-10-02 13:20:14,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 13:20:15,604 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.901e+02 2.098e+02 2.367e+02 3.990e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-02 13:20:18,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 13:20:18,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:20:19,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:20:21,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:20:21,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:20:21,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:20:21,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 13:20:23,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:20:23,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 13:20:23,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:20:23,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:20:23,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:20:24,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=893553.3333333334, ans=0.125 2023-10-02 13:20:26,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:20:26,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:20:30,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:20:31,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:20:33,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 13:20:38,028 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. 
Duration: 0.9935 2023-10-02 13:20:39,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:20:43,360 INFO [train.py:1046] (1/4) Epoch 26, batch 1250, loss[loss=0.1721, simple_loss=0.2436, pruned_loss=0.05029, over 23676.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.247, pruned_loss=0.04631, over 4697969.55 frames. ], batch size: 149, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:20:43,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:20:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:20:44,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:20:47,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 13:20:47,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=893686.6666666666, ans=0.125 2023-10-02 13:20:52,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:20:53,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:20:53,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 13:20:54,429 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.17 vs. 
limit=6.0 2023-10-02 13:20:55,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:20:56,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.73 vs. limit=5.0 2023-10-02 13:20:57,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:20:57,833 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.56 vs. limit=15.0 2023-10-02 13:20:59,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:21:01,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:21:02,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:21:02,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:21:03,732 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.85 vs. limit=6.0 2023-10-02 13:21:05,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 13:21:06,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=893753.3333333334, ans=0.0 2023-10-02 13:21:07,305 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-10-02 13:21:09,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=893753.3333333334, ans=0.125 2023-10-02 13:21:11,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 13:21:11,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:21:11,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:21:13,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:21:14,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:16,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:16,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=893820.0, ans=0.125 2023-10-02 13:21:17,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 13:21:21,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 13:21:21,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:21:24,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:21:26,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 13:21:26,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:21:26,391 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 13:21:28,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:28,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:30,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:32,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=893886.6666666666, ans=0.0 2023-10-02 13:21:33,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:33,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 13:21:35,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 13:21:35,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 13:21:36,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 13:21:37,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=15.0 2023-10-02 13:21:40,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:21:41,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 13:21:41,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:43,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 13:21:44,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:21:44,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=893953.3333333334, ans=0.125 2023-10-02 13:21:45,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 13:21:45,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 13:21:46,782 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.95 vs. limit=15.0 2023-10-02 13:21:47,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:21:47,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:21:48,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:21:51,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 13:21:53,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:21:53,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=893953.3333333334, ans=0.0 2023-10-02 13:21:55,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:21:55,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:21:57,244 INFO [train.py:1046] (1/4) Epoch 26, batch 1300, loss[loss=0.1837, simple_loss=0.2712, pruned_loss=0.04808, over 24354.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2475, pruned_loss=0.0463, over 4699170.46 frames. ], batch size: 77, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:21:58,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-10-02 13:22:02,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:22:02,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 13:22:06,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:22:08,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:22:09,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:22:11,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:22:12,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:22:14,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 13:22:18,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:22:19,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:22:21,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 13:22:23,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 13:22:25,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:22:27,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:22:28,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:22:30,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:22:30,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:22:31,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:22:31,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 13:22:37,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:22:38,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:22:40,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 13:22:42,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 13:22:43,835 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.906e+02 2.137e+02 2.529e+02 3.286e+02, threshold=4.274e+02, percent-clipped=0.0 2023-10-02 13:22:43,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:22:44,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=894220.0, ans=0.0 2023-10-02 13:22:45,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:22:45,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 13:22:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:22:46,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 13:22:48,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:22:51,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:22:51,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:22:55,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 13:22:55,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-10-02 13:22:57,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 13:22:59,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.16 vs. limit=15.0 2023-10-02 13:23:02,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:23:04,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 13:23:04,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=894286.6666666666, ans=0.125 2023-10-02 13:23:06,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:23:11,707 INFO [train.py:1046] (1/4) Epoch 26, batch 1350, loss[loss=0.1737, simple_loss=0.2618, pruned_loss=0.04276, over 24418.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2471, pruned_loss=0.04602, over 4701702.51 frames. ], batch size: 69, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:23:12,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 13:23:15,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:23:19,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:22,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 13:23:23,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:23:23,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=894353.3333333334, ans=0.125 2023-10-02 13:23:24,103 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.41 vs. limit=15.0 2023-10-02 13:23:24,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:23:25,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:23:27,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:23:30,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 13:23:30,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:23:31,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:23:33,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. 
Duration: 0.39 2023-10-02 13:23:33,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=894420.0, ans=0.2 2023-10-02 13:23:35,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:23:35,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=894420.0, ans=0.1 2023-10-02 13:23:36,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:23:37,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 13:23:37,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=894420.0, ans=0.0 2023-10-02 13:23:38,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 13:23:39,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 13:23:43,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:43,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 13:23:52,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:59,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:23:59,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:01,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 13:24:05,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:05,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 13:24:05,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:24:06,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:24:08,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:24:10,054 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:24:11,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 13:24:12,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=894620.0, ans=0.2 2023-10-02 13:24:14,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:24:21,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. 
Duration: 0.69 2023-10-02 13:24:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 13:24:25,931 INFO [train.py:1046] (1/4) Epoch 26, batch 1400, loss[loss=0.1727, simple_loss=0.2535, pruned_loss=0.04593, over 24455.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2461, pruned_loss=0.04532, over 4718426.12 frames. ], batch size: 63, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:24:26,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 13:24:27,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:31,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:24:31,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:24:37,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 13:24:39,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 13:24:45,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=894753.3333333334, ans=0.1 2023-10-02 13:24:49,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:24:51,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:24:53,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:24:53,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:24:55,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=894820.0, ans=0.125 2023-10-02 13:24:58,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:24:58,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 13:25:07,613 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:25:08,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:10,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:10,878 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.38 vs. 
limit=10.0 2023-10-02 13:25:11,513 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.828e+02 2.043e+02 2.426e+02 3.539e+02, threshold=4.086e+02, percent-clipped=0.0 2023-10-02 13:25:11,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=894886.6666666666, ans=0.1 2023-10-02 13:25:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 13:25:13,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:25:14,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:25:14,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:25:16,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:25:17,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:25:17,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:25:18,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:25:20,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 13:25:20,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 13:25:21,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.45 vs. limit=15.0 2023-10-02 13:25:23,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:24,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=894953.3333333334, ans=0.0 2023-10-02 13:25:27,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:25:31,066 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-10-02 13:25:31,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=894953.3333333334, ans=0.0 2023-10-02 13:25:33,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=894953.3333333334, ans=0.1 2023-10-02 13:25:34,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 13:25:36,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:25:36,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:25:37,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. 
Duration: 0.4333125 2023-10-02 13:25:38,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:25:39,393 INFO [train.py:1046] (1/4) Epoch 26, batch 1450, loss[loss=0.1393, simple_loss=0.2176, pruned_loss=0.03046, over 24293.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2451, pruned_loss=0.0451, over 4712853.90 frames. ], batch size: 56, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:25:39,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:25:43,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:25:45,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:25:45,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:45,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 13:25:48,844 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.17 vs. limit=15.0 2023-10-02 13:25:49,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:25:49,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:25:50,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:25:50,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 13:25:51,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:25:52,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 13:25:52,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:54,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:25:54,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 13:25:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:25:55,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:25:57,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 13:25:57,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:25:58,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:25:59,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 13:26:02,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:26:07,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:26:07,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:26:09,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:26:09,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:10,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:26:10,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:26:11,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:12,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:15,666 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.06 vs. limit=6.0 2023-10-02 13:26:16,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. 
Duration: 0.73 2023-10-02 13:26:17,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:26:20,773 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 13:26:21,783 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.91 vs. limit=10.0 2023-10-02 13:26:22,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:26:25,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:26:26,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:28,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 13:26:29,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:31,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 13:26:32,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 13:26:35,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:37,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 13:26:37,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:26:40,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 13:26:42,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 13:26:43,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 13:26:45,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:46,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:26:53,479 INFO [train.py:1046] (1/4) Epoch 26, batch 1500, loss[loss=0.1903, simple_loss=0.2604, pruned_loss=0.06009, over 22853.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2451, pruned_loss=0.04548, over 4706775.42 frames. ], batch size: 322, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:26:55,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 13:26:56,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:26:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:26:57,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.10 vs. 
limit=15.0 2023-10-02 13:26:58,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:26:59,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:27:00,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 13:27:01,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:27:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:27:02,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:27:03,077 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=12.0 2023-10-02 13:27:04,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:27:05,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=895353.3333333334, ans=0.07 2023-10-02 13:27:06,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:27:08,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 13:27:12,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=895420.0, ans=0.125 2023-10-02 13:27:14,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:27:14,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 13:27:14,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:27:14,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:27:16,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:27:16,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=895420.0, ans=0.1 2023-10-02 13:27:18,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 13:27:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 13:27:24,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:27:24,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 13:27:27,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 13:27:27,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=895486.6666666666, ans=0.04949747468305833 2023-10-02 13:27:28,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:27:29,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:27:29,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:27:30,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 13:27:30,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:27:30,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:27:31,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 13:27:31,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:27:32,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=895486.6666666666, ans=0.0 2023-10-02 13:27:36,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:27:36,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. 
Duration: 0.94 2023-10-02 13:27:39,721 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.823e+02 1.989e+02 2.187e+02 2.730e+02, threshold=3.978e+02, percent-clipped=0.0 2023-10-02 13:27:41,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:27:44,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:27:48,593 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 13:27:48,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:27:48,648 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 13:27:50,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:27:51,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:27:52,812 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 13:27:54,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:27:57,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 13:27:58,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:28:00,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=895620.0, ans=0.0 2023-10-02 13:28:01,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:28:01,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:03,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:28:03,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:03,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:28:06,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 13:28:06,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 13:28:07,718 INFO [train.py:1046] (1/4) Epoch 26, batch 1550, loss[loss=0.1527, simple_loss=0.2404, pruned_loss=0.03256, over 24432.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2459, pruned_loss=0.04559, over 4717470.92 frames. ], batch size: 69, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:28:07,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:28:07,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. 
Duration: 0.84 2023-10-02 13:28:09,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 13:28:09,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=895686.6666666666, ans=0.0 2023-10-02 13:28:10,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:28:12,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:12,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:28:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:28:15,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:16,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:19,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 13:28:19,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:19,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:28:19,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 13:28:22,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:28:22,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 13:28:24,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=895753.3333333334, ans=0.0 2023-10-02 13:28:25,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:28:25,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 13:28:25,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 13:28:25,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 13:28:25,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.58 vs. limit=15.0 2023-10-02 13:28:26,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:28,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:28:32,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 13:28:32,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=895753.3333333334, ans=0.07 2023-10-02 13:28:33,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 13:28:33,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 13:28:40,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.40 vs. limit=15.0 2023-10-02 13:28:41,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:28:41,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=895820.0, ans=0.0 2023-10-02 13:28:45,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:28:45,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:28:45,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:28:47,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 13:28:48,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=895820.0, ans=0.035 2023-10-02 13:28:52,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 13:28:54,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:56,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:28:59,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:29:01,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:29:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 13:29:01,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:29:02,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:29:02,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:29:02,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 13:29:02,892 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 13:29:05,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:11,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. 
Duration: 0.94 2023-10-02 13:29:16,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:29:17,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:29:18,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 13:29:21,074 INFO [train.py:1046] (1/4) Epoch 26, batch 1600, loss[loss=0.1638, simple_loss=0.2362, pruned_loss=0.04569, over 23286.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2472, pruned_loss=0.04623, over 4719417.87 frames. ], batch size: 119, lr: 3.96e-03, grad_scale: 32.0 2023-10-02 13:29:21,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:29:22,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:29:22,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:29:22,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:29:25,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:29:28,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:28,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. 
Duration: 0.67 2023-10-02 13:29:29,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 13:29:29,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 13:29:31,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:29:34,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 13:29:34,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:29:36,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:29:41,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:29:44,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 13:29:44,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=896086.6666666666, ans=0.125 2023-10-02 13:29:45,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:29:45,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 13:29:47,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:29:47,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=896086.6666666666, ans=0.125 2023-10-02 13:29:49,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 13:29:52,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=896153.3333333334, ans=0.0 2023-10-02 13:29:55,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 13:30:02,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:30:02,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 13:30:04,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:30:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:30:04,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:30:06,791 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.834e+02 2.040e+02 2.278e+02 3.104e+02, threshold=4.080e+02, percent-clipped=0.0 2023-10-02 13:30:06,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-02 13:30:10,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 13:30:12,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:30:12,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:14,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:14,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:30:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:30:18,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:30:20,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:30:23,094 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:30:26,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:26,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:30:27,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. 
Duration: 0.89 2023-10-02 13:30:27,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:30:30,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 13:30:34,796 INFO [train.py:1046] (1/4) Epoch 26, batch 1650, loss[loss=0.1807, simple_loss=0.2586, pruned_loss=0.05142, over 23511.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2479, pruned_loss=0.04633, over 4719881.77 frames. ], batch size: 93, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:30:36,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:30:37,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:30:37,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:30:37,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 13:30:38,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 13:30:38,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 13:30:38,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 13:30:42,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:30:43,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:30:43,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:30:43,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:30:45,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:30:46,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 13:30:47,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.96 vs. limit=15.0 2023-10-02 13:30:49,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:30:51,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:30:51,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:30:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:30:52,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 13:30:52,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-10-02 13:30:58,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:31:00,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:31:06,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=896486.6666666666, ans=0.0 2023-10-02 13:31:07,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 13:31:07,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:10,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 13:31:12,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:15,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=896486.6666666666, ans=0.125 2023-10-02 13:31:16,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:31:16,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:31:16,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 13:31:16,989 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.07 vs. limit=10.0 2023-10-02 13:31:19,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:31:19,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:22,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:31:23,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:23,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:31:23,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:31:25,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:31:26,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:31:27,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:31:29,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. 
Duration: 0.56 2023-10-02 13:31:29,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:31:29,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 13:31:33,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 13:31:33,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 13:31:33,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:31:34,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:31:35,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:37,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:37,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 13:31:37,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=896620.0, ans=0.125 2023-10-02 13:31:40,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:42,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 13:31:42,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:44,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 13:31:46,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.86 vs. limit=15.0 2023-10-02 13:31:48,797 INFO [train.py:1046] (1/4) Epoch 26, batch 1700, loss[loss=0.1481, simple_loss=0.2293, pruned_loss=0.03343, over 24317.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2473, pruned_loss=0.04633, over 4706061.13 frames. ], batch size: 61, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:31:50,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:50,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:31:50,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 13:31:50,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:31:50,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:31:50,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:31:53,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 13:31:53,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:31:54,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 13:31:56,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:32:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:32:07,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:32:11,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:32:12,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:32:13,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:32:13,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:32:14,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 13:32:17,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:32:18,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:32:21,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:32:22,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:32:25,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 13:32:25,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 13:32:25,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:27,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 13:32:28,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:32:31,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=896820.0, ans=0.125 2023-10-02 13:32:31,916 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.04 vs. limit=15.0 2023-10-02 13:32:35,979 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.840e+02 2.040e+02 2.267e+02 3.457e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 13:32:37,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:32:40,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:32:40,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:32:43,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:32:43,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 13:32:43,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:32:46,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:46,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 13:32:46,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:32:46,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:32:46,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:46,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 13:32:48,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=896953.3333333334, ans=0.1 2023-10-02 13:32:49,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:32:49,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:32:49,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:32:50,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:32:50,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:55,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:32:55,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 13:32:58,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:59,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:33:00,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. 
Duration: 0.57 2023-10-02 13:33:04,822 INFO [train.py:1046] (1/4) Epoch 26, batch 1750, loss[loss=0.1538, simple_loss=0.2284, pruned_loss=0.03956, over 24437.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2456, pruned_loss=0.04598, over 4703092.37 frames. ], batch size: 58, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:33:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:08,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:08,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:33:10,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 13:33:10,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:33:13,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:33:13,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:17,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 13:33:19,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:22,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. 
Duration: 0.64 2023-10-02 13:33:22,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:33:22,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:33:25,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:33:26,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 13:33:29,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:33:29,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 13:33:38,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:33:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:33:40,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:33:43,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:43,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:33:45,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 13:33:46,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:46,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=897153.3333333334, ans=0.1 2023-10-02 13:33:49,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:33:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:33:52,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 13:33:52,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:33:55,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 13:33:56,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:33:57,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:59,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:34:03,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:34:05,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 13:34:06,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:34:06,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:34:10,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:34:12,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:34:14,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=897286.6666666666, ans=0.125 2023-10-02 13:34:14,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=897286.6666666666, ans=0.125 2023-10-02 13:34:15,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:34:17,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 13:34:17,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:34:18,360 INFO [train.py:1046] (1/4) Epoch 26, batch 1800, loss[loss=0.1705, simple_loss=0.2434, pruned_loss=0.04882, over 23673.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2448, pruned_loss=0.04571, over 4712209.77 frames. ], batch size: 232, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:34:18,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 13:34:18,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:18,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:34:18,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:34:18,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:34:18,728 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:34:23,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:34:23,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:34:25,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:34:28,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:34:30,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:34:31,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:34:34,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:34:35,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:35,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:37,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:34:39,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:34:39,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 13:34:39,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:34:43,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:34:44,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 13:34:45,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=897420.0, ans=0.1 2023-10-02 13:34:48,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 13:34:48,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 13:34:48,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:34:50,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:50,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:34:51,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:34:57,178 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 13:34:57,982 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.51 vs. limit=10.0 2023-10-02 13:34:58,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:34:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:01,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 13:35:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 13:35:03,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:35:04,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:35:05,755 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.971e+02 2.205e+02 2.598e+02 3.859e+02, threshold=4.411e+02, percent-clipped=0.0 2023-10-02 13:35:05,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:35:10,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 13:35:15,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:35:16,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 13:35:16,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:35:16,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:35:18,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:35:18,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 13:35:20,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:35:20,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:35:21,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-02 13:35:21,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:35:25,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:35:25,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:35:25,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:27,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:27,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:35:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:35:29,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:35:33,123 INFO [train.py:1046] (1/4) Epoch 26, batch 1850, loss[loss=0.1647, simple_loss=0.2403, pruned_loss=0.04455, over 22848.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2448, pruned_loss=0.04559, over 4715989.24 frames. ], batch size: 50, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:35:33,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:35:34,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 13:35:40,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:35:40,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 13:35:43,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=897686.6666666666, ans=0.1 2023-10-02 13:35:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 13:35:47,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 13:35:50,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:35:50,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 13:35:50,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 13:35:54,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=897753.3333333334, ans=0.125 2023-10-02 13:36:01,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:36:02,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 13:36:05,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:36:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:36:10,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 13:36:10,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:11,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:36:12,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.36 vs. limit=22.5 2023-10-02 13:36:13,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:36:16,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=897886.6666666666, ans=0.0 2023-10-02 13:36:17,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:36:19,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:36:21,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:36:21,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:36:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:36:23,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:24,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:36:26,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:36:29,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 13:36:30,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:36:33,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:36:34,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:36:34,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 13:36:34,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 13:36:36,281 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 13:36:36,358 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-02 13:36:38,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:36:38,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:36:39,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:36:39,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:40,946 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 13:36:40,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:36:41,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:42,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:36:44,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:36:45,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:36:45,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 13:36:47,271 INFO [train.py:1046] (1/4) Epoch 26, batch 1900, loss[loss=0.1628, simple_loss=0.25, pruned_loss=0.03777, over 24485.00 frames. 
], tot_loss[loss=0.1681, simple_loss=0.2456, pruned_loss=0.04532, over 4720933.17 frames. ], batch size: 66, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:36:49,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:49,939 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 13:36:49,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:36:51,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:56,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:57,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=898020.0, ans=0.125 2023-10-02 13:36:58,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:36:58,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 13:37:00,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 13:37:00,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:37:00,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:37:00,414 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. 
Duration: 0.896 2023-10-02 13:37:01,774 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 13:37:02,158 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:37:04,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 13:37:05,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:37:07,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=898086.6666666666, ans=0.2 2023-10-02 13:37:10,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 13:37:12,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 13:37:23,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=898153.3333333334, ans=0.0 2023-10-02 13:37:24,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 13:37:24,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=898153.3333333334, ans=0.0 2023-10-02 13:37:25,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 13:37:25,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:37:27,137 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 13:37:27,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 13:37:28,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 13:37:28,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 13:37:28,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:37:33,868 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.10 vs. limit=15.0 2023-10-02 13:37:34,435 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.789e+02 2.022e+02 2.205e+02 2.844e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-02 13:37:34,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 13:37:37,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:37:40,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:37:40,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 13:37:42,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 13:37:47,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 13:37:47,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:37:52,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:37:52,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:37:52,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:37:54,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:37:54,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:37:55,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:37:55,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:37:55,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=898286.6666666666, ans=0.1 2023-10-02 13:37:59,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:37:59,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 13:38:01,078 INFO [train.py:1046] (1/4) Epoch 26, batch 1950, loss[loss=0.167, simple_loss=0.2484, pruned_loss=0.0428, over 24657.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2461, pruned_loss=0.04525, over 4734963.63 frames. ], batch size: 65, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:38:01,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:38:01,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:38:02,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:38:02,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:38:05,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:38:08,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:38:08,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:10,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:38:12,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 13:38:12,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-02 13:38:12,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:14,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:17,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:38:17,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:38:17,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:19,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:38:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:38:22,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:38:23,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:38:23,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:25,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=898420.0, ans=0.0 2023-10-02 13:38:26,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:38:30,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:38:30,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:38:30,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:38:30,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 13:38:31,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:38:31,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:38:32,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:35,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:39,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:38:42,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:38:44,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=898553.3333333334, ans=0.125 2023-10-02 13:38:45,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 13:38:45,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:38:46,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 13:38:47,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:38:50,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:38:51,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:38:51,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:39:00,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:00,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:03,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:04,152 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.04 vs. limit=15.0 2023-10-02 13:39:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:39:08,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:39:09,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:39:09,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 13:39:09,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:39:11,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:39:12,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 13:39:15,112 INFO [train.py:1046] (1/4) Epoch 26, batch 2000, loss[loss=0.1366, simple_loss=0.2156, pruned_loss=0.02876, over 18199.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2463, pruned_loss=0.04552, over 4726647.25 frames. ], batch size: 39, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:39:15,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:39:15,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=898686.6666666666, ans=0.125 2023-10-02 13:39:16,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:39:18,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 13:39:18,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:39:20,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:39:23,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:26,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 13:39:26,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=898686.6666666666, ans=0.0 2023-10-02 13:39:27,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:39:29,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:39:30,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 13:39:31,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:39:32,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:39:33,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:39:35,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. 
Duration: 0.93 2023-10-02 13:39:36,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 13:39:40,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:39:41,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 13:39:41,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:39:45,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:39:45,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:39:47,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:47,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:39:49,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 13:39:49,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 13:39:53,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 13:39:53,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:39:53,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:39:56,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=898820.0, ans=0.015 2023-10-02 13:39:58,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=898886.6666666666, ans=0.1 2023-10-02 13:39:59,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.37 vs. limit=15.0 2023-10-02 13:39:59,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:02,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:40:02,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:40:02,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 13:40:02,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=898886.6666666666, ans=0.125 2023-10-02 13:40:03,568 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.946e+02 2.166e+02 2.855e+02 3.639e+02, threshold=4.333e+02, percent-clipped=0.0 2023-10-02 13:40:03,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:40:03,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:40:05,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:06,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:07,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=898886.6666666666, ans=0.125 2023-10-02 13:40:08,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:40:09,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 13:40:12,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 13:40:14,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:16,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:16,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:40:20,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:21,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.55 vs. limit=22.5 2023-10-02 13:40:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:40:23,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:24,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:40:24,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:40:29,535 INFO [train.py:1046] (1/4) Epoch 26, batch 2050, loss[loss=0.1579, simple_loss=0.2414, pruned_loss=0.03723, over 24637.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2472, pruned_loss=0.04577, over 4729242.53 frames. ], batch size: 68, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:40:29,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:40:30,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:32,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:40:33,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:36,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:40:39,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:40:41,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:42,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:40:43,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 13:40:43,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:40:45,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:46,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:40:56,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 13:40:56,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:58,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 13:41:00,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:41:02,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 13:41:02,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:41:04,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:41:06,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:06,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:41:07,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:41:10,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:41:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:41:12,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 13:41:15,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:17,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:41:19,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:41:21,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:41:21,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=899220.0, ans=0.2 2023-10-02 13:41:25,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:41:28,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=899286.6666666666, ans=0.035 2023-10-02 13:41:30,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:41:31,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 13:41:36,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:41:37,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:41:38,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 13:41:40,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 13:41:43,097 INFO [train.py:1046] (1/4) Epoch 26, batch 2100, loss[loss=0.152, simple_loss=0.2388, pruned_loss=0.0326, over 24468.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2467, pruned_loss=0.04519, over 4743373.95 frames. ], batch size: 66, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:41:45,040 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 13:41:45,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:41:45,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:45,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=899353.3333333334, ans=0.125 2023-10-02 13:41:46,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:41:46,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:41:46,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 13:41:47,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 13:41:47,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 13:41:51,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:41:52,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:41:54,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:41:55,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:41:55,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 13:41:57,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:41:57,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 13:41:57,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 13:42:00,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:00,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:42:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 13:42:01,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 13:42:07,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 13:42:07,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:42:08,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:42:09,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:42:14,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:42:14,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 13:42:14,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:14,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:42:17,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 13:42:17,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:17,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 13:42:17,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. 
Duration: 0.96 2023-10-02 13:42:18,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 13:42:21,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:42:23,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:42:26,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:42:26,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:42:28,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:29,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:29,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 13:42:29,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:29,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:31,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:31,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. 
Duration: 0.76 2023-10-02 13:42:32,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 13:42:33,882 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.840e+02 2.077e+02 2.431e+02 3.502e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-02 13:42:33,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 13:42:36,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:42:40,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:42:40,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 13:42:44,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:47,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:42:48,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:42:48,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:42:48,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 13:42:48,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 13:42:49,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:49,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:42:51,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:42:51,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:52,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 13:42:54,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 13:42:54,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:42:57,322 INFO [train.py:1046] (1/4) Epoch 26, batch 2150, loss[loss=0.1697, simple_loss=0.2373, pruned_loss=0.05102, over 22877.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2449, pruned_loss=0.04522, over 4720589.25 frames. ], batch size: 322, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:42:58,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:58,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:42:58,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 13:42:58,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:43:03,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:43:05,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:07,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:09,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:43:09,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:11,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:43:12,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:13,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:43:13,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:43:18,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:18,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. 
Duration: 0.99 2023-10-02 13:43:23,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:25,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:43:25,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:26,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:26,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:26,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:43:26,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:28,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:43:28,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:43:30,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 13:43:31,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:43:32,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:43:33,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:35,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:43:35,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:43:38,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:38,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:43:40,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:40,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 13:43:41,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:43:44,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:45,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:46,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:49,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 13:43:49,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:49,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:49,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 13:43:50,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 13:43:51,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:43:51,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 13:43:53,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:53,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:43:55,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 13:43:55,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:43:55,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 13:43:55,218 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. 
Duration: 0.768 2023-10-02 13:43:55,218 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 13:43:56,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 13:43:56,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:59,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:59,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:43:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:01,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:44:01,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=899953.3333333334, ans=0.125 2023-10-02 13:44:02,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:44:02,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:04,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=899953.3333333334, ans=0.0 2023-10-02 13:44:11,260 INFO [train.py:1046] (1/4) Epoch 26, batch 2200, loss[loss=0.1814, simple_loss=0.2671, pruned_loss=0.04783, over 23980.00 frames. 
], tot_loss[loss=0.1675, simple_loss=0.2451, pruned_loss=0.04494, over 4723890.84 frames. ], batch size: 80, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:44:11,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:44:12,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 13:44:17,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:44:21,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:22,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:44:22,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:44:24,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:44:26,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:44:26,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:44:26,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 13:44:30,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-10-02 13:44:32,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:44:35,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=900086.6666666666, ans=0.5 2023-10-02 13:44:38,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 13:44:39,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=900153.3333333334, ans=0.0 2023-10-02 13:44:41,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:41,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:44:42,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:44:42,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=900153.3333333334, ans=0.1 2023-10-02 13:44:45,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:44:45,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 13:44:50,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:44:51,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:44:51,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 13:44:57,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:44:57,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:44:58,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:45:00,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:01,646 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.763e+02 1.852e+02 2.075e+02 2.576e+02, threshold=3.704e+02, percent-clipped=0.0 2023-10-02 13:45:01,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 13:45:03,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:05,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 13:45:08,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:08,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:45:08,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:45:11,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:45:11,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:45:11,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:11,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:12,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:45:12,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:45:15,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:45:16,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:45:18,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:45:19,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:45:19,872 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 13:45:22,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 13:45:22,599 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 13:45:22,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:45:24,544 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 13:45:25,726 INFO [train.py:1046] (1/4) Epoch 26, batch 2250, loss[loss=0.1718, simple_loss=0.2479, pruned_loss=0.04782, over 23268.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2456, pruned_loss=0.04504, over 4726990.21 frames. ], batch size: 119, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:45:25,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:45:26,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=900353.3333333334, ans=0.1 2023-10-02 13:45:27,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:45:27,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:45:28,617 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 13:45:29,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:45:31,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 13:45:36,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:45:36,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=900353.3333333334, ans=0.125 2023-10-02 13:45:37,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:45:40,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:45:41,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.60 vs. limit=15.0 2023-10-02 13:45:41,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:45:42,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:45:45,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.73 vs. limit=15.0 2023-10-02 13:45:46,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 13:45:46,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:46,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 13:45:48,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 13:45:49,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:45:49,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:45:50,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:45:55,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:45:55,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 13:45:56,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:45:58,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 13:45:59,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:46:01,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 13:46:04,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=900486.6666666666, ans=0.125 2023-10-02 13:46:04,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=900486.6666666666, ans=0.125 2023-10-02 13:46:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:46:05,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:46:09,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:09,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:46:10,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:46:12,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:46:12,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=900553.3333333334, ans=0.125 2023-10-02 13:46:13,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=900553.3333333334, ans=0.95 2023-10-02 13:46:16,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 13:46:19,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:46:22,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:46:23,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:46:23,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:46:24,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=900620.0, ans=0.125 2023-10-02 13:46:26,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.59 vs. limit=10.0 2023-10-02 13:46:28,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:46:31,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:46:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 13:46:31,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:32,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:46:35,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. 
Duration: 0.78 2023-10-02 13:46:38,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:46:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:39,674 INFO [train.py:1046] (1/4) Epoch 26, batch 2300, loss[loss=0.1861, simple_loss=0.2737, pruned_loss=0.04927, over 23942.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2465, pruned_loss=0.04493, over 4731187.48 frames. ], batch size: 80, lr: 3.94e-03, grad_scale: 8.0 2023-10-02 13:46:44,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:44,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:46:46,984 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 13:46:49,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:50,060 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.13 vs. limit=12.0 2023-10-02 13:46:57,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:46:57,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:46:57,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:46:57,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:57,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 13:46:59,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:47:01,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:47:01,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:47:05,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:47:05,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=900753.3333333334, ans=0.125 2023-10-02 13:47:06,153 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.85 vs. limit=10.0 2023-10-02 13:47:06,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=900753.3333333334, ans=0.0 2023-10-02 13:47:08,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:47:09,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:47:14,808 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=15.0 2023-10-02 13:47:15,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:47:15,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:47:18,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:47:21,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:47:24,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:47:24,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:47:26,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:47:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 13:47:30,974 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.002e+02 2.209e+02 2.529e+02 4.134e+02, threshold=4.417e+02, percent-clipped=1.0 2023-10-02 13:47:32,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 13:47:32,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:47:32,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:47:32,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:47:32,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:47:32,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=900886.6666666666, ans=0.0 2023-10-02 13:47:33,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 13:47:33,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:47:33,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 13:47:33,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:47:34,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=900886.6666666666, ans=0.07 2023-10-02 13:47:35,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:47:35,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-10-02 13:47:40,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:47:41,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=900953.3333333334, ans=0.125 2023-10-02 13:47:43,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=900953.3333333334, ans=0.2 2023-10-02 13:47:43,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.88 vs. limit=15.0 2023-10-02 13:47:45,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:47:48,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:47:48,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:47:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:47:49,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:47:49,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:47:51,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 13:47:51,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 13:47:54,267 INFO [train.py:1046] (1/4) Epoch 26, batch 2350, loss[loss=0.1797, simple_loss=0.2644, pruned_loss=0.0475, over 24450.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2476, pruned_loss=0.04553, over 4725591.01 frames. ], batch size: 69, lr: 3.94e-03, grad_scale: 8.0 2023-10-02 13:47:58,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:47:58,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 13:48:03,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 13:48:05,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:48:10,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:10,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:10,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:48:11,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:48:11,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-10-02 13:48:15,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:48:20,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 13:48:22,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:48:25,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:48:26,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:48:29,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:48:29,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 13:48:31,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:48:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:48:33,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:48:35,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:48:37,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 13:48:40,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 13:48:40,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:48:43,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:45,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:48:46,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 13:48:47,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:48:49,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 13:48:49,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:48:53,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 13:48:58,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 13:48:58,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:48:58,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 13:48:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 13:48:59,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 13:49:02,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 13:49:02,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=901286.6666666666, ans=0.07 2023-10-02 13:49:05,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:49:08,599 INFO [train.py:1046] (1/4) Epoch 26, batch 2400, loss[loss=0.143, simple_loss=0.2189, pruned_loss=0.03361, over 24437.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.246, pruned_loss=0.04478, over 4726493.60 frames. ], batch size: 58, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:49:10,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:49:14,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:49:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:49:15,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 13:49:17,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-10-02 13:49:22,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:49:22,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:49:24,609 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:49:25,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 13:49:25,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:49:26,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.70 vs. limit=22.5 2023-10-02 13:49:27,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:27,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 13:49:29,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.80 vs. limit=22.5 2023-10-02 13:49:33,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:34,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 13:49:40,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 13:49:43,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 13:49:46,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:49:49,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:52,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:49:52,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 13:49:52,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:49:58,274 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.173e+02 2.694e+02 4.951e+02, threshold=4.347e+02, percent-clipped=1.0 2023-10-02 13:50:00,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:02,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:50:03,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:05,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:50:05,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 13:50:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:50:05,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:05,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:50:05,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:50:11,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:50:11,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:50:11,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 13:50:13,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 13:50:15,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:50:15,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:16,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 13:50:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. 
Duration: 0.91 2023-10-02 13:50:16,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 13:50:16,596 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 13:50:17,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 13:50:19,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:50:20,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:20,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:50:21,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=901686.6666666666, ans=0.125 2023-10-02 13:50:22,014 INFO [train.py:1046] (1/4) Epoch 26, batch 2450, loss[loss=0.1639, simple_loss=0.2318, pruned_loss=0.04796, over 23648.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2443, pruned_loss=0.04455, over 4708679.39 frames. ], batch size: 232, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:50:22,088 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 13:50:23,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:23,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 13:50:28,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:50:28,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:50:32,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:32,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:50:32,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-10-02 13:50:33,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 13:50:38,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:50:38,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:38,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=901753.3333333334, ans=0.125 2023-10-02 13:50:42,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:50:42,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 13:50:42,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:50:42,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 13:50:48,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:50,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:50:50,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:50:54,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:50:54,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:50:56,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:50:56,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:58,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 13:50:59,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 13:51:01,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=901820.0, ans=0.125 2023-10-02 13:51:01,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=901820.0, ans=0.1 2023-10-02 13:51:05,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:07,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:51:07,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:07,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:51:09,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:10,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:51:12,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 13:51:13,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:51:13,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:51:16,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:51:16,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:21,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:51:21,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 13:51:22,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:51:22,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:51:23,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 13:51:23,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:51:25,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:51:29,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:51:32,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:32,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:51:34,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. 
Duration: 0.83 2023-10-02 13:51:34,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=902020.0, ans=0.125 2023-10-02 13:51:35,253 INFO [train.py:1046] (1/4) Epoch 26, batch 2500, loss[loss=0.152, simple_loss=0.237, pruned_loss=0.03349, over 24478.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2436, pruned_loss=0.04457, over 4723432.96 frames. ], batch size: 66, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:51:35,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:51:41,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:51:49,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:51:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:50,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:51:50,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 13:51:52,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=902086.6666666666, ans=0.025 2023-10-02 13:51:58,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:51:58,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:51:59,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:51:59,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 13:51:59,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 13:52:01,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:01,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=902086.6666666666, ans=0.0 2023-10-02 13:52:02,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:52:02,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 13:52:02,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 13:52:04,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:07,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=902153.3333333334, ans=0.0 2023-10-02 13:52:08,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 13:52:08,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:52:11,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:52:13,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 13:52:14,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:52:16,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:16,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=902153.3333333334, ans=0.0 2023-10-02 13:52:20,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:22,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=902220.0, ans=0.0 2023-10-02 13:52:23,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=902220.0, ans=0.125 2023-10-02 13:52:24,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 13:52:25,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.808e+02 1.922e+02 2.142e+02 2.952e+02, threshold=3.844e+02, percent-clipped=0.0 2023-10-02 13:52:26,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:52:31,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:52:35,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 13:52:35,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:52:35,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:52:36,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:52:36,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:52:38,288 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 13:52:38,289 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 13:52:38,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 13:52:39,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 13:52:42,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 13:52:42,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 13:52:43,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:52:44,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 13:52:47,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 13:52:50,317 INFO [train.py:1046] (1/4) Epoch 26, batch 2550, loss[loss=0.1671, simple_loss=0.2446, pruned_loss=0.04474, over 23616.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2449, pruned_loss=0.04504, over 4710219.85 frames. ], batch size: 149, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:52:50,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:52:53,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:52:53,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:52:55,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:52:58,105 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:52:59,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 13:52:59,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:53:01,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 13:53:03,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:53:06,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:08,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:53:08,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 13:53:08,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:53:09,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:53:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:53:12,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 13:53:12,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 13:53:13,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:53:13,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 13:53:19,961 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-10-02 13:53:21,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=902486.6666666666, ans=0.125 2023-10-02 13:53:25,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:53:30,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:53:30,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:30,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:53:32,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 13:53:35,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=902553.3333333334, ans=0.2 2023-10-02 13:53:40,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:53:40,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=902553.3333333334, ans=0.125 2023-10-02 13:53:41,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:53:41,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:53:41,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:53:41,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:53:43,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:53:45,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:53:47,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:52,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 13:53:53,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 13:53:53,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:53:53,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:53,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:53:55,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:53:56,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:03,849 INFO [train.py:1046] (1/4) Epoch 26, batch 2600, loss[loss=0.1639, simple_loss=0.2565, pruned_loss=0.03567, over 24667.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2454, pruned_loss=0.04479, over 4726552.27 frames. ], batch size: 73, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:54:03,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:54:07,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:09,027 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 13:54:11,745 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. 
Duration: 0.896 2023-10-02 13:54:11,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:54:11,803 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 13:54:13,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 13:54:13,182 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 13:54:15,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:54:15,904 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 13:54:17,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 13:54:19,118 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 13:54:20,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:54:21,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 13:54:23,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 13:54:24,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=902753.3333333334, ans=0.2 2023-10-02 13:54:25,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 13:54:25,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 13:54:26,624 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 13:54:26,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 13:54:28,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=902753.3333333334, ans=0.0 2023-10-02 13:54:34,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:54:35,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:36,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:54:36,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 13:54:40,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:54:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 13:54:48,161 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.16 vs. limit=8.0 2023-10-02 13:54:50,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 13:54:50,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:54:51,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 13:54:53,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:54:53,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:54:53,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 13:54:54,629 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.918e+02 2.069e+02 2.446e+02 3.571e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-02 13:54:54,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:54:55,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=902886.6666666666, ans=0.0 2023-10-02 13:54:55,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.48 vs. limit=15.0 2023-10-02 13:54:56,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:54:56,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=902886.6666666666, ans=0.125 2023-10-02 13:54:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:54:59,986 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=22.5 2023-10-02 13:55:02,035 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 13:55:02,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:02,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:55:02,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=902953.3333333334, ans=0.1 2023-10-02 13:55:04,059 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.92 vs. limit=22.5 2023-10-02 13:55:08,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:55:09,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:55:09,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 13:55:10,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 13:55:12,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:55:14,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:55:18,196 INFO [train.py:1046] (1/4) Epoch 26, batch 2650, loss[loss=0.1792, simple_loss=0.2477, pruned_loss=0.0554, over 23540.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2457, pruned_loss=0.04496, over 4726420.41 frames. ], batch size: 256, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:55:19,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 13:55:19,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:22,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:55:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 13:55:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:26,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:55:28,841 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 13:55:28,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 13:55:31,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:33,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:55:33,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=903086.6666666666, ans=0.125 2023-10-02 13:55:34,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:55:37,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:37,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 13:55:37,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:55:37,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:55:42,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 13:55:42,824 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 13:55:44,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:55:46,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. 
Duration: 0.87 2023-10-02 13:55:46,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:55:46,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 13:55:51,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:55:51,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:55:51,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:55:52,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:55:56,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 13:55:56,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 13:55:59,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:56:03,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 13:56:03,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:56:04,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:56:04,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:56:04,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=903220.0, ans=0.2 2023-10-02 13:56:05,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:56:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:56:08,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:56:09,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:56:11,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:56:12,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:56:13,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:56:15,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:16,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:56:16,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:56:19,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:56:19,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:56:22,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:23,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:56:23,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:23,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 13:56:28,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:56:29,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:29,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:31,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:31,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=903353.3333333334, ans=0.0 2023-10-02 13:56:33,058 INFO [train.py:1046] (1/4) Epoch 26, batch 2700, loss[loss=0.17, simple_loss=0.2361, pruned_loss=0.0519, over 23695.00 frames. 
], tot_loss[loss=0.1693, simple_loss=0.247, pruned_loss=0.04581, over 4721306.11 frames. ], batch size: 232, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:56:33,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:56:33,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:35,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:56:35,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 13:56:39,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:56:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 13:56:42,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:56:43,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:43,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:43,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:56:43,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 13:56:43,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:56:45,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:56:45,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 13:56:45,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:56:48,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:56:49,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:56:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:52,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=903420.0, ans=0.95 2023-10-02 13:56:53,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:56:53,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 13:56:53,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:56:57,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 13:56:57,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:02,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=903486.6666666666, ans=0.2 2023-10-02 13:57:03,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:57:03,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:57:03,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:57:06,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:57:07,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=903486.6666666666, ans=0.0 2023-10-02 13:57:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:11,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:57:11,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:57:11,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:57:16,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 13:57:16,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:57:19,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-10-02 13:57:22,898 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.793e+02 2.064e+02 2.296e+02 3.697e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-02 13:57:23,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:57:24,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:57:27,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:57:27,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:31,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:33,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:34,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:57:36,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:57:36,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:38,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:57:41,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:57:43,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:43,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:44,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 13:57:46,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:47,401 INFO [train.py:1046] (1/4) Epoch 26, batch 2750, loss[loss=0.1659, simple_loss=0.2529, pruned_loss=0.03943, over 24022.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2472, pruned_loss=0.04607, over 4717709.53 frames. ], batch size: 80, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:57:48,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:57:48,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 13:57:50,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. 
Duration: 0.94 2023-10-02 13:57:50,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:52,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:57:54,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:55,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:55,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:57:55,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:57,917 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.02 vs. limit=6.0 2023-10-02 13:57:59,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:57:59,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:57:59,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:57:59,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 13:57:59,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 13:57:59,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:57:59,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:58:05,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 13:58:07,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:58:08,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:09,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:58:09,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:58:10,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:58:11,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:58:11,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:12,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 13:58:15,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:58:15,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:58:16,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:58:18,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:18,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=21.50 vs. limit=15.0 2023-10-02 13:58:19,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:58:21,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.63 vs. limit=22.5 2023-10-02 13:58:22,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=903820.0, ans=0.035 2023-10-02 13:58:25,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:27,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:58:27,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 13:58:33,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:33,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:58:33,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:58:38,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:58:39,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:58:39,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 13:58:43,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:58:44,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 13:58:48,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:58:49,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:58:51,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 13:58:51,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 13:58:52,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:58:52,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=903953.3333333334, ans=0.1 2023-10-02 13:58:54,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 13:58:54,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:58:57,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 13:58:57,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:58:58,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:00,224 INFO [train.py:1046] (1/4) Epoch 26, batch 2800, loss[loss=0.1791, simple_loss=0.2549, pruned_loss=0.05168, over 23291.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2457, pruned_loss=0.04579, over 4720823.47 frames. ], batch size: 105, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 13:59:00,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 13:59:00,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:00,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 13:59:00,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=904020.0, ans=0.0 2023-10-02 13:59:04,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:04,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 13:59:04,819 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 13:59:07,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=904020.0, ans=0.2 2023-10-02 13:59:08,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:09,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:59:09,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:59:13,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:59:16,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 13:59:17,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 13:59:19,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-10-02 13:59:20,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:20,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:59:20,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:59:24,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:59:24,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:59:26,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:59:33,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:59:35,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:36,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=904153.3333333334, ans=0.0 2023-10-02 13:59:39,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:39,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 13:59:41,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:59:44,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:59:44,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=904220.0, ans=0.125 2023-10-02 13:59:45,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 13:59:45,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:59:46,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:59:50,941 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.855e+02 1.985e+02 2.213e+02 3.780e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-02 13:59:51,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:51,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:51,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.29 vs. 
limit=15.0 2023-10-02 13:59:55,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:59:56,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:59:56,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:56,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:59:56,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:59:58,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:59:59,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:59,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 13:59:59,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:00,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:00:00,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:02,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. 
Duration: 0.86 2023-10-02 14:00:04,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:04,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:00:05,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:00:07,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 14:00:10,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:00:11,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=904286.6666666666, ans=0.05 2023-10-02 14:00:12,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:00:12,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:00:13,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:00:14,376 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.33 vs. limit=15.0 2023-10-02 14:00:14,812 INFO [train.py:1046] (1/4) Epoch 26, batch 2850, loss[loss=0.1713, simple_loss=0.2431, pruned_loss=0.04972, over 23491.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2456, pruned_loss=0.04542, over 4724633.82 frames. 
], batch size: 285, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 14:00:16,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:00:16,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:00:16,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:00:20,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:20,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:00:23,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:00:23,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 14:00:30,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=904420.0, ans=0.125 2023-10-02 14:00:31,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 14:00:31,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:00:33,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 14:00:33,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:00:36,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 14:00:36,989 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:00:37,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 14:00:38,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=904420.0, ans=0.0 2023-10-02 14:00:39,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:40,362 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.04 vs. limit=6.0 2023-10-02 14:00:50,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:50,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=904486.6666666666, ans=0.125 2023-10-02 14:00:53,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:00:53,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:00:53,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:00:53,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 14:00:53,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:00:54,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:00:54,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 14:00:57,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:00:57,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:00:58,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:01:00,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:01,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:01,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:03,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:03,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=904553.3333333334, ans=0.07 2023-10-02 14:01:06,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 14:01:08,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:01:08,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:08,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=904553.3333333334, ans=0.07 2023-10-02 14:01:09,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:11,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:01:15,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:01:17,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 14:01:18,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 14:01:18,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:01:19,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.18 vs. limit=15.0 2023-10-02 14:01:19,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:21,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. 
Duration: 0.56 2023-10-02 14:01:21,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:01:22,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:22,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:01:22,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:01:22,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 14:01:22,697 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 14:01:22,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:01:22,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=904620.0, ans=0.5 2023-10-02 14:01:24,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:28,029 INFO [train.py:1046] (1/4) Epoch 26, batch 2900, loss[loss=0.2038, simple_loss=0.2697, pruned_loss=0.06895, over 23705.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2458, pruned_loss=0.04517, over 4741512.85 frames. ], batch size: 164, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 14:01:28,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 14:01:28,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:01:29,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:01:29,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 14:01:32,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:32,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 14:01:34,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 14:01:37,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:01:37,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:01:40,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:44,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:01:46,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:01:46,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:01:50,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:01:51,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 14:01:52,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:01:53,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:55,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 14:01:55,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 14:01:57,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:57,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 14:01:57,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:02:00,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:02:00,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 14:02:03,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:02:03,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:02:06,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:02:09,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:09,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 14:02:10,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=15.0 2023-10-02 14:02:11,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 14:02:11,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:02:15,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:02:16,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=904886.6666666666, ans=0.125 2023-10-02 14:02:18,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. 
Duration: 0.79 2023-10-02 14:02:19,823 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.775e+02 2.023e+02 2.342e+02 3.264e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-02 14:02:19,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:02:22,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:02:33,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:02:33,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:02:34,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 14:02:37,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:37,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 14:02:37,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:02:38,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:02:41,849 INFO [train.py:1046] (1/4) Epoch 26, batch 2950, loss[loss=0.1645, simple_loss=0.2409, pruned_loss=0.04404, over 24315.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2467, pruned_loss=0.04532, over 4738353.21 frames. 
], batch size: 61, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 14:02:43,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:02:45,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 14:02:47,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:02:47,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:47,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:02:48,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:02:49,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 14:02:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 14:02:51,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:02:51,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:02:54,680 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.87 vs. 
limit=22.5 2023-10-02 14:02:56,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:02:58,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:03:00,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:00,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:03:04,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:03:04,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:03:07,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:03:07,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:03:07,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:03:11,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 14:03:15,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 14:03:15,715 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. 
Duration: 0.512 2023-10-02 14:03:17,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:03:18,448 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 14:03:19,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 14:03:19,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:03:19,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:03:19,915 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 14:03:19,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:03:22,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 14:03:23,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:03:24,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:03:26,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:03:28,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 14:03:28,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:28,176 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 14:03:28,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:03:29,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 14:03:34,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=905220.0, ans=0.125 2023-10-02 14:03:35,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:36,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:03:38,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 14:03:38,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:03:39,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 14:03:40,412 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.93 vs. limit=15.0 2023-10-02 14:03:41,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:03:41,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=905286.6666666666, ans=0.125 2023-10-02 14:03:43,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:03:43,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:03:45,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:45,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:03:48,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:03:48,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:48,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:03:49,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:03:49,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:03:49,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=905286.6666666666, ans=0.2 2023-10-02 14:03:50,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:03:53,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:53,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 14:03:55,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:56,578 INFO [train.py:1046] (1/4) Epoch 26, batch 3000, loss[loss=0.164, simple_loss=0.2514, pruned_loss=0.03831, over 24632.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2478, pruned_loss=0.0461, over 4736298.87 frames. ], batch size: 68, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:03:56,578 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 14:04:08,892 INFO [train.py:1078] (1/4) Epoch 26, validation: loss=0.3521, simple_loss=0.2784, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-02 14:04:08,893 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 14:04:10,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:04:11,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:04:17,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. 
Duration: 0.896 2023-10-02 14:04:17,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 14:04:19,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:04:19,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:04:20,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 14:04:20,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:04:24,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=905420.0, ans=0.07 2023-10-02 14:04:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:04:33,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:04:39,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 14:04:40,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:04:44,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:04:46,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:04:46,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:04:48,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:04:48,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 14:04:50,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 14:04:52,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:04:52,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:04:54,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:04:55,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:04:56,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:04:56,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:04:57,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:04:59,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:04:59,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:05:00,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:05:03,752 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.996e+02 2.253e+02 2.544e+02 4.342e+02, threshold=4.506e+02, percent-clipped=3.0 2023-10-02 14:05:03,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 14:05:03,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:05:04,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=905553.3333333334, ans=0.125 2023-10-02 14:05:05,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:05,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:05:09,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:09,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:10,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 14:05:10,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-10-02 14:05:10,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:05:10,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 14:05:12,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:05:12,711 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=12.0 2023-10-02 14:05:15,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 14:05:19,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:05:20,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:05:20,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 14:05:21,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 14:05:21,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:05:23,233 INFO [train.py:1046] (1/4) Epoch 26, batch 3050, loss[loss=0.161, simple_loss=0.2372, pruned_loss=0.04245, over 23374.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2479, pruned_loss=0.04604, over 4735282.55 frames. 
], batch size: 119, lr: 3.93e-03, grad_scale: 4.0 2023-10-02 14:05:23,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:05:24,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:24,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:05:24,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:26,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:05:27,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 14:05:30,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:05:31,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:32,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:05:34,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=905686.6666666666, ans=0.1 2023-10-02 14:05:35,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:05:38,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 14:05:43,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 14:05:44,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 14:05:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:05:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:05:54,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:54,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:54,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:05:57,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:05:57,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:05:58,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:05:58,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 14:05:58,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:06:00,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:06:01,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:01,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=905820.0, ans=0.125 2023-10-02 14:06:03,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:06:03,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 14:06:04,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:06:04,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:06:05,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=905886.6666666666, ans=0.125 2023-10-02 14:06:07,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:06:07,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:06:07,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:06:07,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=905886.6666666666, ans=0.125 2023-10-02 14:06:08,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:08,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=905886.6666666666, ans=0.0 2023-10-02 14:06:13,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:06:14,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:19,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:19,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:06:19,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:06:21,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:06:22,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:06:22,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:06:24,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. 
Duration: 0.67 2023-10-02 14:06:24,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:06:24,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 14:06:26,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:32,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:33,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:06:36,517 INFO [train.py:1046] (1/4) Epoch 26, batch 3100, loss[loss=0.1429, simple_loss=0.2204, pruned_loss=0.03267, over 19766.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2475, pruned_loss=0.04594, over 4727557.17 frames. ], batch size: 43, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:06:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:06:36,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=906020.0, ans=0.2 2023-10-02 14:06:38,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 14:06:39,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-02 14:06:41,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 14:06:42,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:06:47,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:06:47,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:50,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:06:53,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:57,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=906086.6666666666, ans=0.125 2023-10-02 14:06:59,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 14:07:04,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:07:04,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:04,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:07:04,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=906153.3333333334, ans=0.1 2023-10-02 14:07:06,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:07:07,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 14:07:09,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:07:09,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 14:07:09,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:07:10,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:07:12,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 14:07:13,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:07:16,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:07:16,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 14:07:16,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-10-02 14:07:18,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:19,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:07:21,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:07:21,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:23,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:07:23,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:07:23,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:07:24,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:07:24,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:07:26,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:26,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 14:07:29,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=906220.0, ans=0.125 2023-10-02 14:07:30,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:07:31,914 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.814e+02 2.073e+02 2.408e+02 3.405e+02, threshold=4.147e+02, percent-clipped=0.0 2023-10-02 14:07:32,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 14:07:34,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:07:36,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 14:07:36,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:07:37,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:37,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 14:07:48,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 14:07:48,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=906286.6666666666, ans=0.0 2023-10-02 14:07:51,598 INFO [train.py:1046] (1/4) Epoch 26, batch 3150, loss[loss=0.1839, simple_loss=0.2657, pruned_loss=0.05107, over 23699.00 frames. 
], tot_loss[loss=0.169, simple_loss=0.2465, pruned_loss=0.04569, over 4723267.06 frames. ], batch size: 85, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:07:51,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:07:51,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:54,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:07:54,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:07:55,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 14:07:56,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:07:57,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:07:59,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 14:08:00,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:02,149 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. 
Duration: 0.831 2023-10-02 14:08:03,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=906353.3333333334, ans=0.0 2023-10-02 14:08:04,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 14:08:05,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:08:05,072 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 14:08:06,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 14:08:07,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 14:08:09,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 14:08:09,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 14:08:09,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:09,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:08:10,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:12,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. 
Duration: 0.87 2023-10-02 14:08:12,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=906420.0, ans=0.0 2023-10-02 14:08:13,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:08:13,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:08:14,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:08:17,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:08:21,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 14:08:22,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:08:25,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:08:26,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:08:26,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 14:08:28,632 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.74 vs. limit=22.5 2023-10-02 14:08:29,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. 
Duration: 0.59 2023-10-02 14:08:30,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:08:30,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:08:32,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:08:32,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:08:32,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:08:33,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:08:33,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:08:36,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 14:08:36,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:08:36,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:37,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:08:37,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:08:38,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 14:08:38,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:08:42,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 14:08:42,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:43,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 14:08:43,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 14:08:45,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:08:46,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:08:46,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 14:08:46,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 14:08:48,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:08:51,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 14:08:52,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:52,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:08:57,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:08:58,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 14:09:08,283 INFO [train.py:1046] (1/4) Epoch 26, batch 3200, loss[loss=0.1528, simple_loss=0.2368, pruned_loss=0.0344, over 24490.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2459, pruned_loss=0.04535, over 4731391.58 frames. ], batch size: 63, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:09:08,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:09:08,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 14:09:11,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:11,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:09:11,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. 
Duration: 0.85 2023-10-02 14:09:14,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:09:19,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:09:21,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:25,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=906753.3333333334, ans=0.125 2023-10-02 14:09:28,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=906753.3333333334, ans=0.125 2023-10-02 14:09:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:09:34,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=906753.3333333334, ans=0.0 2023-10-02 14:09:36,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 14:09:38,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:09:41,126 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=12.0 2023-10-02 14:09:41,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. 
Duration: 0.59 2023-10-02 14:09:41,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:09:44,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:09:44,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:09:46,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:09:49,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 14:09:50,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 14:09:51,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 14:09:52,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=906886.6666666666, ans=0.1 2023-10-02 14:09:54,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 14:09:56,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:10:01,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:01,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 14:10:01,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:02,458 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 14:10:02,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:10:03,776 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.962e+02 2.253e+02 2.668e+02 3.638e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-02 14:10:06,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:09,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 14:10:09,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 14:10:11,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 14:10:11,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=906953.3333333334, ans=0.1 2023-10-02 14:10:13,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 14:10:14,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:10:17,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 14:10:17,195 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 14:10:17,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:10:17,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:19,914 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 14:10:22,562 INFO [train.py:1046] (1/4) Epoch 26, batch 3250, loss[loss=0.1756, simple_loss=0.2614, pruned_loss=0.04486, over 24552.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.246, pruned_loss=0.04563, over 4733775.03 frames. ], batch size: 71, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:10:24,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:10:24,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=907020.0, ans=0.035 2023-10-02 14:10:26,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:10:31,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.89 vs. limit=15.0 2023-10-02 14:10:33,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:10:33,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. 
Duration: 0.99 2023-10-02 14:10:35,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:35,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:35,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:10:36,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:10:38,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:10:41,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:41,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:10:42,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:42,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:42,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:42,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:10:46,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:10:46,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:10:48,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.20 vs. limit=15.0 2023-10-02 14:10:48,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:48,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:50,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:51,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:10:51,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:10:54,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 14:10:55,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:10:55,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:10:57,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:10:58,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:11:05,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:11:11,190 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.32 vs. limit=12.0 2023-10-02 14:11:12,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:11:12,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=907220.0, ans=0.125 2023-10-02 14:11:13,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:13,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 14:11:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:11:13,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:11:15,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:15,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=907220.0, ans=0.125 2023-10-02 14:11:16,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. 
Duration: 0.9 2023-10-02 14:11:18,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 14:11:18,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:11:18,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=907220.0, ans=0.0 2023-10-02 14:11:19,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:11:21,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:11:21,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:11:21,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=907286.6666666666, ans=0.2 2023-10-02 14:11:22,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:11:25,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:11:25,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:11:28,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 14:11:28,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 14:11:31,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:11:31,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 14:11:34,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:11:34,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 14:11:36,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 14:11:36,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=907353.3333333334, ans=0.0 2023-10-02 14:11:37,356 INFO [train.py:1046] (1/4) Epoch 26, batch 3300, loss[loss=0.1559, simple_loss=0.2262, pruned_loss=0.04278, over 18596.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2463, pruned_loss=0.04547, over 4724933.66 frames. ], batch size: 40, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:11:37,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 14:11:37,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:11:37,686 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:11:40,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:11:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:11:41,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=907353.3333333334, ans=0.125 2023-10-02 14:11:43,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:44,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 14:11:44,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:11:47,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:48,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:11:54,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 14:11:54,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:11:54,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:56,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:57,584 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. 
Duration: 0.896 2023-10-02 14:11:57,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:11:58,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.18 vs. limit=10.0 2023-10-02 14:11:58,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:12:00,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:12:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:00,433 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 14:12:03,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=907420.0, ans=0.04949747468305833 2023-10-02 14:12:05,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:12:05,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:12:07,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:07,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 14:12:08,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. 
Duration: 0.336375 2023-10-02 14:12:08,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:09,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:12:12,580 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 14:12:14,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 14:12:14,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:12:16,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 14:12:16,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:12:20,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:12:20,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:12:23,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:12:23,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:12:23,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:12:23,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:12:24,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:12:24,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:25,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=907553.3333333334, ans=0.0 2023-10-02 14:12:26,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:12:27,637 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 14:12:28,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 14:12:32,059 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.777e+02 1.941e+02 2.097e+02 3.420e+02, threshold=3.882e+02, percent-clipped=0.0 2023-10-02 14:12:32,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:12:34,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:12:34,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:35,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 14:12:35,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:36,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:12:38,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:38,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:12:39,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:39,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:12:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 14:12:42,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:43,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:45,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:12:46,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:12:47,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:12:50,870 INFO [train.py:1046] (1/4) Epoch 26, batch 3350, loss[loss=0.1644, simple_loss=0.2553, pruned_loss=0.03669, over 24431.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2471, pruned_loss=0.04617, over 4724366.69 frames. ], batch size: 69, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:12:50,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:50,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:53,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:12:55,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:55,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=907686.6666666666, ans=0.125 2023-10-02 14:12:57,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:12:58,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:01,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:13:02,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:13:03,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 14:13:04,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 14:13:04,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=907753.3333333334, ans=0.1 2023-10-02 14:13:04,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=907753.3333333334, ans=0.5 2023-10-02 14:13:04,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=907753.3333333334, ans=0.1 2023-10-02 14:13:07,836 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 14:13:07,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:13:10,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 14:13:10,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 14:13:10,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:13:11,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:13:13,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:13,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. 
Duration: 0.81 2023-10-02 14:13:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:14,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:13:17,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:18,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:20,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:20,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:13:24,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:26,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:28,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:32,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:13:32,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:13:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:35,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:38,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:40,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 14:13:40,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:13:40,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 14:13:40,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:13:42,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 14:13:43,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:44,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:51,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:53,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. 
Duration: 0.66 2023-10-02 14:13:53,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:13:55,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:13:55,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:13:56,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=907953.3333333334, ans=0.0 2023-10-02 14:13:56,146 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:14:02,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:14:03,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 14:14:05,073 INFO [train.py:1046] (1/4) Epoch 26, batch 3400, loss[loss=0.1646, simple_loss=0.2471, pruned_loss=0.04103, over 23426.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2469, pruned_loss=0.04579, over 4730749.02 frames. ], batch size: 93, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:14:05,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:14:05,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:14:06,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 14:14:07,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 14:14:09,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:14:09,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 14:14:09,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.93 vs. limit=15.0 2023-10-02 14:14:10,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:14:10,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:14:10,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:14:12,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:14:12,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 14:14:12,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=908020.0, ans=0.5 2023-10-02 14:14:14,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=908020.0, ans=0.1 2023-10-02 14:14:16,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. 
Duration: 0.6 2023-10-02 14:14:16,773 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 14:14:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:20,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:14:20,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:14:20,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:22,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:14:23,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=908086.6666666666, ans=0.125 2023-10-02 14:14:23,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=908086.6666666666, ans=0.2 2023-10-02 14:14:24,089 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:14:26,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:14:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 14:14:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 14:14:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:38,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:38,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 14:14:44,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:14:47,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 14:14:51,097 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.43 vs. limit=6.0 2023-10-02 14:14:51,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:51,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:52,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 14:14:52,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:14:54,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:54,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:14:55,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:14:57,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:58,790 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.837e+02 2.033e+02 2.297e+02 3.671e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 14:15:00,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:15:00,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:15:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:15:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 14:15:13,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:15:17,848 INFO [train.py:1046] (1/4) Epoch 26, batch 3450, loss[loss=0.1511, simple_loss=0.2134, pruned_loss=0.04434, over 23438.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2471, pruned_loss=0.04597, over 4728998.32 frames. ], batch size: 285, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:15:17,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. 
Duration: 0.68 2023-10-02 14:15:18,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=908353.3333333334, ans=0.125 2023-10-02 14:15:20,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 14:15:20,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:15:22,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:15:22,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 14:15:23,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:15:28,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:15:34,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:15:35,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:15:36,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:15:36,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:15:37,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 14:15:37,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=908420.0, ans=0.2 2023-10-02 14:15:38,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=908420.0, ans=0.0 2023-10-02 14:15:43,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=908420.0, ans=0.2 2023-10-02 14:15:44,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 14:15:45,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=908420.0, ans=0.125 2023-10-02 14:15:49,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 14:15:50,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:15:50,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:15:51,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:15:55,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 14:15:56,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:15:59,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:16:00,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.62 vs. limit=15.0 2023-10-02 14:16:01,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:16:02,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:16:03,436 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.92 vs. limit=10.0 2023-10-02 14:16:03,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:16:05,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 14:16:05,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:16:06,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:16:09,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:16:10,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.73 vs. limit=15.0 2023-10-02 14:16:11,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-10-02 14:16:11,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=908553.3333333334, ans=0.125 2023-10-02 14:16:14,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:16:16,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.88 vs. limit=15.0 2023-10-02 14:16:19,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:16:21,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:24,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:27,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:27,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:16:29,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:16:29,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:16:32,138 INFO [train.py:1046] (1/4) Epoch 26, batch 3500, loss[loss=0.1582, simple_loss=0.2118, pruned_loss=0.05237, over 22818.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2451, pruned_loss=0.04577, over 4715438.36 frames. 
], batch size: 322, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:16:33,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:38,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:16:39,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 14:16:40,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:16:43,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:16:45,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:46,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 14:16:47,367 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=15.0 2023-10-02 14:16:47,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=908753.3333333334, ans=0.1 2023-10-02 14:16:51,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:16:52,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:16:52,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 14:16:52,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:16:53,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:16:54,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:55,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:16:55,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 14:16:58,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:58,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:16:58,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=908753.3333333334, ans=0.125 2023-10-02 14:17:00,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:17:04,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:05,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 14:17:06,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 14:17:09,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:17:12,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:17:12,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:12,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=908820.0, ans=0.0 2023-10-02 14:17:12,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=908820.0, ans=0.1 2023-10-02 14:17:13,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:17:13,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:17:15,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 14:17:16,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 14:17:16,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 14:17:16,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:17:19,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:17:19,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:17:19,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:17:22,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:17:23,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:17:28,015 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.927e+02 2.309e+02 3.023e+02 4.699e+02, threshold=4.619e+02, percent-clipped=3.0 2023-10-02 14:17:28,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:17:29,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 14:17:29,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 14:17:29,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:17:31,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=908953.3333333334, ans=0.125 2023-10-02 14:17:32,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:17:34,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 14:17:34,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:35,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=908953.3333333334, ans=0.125 2023-10-02 14:17:37,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 14:17:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:17:40,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:17:41,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 14:17:43,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 14:17:45,641 INFO [train.py:1046] (1/4) Epoch 26, batch 3550, loss[loss=0.1771, simple_loss=0.2529, pruned_loss=0.05069, over 23249.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2432, pruned_loss=0.04531, over 4696265.40 frames. ], batch size: 105, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:17:45,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:47,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:17:47,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:17:47,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:17:48,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=909020.0, ans=0.125 2023-10-02 14:17:50,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:17:57,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:17:59,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 14:18:03,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:18:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:18:05,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:05,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:18:06,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:18:09,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:18:09,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 14:18:09,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:18:09,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:18:10,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:18:16,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:18:16,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:18:17,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:18:17,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:18:17,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:18:18,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 14:18:19,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:20,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:20,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-02 14:18:25,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=909153.3333333334, ans=0.1 2023-10-02 14:18:26,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:18:26,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:18:27,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:18:31,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 14:18:32,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:18:32,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 14:18:32,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:18:36,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:18:36,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:18:40,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 14:18:42,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:18:47,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=909286.6666666666, ans=0.0 2023-10-02 14:18:48,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:18:48,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 14:18:48,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:18:52,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:54,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 14:18:58,932 INFO [train.py:1046] (1/4) Epoch 26, batch 3600, loss[loss=0.1465, simple_loss=0.2236, pruned_loss=0.03474, over 24325.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.243, pruned_loss=0.04519, over 4695627.23 frames. ], batch size: 56, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:19:00,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 14:19:00,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:19:00,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:19:03,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:19:05,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:19:07,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:19:10,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:19:11,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:13,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:19:13,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:19:14,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:14,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 14:19:18,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:19:18,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:21,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:19:24,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:19:25,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:19:25,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:19:25,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 14:19:27,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:19:28,283 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.54 vs. limit=15.0 2023-10-02 14:19:29,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=909486.6666666666, ans=0.125 2023-10-02 14:19:30,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:30,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:19:31,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.09 vs. limit=15.0 2023-10-02 14:19:31,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:34,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:19:37,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:19:37,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 14:19:41,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=909486.6666666666, ans=0.2 2023-10-02 14:19:42,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:19:44,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:19:44,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 14:19:48,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:19:48,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=909553.3333333334, ans=0.04949747468305833 2023-10-02 14:19:52,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:55,939 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.927e+02 2.147e+02 2.530e+02 3.358e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-02 14:19:56,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:20:00,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:20:00,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:20:00,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 14:20:02,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 14:20:03,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 14:20:07,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:20:07,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:20:07,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 14:20:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:20:08,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:20:08,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:20:09,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. 
Duration: 0.87 2023-10-02 14:20:10,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 14:20:12,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:20:14,028 INFO [train.py:1046] (1/4) Epoch 26, batch 3650, loss[loss=0.1938, simple_loss=0.2655, pruned_loss=0.06107, over 23737.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.244, pruned_loss=0.0453, over 4704517.39 frames. ], batch size: 179, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:20:14,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 14:20:18,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 14:20:18,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=909686.6666666666, ans=0.2 2023-10-02 14:20:18,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.40 vs. limit=15.0 2023-10-02 14:20:19,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:20:22,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 14:20:23,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 14:20:29,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:20:29,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:20:31,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:20:31,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=909753.3333333334, ans=0.2 2023-10-02 14:20:35,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:20:35,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:20:35,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 14:20:35,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:20:35,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:20:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 14:20:38,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:20:38,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:20:38,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 14:20:41,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:20:42,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 14:20:44,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 14:20:44,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:20:45,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 14:20:48,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:20:48,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:20:53,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:20:55,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:20:55,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:20:56,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:20:58,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 14:21:00,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:21:00,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=909886.6666666666, ans=0.125 2023-10-02 14:21:00,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=909886.6666666666, ans=0.2 2023-10-02 14:21:02,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:21:04,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:04,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:21:04,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:21:06,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:21:06,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.17 vs. limit=15.0 2023-10-02 14:21:07,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:10,114 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.84 vs. 
limit=22.5 2023-10-02 14:21:13,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 14:21:16,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:21:16,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:17,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:21:18,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:18,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:21:20,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 14:21:21,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:24,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:21:27,019 INFO [train.py:1046] (1/4) Epoch 26, batch 3700, loss[loss=0.1754, simple_loss=0.2433, pruned_loss=0.05376, over 23868.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2456, pruned_loss=0.04628, over 4694477.67 frames. 
], batch size: 212, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:21:27,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:21:28,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:21:31,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:31,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 14:21:31,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:31,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:21:33,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:21:37,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:21:41,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:21:41,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:21:41,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 14:21:41,839 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.12 vs. limit=15.0 2023-10-02 14:21:42,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:42,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:21:45,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:21:46,641 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 14:21:46,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=910086.6666666666, ans=0.125 2023-10-02 14:21:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:21:52,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:21:53,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:21:53,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 14:21:55,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 14:21:55,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=910153.3333333334, ans=0.125 2023-10-02 14:21:59,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:59,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 14:22:00,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:02,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:22:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:05,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:22:08,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:22:12,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:22:12,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 14:22:13,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:22:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. 
Duration: 0.98 2023-10-02 14:22:19,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:22:19,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:22:21,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:22:22,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 14:22:23,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=910220.0, ans=0.1 2023-10-02 14:22:24,735 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.885e+02 2.100e+02 2.312e+02 3.361e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-02 14:22:24,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:22:24,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:22:24,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:22:24,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:22:27,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:22:28,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. 
Duration: 0.93 2023-10-02 14:22:30,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 14:22:31,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:22:31,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:33,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:22:34,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:22:35,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.10 vs. limit=10.0 2023-10-02 14:22:38,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:41,247 INFO [train.py:1046] (1/4) Epoch 26, batch 3750, loss[loss=0.1649, simple_loss=0.2528, pruned_loss=0.03856, over 24644.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2467, pruned_loss=0.0467, over 4700592.85 frames. ], batch size: 68, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:22:41,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:22:41,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=910353.3333333334, ans=0.2 2023-10-02 14:22:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:22:42,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=910353.3333333334, ans=0.125 2023-10-02 14:22:44,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 14:22:44,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=910353.3333333334, ans=0.125 2023-10-02 14:22:45,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 14:22:47,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:22:48,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 14:22:48,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:22:49,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:51,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:52,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:22:56,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:22:59,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 14:23:00,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:23:02,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:23:03,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:23:05,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 14:23:07,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:23:08,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:23:10,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:23:12,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 14:23:17,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 14:23:18,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=910486.6666666666, ans=0.0 2023-10-02 14:23:19,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:23:19,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 14:23:20,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:23:24,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.05 vs. limit=15.0 2023-10-02 14:23:26,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:23:27,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 14:23:27,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=910553.3333333334, ans=0.0 2023-10-02 14:23:27,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=910553.3333333334, ans=0.125 2023-10-02 14:23:30,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 14:23:31,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:23:32,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=910553.3333333334, ans=0.125 2023-10-02 14:23:36,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:23:36,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:23:39,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:23:42,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:23:44,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:23:46,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:23:47,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:23:50,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:23:54,552 INFO [train.py:1046] (1/4) Epoch 26, batch 3800, loss[loss=0.1618, simple_loss=0.2529, pruned_loss=0.0353, over 24396.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2465, pruned_loss=0.04691, over 4697429.92 frames. ], batch size: 69, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:23:58,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:24:01,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:01,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:24:02,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. 
Duration: 0.99 2023-10-02 14:24:04,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:24:06,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:07,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:24:08,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 14:24:08,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:10,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:24:11,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:24:11,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:24:13,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:13,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 14:24:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 14:24:18,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:24:20,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:21,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:24:23,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:24:24,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:24:24,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:27,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:28,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:32,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:24:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 14:24:34,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:24:42,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:24:47,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 14:24:50,719 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.895e+02 2.051e+02 2.424e+02 3.630e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 14:24:50,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 14:24:50,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 14:24:50,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:53,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:24:53,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 14:24:59,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 14:24:59,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 14:24:59,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:00,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:25:03,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 14:25:05,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:25:06,911 INFO [train.py:1046] (1/4) Epoch 26, batch 3850, loss[loss=0.1731, simple_loss=0.2702, pruned_loss=0.03794, over 24302.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2459, pruned_loss=0.04646, over 4700917.72 frames. ], batch size: 74, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:25:12,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:25:13,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 14:25:14,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:25:14,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:18,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:25:19,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=911020.0, ans=0.0 2023-10-02 14:25:21,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:25:23,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:25:24,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-02 14:25:28,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:29,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=911086.6666666666, ans=0.0 2023-10-02 14:25:29,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=911086.6666666666, ans=0.0 2023-10-02 14:25:31,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:34,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:25:35,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:25:37,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:38,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:25:39,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:25:39,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:25:40,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:25:43,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:25:43,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:44,124 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:25:45,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:25:45,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 14:25:45,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 14:25:47,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:25:47,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:49,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.07 vs. limit=12.0 2023-10-02 14:25:49,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:25:49,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:50,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-10-02 14:25:52,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 14:25:54,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:25:56,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 14:25:58,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:26:02,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:02,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:26:06,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:06,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 14:26:10,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 14:26:10,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:10,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:26:13,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=911286.6666666666, ans=0.125 2023-10-02 14:26:15,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:26:15,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:26:15,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:16,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:16,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:26:16,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 14:26:19,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:26:20,299 INFO [train.py:1046] (1/4) Epoch 26, batch 3900, loss[loss=0.1721, simple_loss=0.2556, pruned_loss=0.04426, over 24005.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.245, pruned_loss=0.04587, over 4705157.73 frames. ], batch size: 80, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:26:20,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 14:26:21,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:26:21,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:23,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:26:23,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:24,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:26:24,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:24,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:25,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:26:25,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 14:26:27,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:30,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:26:30,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:26:31,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 14:26:32,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:26:34,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:26:34,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:37,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:26:38,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 14:26:38,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:26:40,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 14:26:40,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:41,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=911420.0, ans=0.1 2023-10-02 14:26:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 14:26:42,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 14:26:43,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.84 vs. 
limit=15.0 2023-10-02 14:26:45,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:26:48,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:26:48,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:26:48,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:26:52,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:26:55,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:26:58,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:26:58,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:26:59,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:27:05,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:27:05,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 14:27:05,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=911553.3333333334, ans=0.125 2023-10-02 14:27:08,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=911553.3333333334, ans=0.0 2023-10-02 14:27:13,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:27:14,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:27:17,263 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.831e+02 1.989e+02 2.156e+02 3.105e+02, threshold=3.979e+02, percent-clipped=0.0 2023-10-02 14:27:20,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:27:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:27:23,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 14:27:23,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 14:27:25,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:27:26,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 14:27:26,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:27:26,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 14:27:32,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:27:33,702 INFO [train.py:1046] (1/4) Epoch 26, batch 3950, loss[loss=0.1653, simple_loss=0.2344, pruned_loss=0.04816, over 23351.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.245, pruned_loss=0.04612, over 4698819.94 frames. ], batch size: 285, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:27:33,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 14:27:35,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:27:38,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:27:39,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=911686.6666666666, ans=0.0 2023-10-02 14:27:40,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:27:45,849 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 14:27:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:27:47,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. 
Duration: 0.72 2023-10-02 14:27:48,522 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 14:27:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:27:51,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:27:51,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:27:51,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:27:54,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.67 vs. limit=6.0 2023-10-02 14:27:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 14:27:56,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=911753.3333333334, ans=0.125 2023-10-02 14:27:57,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:27:57,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:27:57,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:27:57,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 14:27:58,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:28:05,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=911820.0, ans=0.125 2023-10-02 14:28:09,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:28:10,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:28:14,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 14:28:19,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 14:28:19,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 14:28:20,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:28:22,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:28:29,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:28:29,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:28:29,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:28:30,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:28:30,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 14:28:35,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:28:37,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:28:37,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=911953.3333333334, ans=0.125 2023-10-02 14:28:41,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 14:28:45,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=911953.3333333334, ans=0.125 2023-10-02 14:28:48,307 INFO [train.py:1046] (1/4) Epoch 26, batch 4000, loss[loss=0.1643, simple_loss=0.2474, pruned_loss=0.04061, over 24687.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2453, pruned_loss=0.04653, over 4706396.90 frames. ], batch size: 65, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:28:48,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:28:54,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:28:57,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=912020.0, ans=0.0 2023-10-02 14:28:58,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:00,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:29:00,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:29:00,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 14:29:00,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:29:01,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 14:29:01,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:29:01,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 14:29:03,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:06,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:29:06,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:29:07,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:29:08,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:29:08,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:29:09,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:29:10,044 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 14:29:11,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:29:12,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:15,571 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 14:29:16,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:29:16,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:29:24,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=912153.3333333334, ans=0.2 2023-10-02 14:29:25,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-10-02 14:29:26,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:29:28,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:29:28,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 14:29:29,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:29:29,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 14:29:29,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:29:31,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:31,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=912220.0, ans=0.0 2023-10-02 14:29:32,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:29:33,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:29:33,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:29:33,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:29:35,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 14:29:37,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:37,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=912220.0, ans=0.125 2023-10-02 14:29:38,660 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 14:29:45,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:29:46,604 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.860e+02 2.015e+02 2.242e+02 2.909e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-02 14:29:46,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=912286.6666666666, ans=0.125 2023-10-02 14:29:48,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 14:29:49,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:29:49,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:49,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:29:51,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:29:55,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:58,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:29:58,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 14:29:59,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:29:59,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:02,648 INFO [train.py:1046] (1/4) Epoch 26, batch 4050, loss[loss=0.2182, simple_loss=0.2784, pruned_loss=0.079, over 19571.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2468, pruned_loss=0.04671, over 4712680.42 frames. ], batch size: 389, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:30:02,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:30:04,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:30:04,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:30:04,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=912353.3333333334, ans=0.125 2023-10-02 14:30:08,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 14:30:11,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:30:11,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:30:12,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:30:12,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:30:17,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:30:19,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:30:20,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=912420.0, ans=0.0 2023-10-02 14:30:21,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 14:30:22,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 14:30:22,803 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 14:30:26,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:30:31,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-02 14:30:33,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:30:36,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:39,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:30:39,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:30:39,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:30:47,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 14:30:47,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:30:48,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:30:50,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 14:30:55,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:30:55,822 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:31:01,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 14:31:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:31:01,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:31:04,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 14:31:04,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 14:31:04,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:06,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:31:09,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:09,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:31:15,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=912620.0, ans=0.125 2023-10-02 14:31:16,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. 
Duration: 0.5 2023-10-02 14:31:18,136 INFO [train.py:1046] (1/4) Epoch 26, batch 4100, loss[loss=0.1658, simple_loss=0.2421, pruned_loss=0.04477, over 24448.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2477, pruned_loss=0.04691, over 4718659.42 frames. ], batch size: 58, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:31:18,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 14:31:19,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 14:31:20,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 14:31:21,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:21,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:22,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:31:22,404 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 14:31:23,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:31:25,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 14:31:25,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:26,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:31:26,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=912686.6666666666, ans=0.0 2023-10-02 14:31:32,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:31:32,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:31:34,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:31:34,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 14:31:34,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:34,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:31:34,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:31:36,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:31:37,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-10-02 14:31:41,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:31:41,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=912753.3333333334, ans=0.05 2023-10-02 14:31:42,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 14:31:44,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:31:44,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.24 vs. limit=15.0 2023-10-02 14:31:47,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:31:47,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 14:31:47,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:31:49,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:31:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:31:51,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 14:31:52,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 14:31:54,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:31:56,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 14:31:56,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:56,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:31:58,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.57 vs. limit=22.5 2023-10-02 14:31:59,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:32:07,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:08,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:32:08,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:32:16,361 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.852e+02 2.033e+02 2.251e+02 3.212e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 14:32:17,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:32:17,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:32:21,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:32:23,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:32:24,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=912953.3333333334, ans=0.125 2023-10-02 14:32:24,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=912953.3333333334, ans=0.1 2023-10-02 14:32:26,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:32:28,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:32:29,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:32:29,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:32:32,310 INFO [train.py:1046] (1/4) Epoch 26, batch 4150, loss[loss=0.1584, simple_loss=0.2299, pruned_loss=0.04343, over 23763.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2472, pruned_loss=0.04649, over 4712979.45 frames. ], batch size: 149, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:32:32,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. 
Duration: 0.86 2023-10-02 14:32:32,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:33,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 14:32:33,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 14:32:34,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=913020.0, ans=0.0 2023-10-02 14:32:35,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 14:32:36,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:40,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:32:40,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:43,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:32:44,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:32:46,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 14:32:46,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=913086.6666666666, ans=0.09899494936611666 2023-10-02 14:32:47,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:32:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:32:49,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:32:54,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:59,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:33:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 14:33:03,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 14:33:03,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:33:03,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 14:33:03,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:33:04,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:33:07,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:07,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:33:14,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 14:33:16,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:33:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:33:18,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=15.0 2023-10-02 14:33:20,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 14:33:20,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:33:21,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 14:33:24,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:33:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:33:26,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:33:26,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 14:33:26,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:33:26,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:33:28,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:33:31,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 14:33:31,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:31,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:33:32,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:33:32,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 14:33:32,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:33:34,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:33:34,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:33:37,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:37,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 14:33:37,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:33:42,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:33:44,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 14:33:44,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=913286.6666666666, ans=0.0 2023-10-02 14:33:45,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:33:46,858 INFO [train.py:1046] (1/4) Epoch 26, batch 4200, loss[loss=0.1535, simple_loss=0.2369, pruned_loss=0.03504, over 24461.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2465, pruned_loss=0.04643, over 4718133.94 frames. ], batch size: 63, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:33:47,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:33:47,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=913353.3333333334, ans=0.05 2023-10-02 14:33:48,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 14:33:50,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:33:50,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:33:50,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=913353.3333333334, ans=0.125 2023-10-02 14:33:50,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=913353.3333333334, ans=0.1 2023-10-02 14:33:51,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 14:33:56,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 14:33:57,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:33:58,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:34:00,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:34:03,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:34:04,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 14:34:05,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:05,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 14:34:05,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:34:07,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:34:07,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:34:08,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:34:12,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 14:34:13,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:17,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:34:19,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:34:20,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 14:34:22,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:34:25,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:34:25,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 14:34:25,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:34:27,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:34:27,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.55 vs. limit=22.5 2023-10-02 14:34:32,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:34:34,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=913553.3333333334, ans=0.035 2023-10-02 14:34:35,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:34:35,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=913553.3333333334, ans=0.0 2023-10-02 14:34:39,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:34:42,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-02 14:34:43,626 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.384e+02 1.882e+02 2.071e+02 2.346e+02 3.876e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 14:34:43,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:34:45,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=913620.0, ans=0.0 2023-10-02 14:34:50,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:34:51,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:34:53,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 14:34:59,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:35:01,177 INFO [train.py:1046] (1/4) Epoch 26, batch 4250, loss[loss=0.16, simple_loss=0.2308, pruned_loss=0.04459, over 23602.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2456, pruned_loss=0.04567, over 4719417.73 frames. ], batch size: 256, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:35:02,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:35:02,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:35:03,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.46 vs. 
limit=15.0 2023-10-02 14:35:04,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.50 vs. limit=15.0 2023-10-02 14:35:05,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:09,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:35:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 14:35:11,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:35:12,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:35:20,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:20,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:21,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:35:21,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:35:23,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:25,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:25,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:28,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:35:29,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:35:31,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 14:35:35,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 14:35:36,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:36,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:35:36,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:36,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:35:36,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:35:38,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:42,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:35:42,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:35:45,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:35:47,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:35:48,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 14:35:48,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:35:50,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 14:35:51,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:35:53,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:35:57,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:57,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:35:58,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 14:35:58,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=913886.6666666666, ans=0.125 2023-10-02 14:35:59,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:36:00,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=913953.3333333334, ans=0.125 2023-10-02 14:36:00,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=913953.3333333334, ans=0.0 2023-10-02 14:36:01,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:36:04,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:36:06,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:36:08,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:36:09,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:36:11,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:36:13,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:36:13,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:36:13,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 14:36:15,021 INFO [train.py:1046] (1/4) Epoch 26, batch 4300, loss[loss=0.1591, simple_loss=0.2413, pruned_loss=0.03847, over 23326.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2454, pruned_loss=0.04528, over 4716535.15 frames. ], batch size: 93, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:36:15,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:36:21,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:36:21,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:36:23,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=914020.0, ans=0.125 2023-10-02 14:36:24,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:36:29,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=914086.6666666666, ans=0.125 2023-10-02 14:36:32,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:36:32,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 14:36:33,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:36:35,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:36:36,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:36:36,371 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 14:36:40,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:36:41,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:36:44,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 14:36:44,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:36:44,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 14:36:47,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:36:48,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 14:36:49,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=914153.3333333334, ans=0.1 2023-10-02 14:36:49,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=914153.3333333334, ans=0.125 2023-10-02 14:36:51,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=914153.3333333334, ans=0.0 2023-10-02 14:36:52,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:36:52,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=914153.3333333334, ans=0.0 2023-10-02 14:36:53,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:36:53,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:36:55,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:36:57,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:36:57,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 14:36:58,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 14:37:00,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:37:02,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:02,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:37:02,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:02,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:37:02,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 14:37:02,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 14:37:02,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=914220.0, ans=0.04949747468305833 2023-10-02 14:37:03,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 14:37:04,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.72 vs. limit=15.0 2023-10-02 14:37:05,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:37:06,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 14:37:06,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. 
Duration: 0.94 2023-10-02 14:37:10,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:37:11,857 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 14:37:13,156 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.843e+02 2.069e+02 2.319e+02 3.612e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-02 14:37:13,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:37:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:14,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:37:18,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 14:37:18,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:37:18,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:18,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:37:18,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 14:37:18,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=914286.6666666666, ans=0.125 2023-10-02 14:37:19,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:37:21,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:37:24,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:24,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:26,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:37:29,590 INFO [train.py:1046] (1/4) Epoch 26, batch 4350, loss[loss=0.1741, simple_loss=0.2415, pruned_loss=0.05334, over 23841.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.246, pruned_loss=0.04528, over 4716270.94 frames. ], batch size: 212, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:37:29,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=914353.3333333334, ans=0.125 2023-10-02 14:37:32,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 14:37:32,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:37:38,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:37:41,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:43,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:37:43,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:37:46,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=914420.0, ans=0.1 2023-10-02 14:37:49,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:37:52,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:53,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:37:53,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:37:56,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:38:00,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:38:02,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 14:38:05,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=914486.6666666666, ans=0.0 2023-10-02 14:38:06,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 14:38:07,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:38:07,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:12,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:13,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 14:38:16,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:16,997 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.43 vs. limit=12.0 2023-10-02 14:38:17,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:38:21,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 14:38:23,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:38:23,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 14:38:25,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 14:38:25,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=914553.3333333334, ans=0.125 2023-10-02 14:38:26,396 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 14:38:26,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:38:26,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:38:28,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:38:28,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:38:29,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:38:29,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:38:32,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 14:38:32,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:32,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 14:38:32,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:33,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=914620.0, ans=0.0 2023-10-02 14:38:34,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 14:38:36,070 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 14:38:36,074 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 14:38:36,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 14:38:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:38:40,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:38:40,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:38:40,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:38:42,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 14:38:43,282 INFO [train.py:1046] (1/4) Epoch 26, batch 4400, loss[loss=0.1884, simple_loss=0.2624, pruned_loss=0.05718, over 23705.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2472, pruned_loss=0.04545, over 4718028.02 frames. 
], batch size: 179, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:38:44,743 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 14:38:44,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:48,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:38:48,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:50,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:51,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 14:38:51,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 14:38:52,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 14:38:53,008 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 14:38:54,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:38:54,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:38:54,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=914686.6666666666, ans=0.125 2023-10-02 14:38:56,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=914753.3333333334, ans=0.1 2023-10-02 14:38:57,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 14:38:57,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:00,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 14:39:02,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:02,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 14:39:02,449 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 14:39:05,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 14:39:07,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 14:39:07,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. 
Duration: 0.96 2023-10-02 14:39:08,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:09,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:39:11,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:39:11,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:39:12,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=914820.0, ans=0.2 2023-10-02 14:39:14,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 14:39:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 14:39:14,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:16,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:39:16,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:18,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:18,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:39:18,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 14:39:19,574 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 14:39:22,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:25,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=12.0 2023-10-02 14:39:28,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:39:28,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=914886.6666666666, ans=0.0 2023-10-02 14:39:29,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 14:39:31,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=914886.6666666666, ans=0.125 2023-10-02 14:39:35,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:39:37,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:39:38,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:39:38,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. 
Duration: 0.84 2023-10-02 14:39:38,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:39:38,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:39:38,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:39:40,332 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.911e+02 2.191e+02 2.459e+02 3.786e+02, threshold=4.382e+02, percent-clipped=0.0 2023-10-02 14:39:40,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:39:44,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 14:39:47,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 14:39:48,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 14:39:48,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:48,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 14:39:49,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:39:51,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 14:39:54,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 14:39:55,586 INFO [train.py:1046] (1/4) Epoch 26, batch 4450, loss[loss=0.1727, simple_loss=0.2456, pruned_loss=0.04988, over 23671.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2469, pruned_loss=0.04533, over 4723461.75 frames. ], batch size: 149, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:39:58,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:40:02,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:04,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:40:07,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.17 vs. limit=6.0 2023-10-02 14:40:09,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:09,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:40:12,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:40:16,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 14:40:16,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=915086.6666666666, ans=0.1 2023-10-02 14:40:17,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:40:17,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 14:40:17,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:40:19,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:19,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:40:19,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:40:20,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:40:23,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=915153.3333333334, ans=0.125 2023-10-02 14:40:26,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:26,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 14:40:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:40:29,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:40:29,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:40:32,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 14:40:35,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 14:40:35,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 14:40:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:40:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:38,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 14:40:41,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:40:45,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. 
Duration: 0.8 2023-10-02 14:40:46,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:46,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:40:46,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:40:46,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:49,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:52,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:40:53,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 14:40:55,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:40:56,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:40:58,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:40:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:59,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 14:41:01,645 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-10-02 14:41:02,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:41:03,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 14:41:05,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:41:08,735 INFO [train.py:1046] (1/4) Epoch 26, batch 4500, loss[loss=0.1496, simple_loss=0.2196, pruned_loss=0.03978, over 21596.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2467, pruned_loss=0.04571, over 4704859.46 frames. ], batch size: 47, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:41:11,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:41:11,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 14:41:11,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 14:41:11,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=915353.3333333334, ans=0.125 2023-10-02 14:41:13,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:41:18,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:41:18,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:41:19,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:41:19,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:41:20,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:41:21,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:41:26,518 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.72 vs. limit=15.0 2023-10-02 14:41:34,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:41:36,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:41:38,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:41:39,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 14:41:39,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=915486.6666666666, ans=0.125 2023-10-02 14:41:40,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:41:46,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:41:49,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:41:53,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:41:56,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:41:56,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 14:41:57,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:41:57,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:00,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:00,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:42:03,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:42:03,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 14:42:03,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:42:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:07,070 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.824e+02 1.951e+02 2.153e+02 2.921e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-02 14:42:10,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:42:10,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:42:11,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:13,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:42:13,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:42:14,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 14:42:16,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 14:42:16,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. 
Duration: 0.8 2023-10-02 14:42:20,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 14:42:21,636 INFO [train.py:1046] (1/4) Epoch 26, batch 4550, loss[loss=0.1508, simple_loss=0.2267, pruned_loss=0.03742, over 24445.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2453, pruned_loss=0.04568, over 4698232.28 frames. ], batch size: 58, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:42:21,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=915686.6666666666, ans=0.0 2023-10-02 14:42:24,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 14:42:26,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:42:29,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:42:29,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:42:33,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:42:34,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=915686.6666666666, ans=0.125 2023-10-02 14:42:36,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:42:36,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:42:39,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:42:39,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:42:39,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:42,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:42:42,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:42:42,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=915753.3333333334, ans=0.0 2023-10-02 14:42:45,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:42:48,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 14:42:48,772 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.64 vs. limit=15.0 2023-10-02 14:42:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 14:42:50,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:42:52,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-10-02 14:42:54,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 14:42:55,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:42:58,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 14:42:59,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:43:04,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:04,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:04,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:43:05,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 14:43:09,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:43:12,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:12,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:43:13,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 14:43:13,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 14:43:13,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 14:43:15,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:43:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 14:43:17,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 14:43:19,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:43:19,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:19,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:43:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:20,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:43:21,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=915953.3333333334, ans=0.0 2023-10-02 14:43:22,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 14:43:22,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 14:43:23,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:43:23,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 14:43:25,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 14:43:25,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:43:25,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 14:43:28,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:43:28,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:43:31,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:43:33,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:33,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:43:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:43:36,500 INFO [train.py:1046] (1/4) Epoch 26, batch 4600, loss[loss=0.1716, simple_loss=0.2637, pruned_loss=0.03978, over 24640.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.244, pruned_loss=0.04534, over 4704264.51 frames. ], batch size: 73, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:43:36,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:43:39,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:39,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:43:42,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:43:42,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:43:44,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:43:45,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 14:43:46,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:43:48,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:43:48,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:43:49,278 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=15.0 2023-10-02 14:43:52,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:55,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=916086.6666666666, ans=0.125 2023-10-02 14:43:58,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 14:43:59,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:01,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:03,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:44:03,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:44:06,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=916153.3333333334, ans=0.0 2023-10-02 14:44:08,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=916153.3333333334, ans=0.0 2023-10-02 14:44:10,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. 
Duration: 0.64 2023-10-02 14:44:10,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:44:11,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:44:15,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:15,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:44:15,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=916153.3333333334, ans=0.1 2023-10-02 14:44:16,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=916153.3333333334, ans=0.07 2023-10-02 14:44:18,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:44:20,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 14:44:22,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:44:26,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:28,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:44:30,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:30,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 14:44:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:32,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 14:44:32,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:32,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:34,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:34,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:44:34,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:36,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 14:44:36,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. 
Duration: 0.81 2023-10-02 14:44:37,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.833e+02 1.993e+02 2.244e+02 3.849e+02, threshold=3.987e+02, percent-clipped=0.0 2023-10-02 14:44:37,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 14:44:37,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:39,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:44:39,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:40,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:49,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:44:50,685 INFO [train.py:1046] (1/4) Epoch 26, batch 4650, loss[loss=0.174, simple_loss=0.2638, pruned_loss=0.04204, over 24655.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2433, pruned_loss=0.0451, over 4696312.01 frames. ], batch size: 68, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:44:52,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:44:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:54,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:44:54,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:54,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:44:54,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-10-02 14:44:55,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:58,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 14:45:01,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:45:03,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 14:45:03,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:45:04,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 14:45:04,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:45:04,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 14:45:04,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. 
Duration: 0.98 2023-10-02 14:45:04,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:05,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:45:07,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.35 vs. limit=22.5 2023-10-02 14:45:08,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=916420.0, ans=0.0 2023-10-02 14:45:10,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:45:11,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:11,892 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 14:45:13,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=916420.0, ans=0.0 2023-10-02 14:45:14,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 14:45:17,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:17,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 14:45:17,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 14:45:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:45:22,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:45:25,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:45:27,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=916486.6666666666, ans=0.125 2023-10-02 14:45:31,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:33,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:35,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:35,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:45:39,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 14:45:39,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 14:45:39,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-02 14:45:39,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 14:45:41,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:45:43,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=916553.3333333334, ans=0.125 2023-10-02 14:45:49,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:45:49,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:45:49,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 14:45:49,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:45:51,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:45:51,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:45:52,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:45:56,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:45:56,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:45:56,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:58,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:45:58,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:45:59,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:46:02,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 14:46:02,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:46:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 14:46:04,810 INFO [train.py:1046] (1/4) Epoch 26, batch 4700, loss[loss=0.2142, simple_loss=0.2773, pruned_loss=0.07552, over 19467.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2442, pruned_loss=0.04524, over 4698447.87 frames. ], batch size: 388, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:46:10,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:12,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:46:13,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:46:15,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:46:16,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:46:21,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 14:46:21,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=916753.3333333334, ans=0.125 2023-10-02 14:46:22,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 14:46:25,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:27,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:46:27,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:46:29,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:34,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:46:36,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 14:46:38,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:46:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 14:46:45,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:46:46,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:46:46,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=916820.0, ans=0.125 2023-10-02 14:46:49,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=916886.6666666666, ans=0.125 2023-10-02 14:46:52,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 14:46:53,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:46:59,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:46:59,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 14:47:02,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:02,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:47:05,236 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.831e+02 2.029e+02 2.263e+02 3.134e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 14:47:05,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:47:05,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:47:05,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 14:47:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 14:47:08,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:08,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:08,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:08,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 14:47:10,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:14,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 14:47:17,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:47:18,759 INFO [train.py:1046] (1/4) Epoch 26, batch 4750, loss[loss=0.1736, simple_loss=0.2508, pruned_loss=0.04817, over 23352.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2454, pruned_loss=0.04563, over 4704958.05 frames. ], batch size: 93, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:47:18,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:24,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:24,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:47:26,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 14:47:26,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:47:28,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 14:47:30,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:47:30,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:32,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:47:35,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.54 vs. 
limit=15.0 2023-10-02 14:47:36,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 14:47:42,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:47:44,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=22.5 2023-10-02 14:47:45,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 14:47:45,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:47:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:47:49,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:47:49,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 14:47:49,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 14:47:55,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 14:47:57,965 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.96 vs. 
limit=15.0 2023-10-02 14:47:58,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:00,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:03,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:48:03,536 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 14:48:03,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:48:05,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:48:07,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:48:09,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 14:48:10,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 14:48:10,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:48:10,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:48:10,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 14:48:12,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:48:12,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 14:48:15,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 14:48:18,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:48:19,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:48:19,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 14:48:21,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:48:22,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:48:23,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:48:24,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:24,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:48:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 14:48:28,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 14:48:28,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 14:48:30,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 14:48:33,397 INFO [train.py:1046] (1/4) Epoch 26, batch 4800, loss[loss=0.1828, simple_loss=0.2694, pruned_loss=0.0481, over 24335.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2471, pruned_loss=0.04621, over 4704759.83 frames. ], batch size: 77, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:48:34,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:48:34,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:48:36,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 14:48:37,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=917353.3333333334, ans=0.125 2023-10-02 14:48:43,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:43,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:48:49,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 14:48:51,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:51,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:52,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 14:48:53,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:48:54,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:48:55,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:48:58,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:01,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:01,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:49:02,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:02,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 14:49:02,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:49:03,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:04,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:07,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:07,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:07,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:49:10,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:49:10,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:13,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 14:49:13,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 14:49:14,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:14,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:49:15,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 14:49:15,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:49:16,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:49:18,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:49:18,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=917553.3333333334, ans=0.125 2023-10-02 14:49:19,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:49:22,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:49:23,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:26,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:49:28,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=917553.3333333334, ans=0.0 2023-10-02 14:49:31,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 14:49:31,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:33,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:49:33,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:49:34,319 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.969e+02 2.180e+02 2.462e+02 3.461e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-02 14:49:34,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:37,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:49:37,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:49:38,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:39,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:49:40,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:49:40,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:49:40,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=917620.0, ans=0.0 2023-10-02 14:49:44,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:49:44,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:49:44,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:46,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 14:49:46,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=917686.6666666666, ans=0.125 2023-10-02 14:49:48,376 INFO [train.py:1046] (1/4) Epoch 26, batch 4850, loss[loss=0.178, simple_loss=0.2623, pruned_loss=0.04685, over 24392.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2472, pruned_loss=0.04571, over 4718784.78 frames. ], batch size: 77, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:49:48,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 14:49:48,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:48,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:48,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:49:48,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:51,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:49:53,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=917686.6666666666, ans=0.0 2023-10-02 14:49:58,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 14:49:58,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:50:01,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:50:03,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:50:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:50:06,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:50:07,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:50:09,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:50:09,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 14:50:14,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:50:15,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 14:50:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:50:18,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:50:18,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 14:50:19,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:50:19,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:23,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.31 vs. limit=15.0 2023-10-02 14:50:23,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:23,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 14:50:25,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 14:50:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:50:33,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:50:34,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-10-02 14:50:35,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:50:35,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:50:37,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:50:38,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 14:50:38,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:38,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 14:50:38,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:50:40,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:50:41,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 14:50:41,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=917886.6666666666, ans=0.1 2023-10-02 14:50:50,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 14:50:56,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:50:59,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=917953.3333333334, ans=0.125 2023-10-02 14:50:59,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=917953.3333333334, ans=0.125 2023-10-02 14:51:00,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 14:51:00,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:51:01,643 INFO [train.py:1046] (1/4) Epoch 26, batch 4900, loss[loss=0.1634, simple_loss=0.2511, pruned_loss=0.03783, over 24028.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2456, pruned_loss=0.04491, over 4723677.72 frames. ], batch size: 80, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:51:06,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:08,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:51:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:51:13,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 14:51:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-02 14:51:20,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 14:51:21,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 14:51:21,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:51:23,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:51:23,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:51:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:51:23,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:51:24,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 14:51:26,200 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:51:26,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=918086.6666666666, ans=0.1 2023-10-02 14:51:27,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-10-02 14:51:27,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=918086.6666666666, ans=0.0 2023-10-02 14:51:27,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=918086.6666666666, ans=0.1 2023-10-02 14:51:28,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:51:28,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:51:30,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:51:31,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:51:32,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:34,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:51:34,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 14:51:36,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:51:36,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:51:36,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. 
Duration: 0.81 2023-10-02 14:51:36,946 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.35 vs. limit=15.0 2023-10-02 14:51:38,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 14:51:42,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 14:51:42,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:51:44,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:51:44,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:51:45,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:45,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 14:51:47,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:51:47,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 14:51:50,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:51:50,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 14:51:53,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:51:55,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 14:51:56,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:51:57,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 14:51:58,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 14:52:02,743 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.867e+02 2.051e+02 2.311e+02 3.725e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-02 14:52:03,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=918286.6666666666, ans=0.125 2023-10-02 14:52:04,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:52:05,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:52:06,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 14:52:06,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:52:06,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 14:52:08,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=918286.6666666666, ans=0.125 2023-10-02 14:52:09,391 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.00 vs. limit=15.0 2023-10-02 14:52:10,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:13,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=918286.6666666666, ans=0.125 2023-10-02 14:52:14,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:52:14,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:52:14,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:52:14,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 14:52:16,116 INFO [train.py:1046] (1/4) Epoch 26, batch 4950, loss[loss=0.1718, simple_loss=0.2439, pruned_loss=0.04983, over 23325.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2437, pruned_loss=0.04471, over 4715829.99 frames. ], batch size: 105, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:52:16,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 14:52:16,493 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:52:19,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:52:19,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:52:22,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 14:52:22,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 14:52:22,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:52:23,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 14:52:23,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:23,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:52:23,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:52:25,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:26,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:52:26,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:52:27,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:52:29,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:52:31,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:31,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:52:34,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:52:39,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:40,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-10-02 14:52:42,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:52:44,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:44,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:52:45,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:52:47,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 14:52:47,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 14:52:50,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:53,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:52:53,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:52:53,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:52:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:52:55,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:52:56,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=918486.6666666666, ans=0.125 2023-10-02 14:52:57,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:58,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 14:52:58,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:53:00,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:00,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:00,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=918553.3333333334, ans=0.125 2023-10-02 14:53:00,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=918553.3333333334, ans=0.125 2023-10-02 14:53:01,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 14:53:01,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:53:04,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:53:07,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:53:09,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:53:09,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:53:10,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:53:10,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:53:12,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:53:12,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=918553.3333333334, ans=0.125 2023-10-02 14:53:13,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:53:15,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:53:15,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:53:16,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 14:53:21,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:25,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 14:53:25,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:53:29,756 INFO [train.py:1046] (1/4) Epoch 26, batch 5000, loss[loss=0.1796, simple_loss=0.2672, pruned_loss=0.04597, over 24645.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2439, pruned_loss=0.04468, over 4708110.94 frames. 
], batch size: 73, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:53:31,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:31,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:53:34,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 14:53:35,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 14:53:36,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:53:38,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 14:53:38,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:53:40,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:53:40,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 14:53:40,817 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:53:41,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:41,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 14:53:42,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=918686.6666666666, ans=0.05 2023-10-02 14:53:43,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 14:53:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:45,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:53:45,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 14:53:45,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 14:53:46,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:53:46,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 14:53:47,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:53:47,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:48,820 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.19 vs. limit=15.0 2023-10-02 14:53:49,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 14:53:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 14:53:49,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 14:53:49,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=918753.3333333334, ans=0.1 2023-10-02 14:53:50,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 14:53:51,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:51,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:52,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 14:53:52,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:53:55,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:57,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 14:53:59,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. 
Duration: 0.79 2023-10-02 14:54:00,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:54:01,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:54:05,958 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 14:54:08,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:54:09,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:54:09,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:12,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 14:54:12,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=918886.6666666666, ans=0.0 2023-10-02 14:54:14,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:54:14,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:54:14,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:54:15,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. 
Duration: 0.336375 2023-10-02 14:54:17,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:54:18,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=918886.6666666666, ans=0.035 2023-10-02 14:54:21,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:54:21,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:54:27,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 14:54:30,199 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.795e+02 1.977e+02 2.128e+02 2.824e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-02 14:54:31,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:35,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.47 vs. limit=15.0 2023-10-02 14:54:36,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=918953.3333333334, ans=0.0 2023-10-02 14:54:40,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:54:40,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 14:54:40,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:54:42,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:54:42,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:54:42,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:54:43,491 INFO [train.py:1046] (1/4) Epoch 26, batch 5050, loss[loss=0.1779, simple_loss=0.2604, pruned_loss=0.04768, over 24362.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2443, pruned_loss=0.04489, over 4714093.84 frames. ], batch size: 77, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:54:43,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:48,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:48,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 14:54:48,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:54:50,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:54:50,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 14:54:52,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 14:54:52,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:54:54,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:54:56,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:54:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:54:58,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:55:07,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 14:55:08,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:55:09,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:55:09,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 14:55:10,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:55:12,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 14:55:12,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:55:13,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:55:13,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 14:55:13,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=919153.3333333334, ans=0.125 2023-10-02 14:55:14,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 14:55:14,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:18,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:55:20,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:20,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 14:55:23,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:55:26,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 14:55:26,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 14:55:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:55:27,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:55:29,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:55:30,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:55:32,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:55:32,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=919220.0, ans=0.125 2023-10-02 14:55:33,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:33,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:55:33,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:55:33,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 14:55:34,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:55:36,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 14:55:39,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:55:39,504 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 14:55:39,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:55:40,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:55:42,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:42,250 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 14:55:44,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:55:44,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 14:55:44,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:48,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:55:50,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:50,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. 
Duration: 0.84 2023-10-02 14:55:50,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 14:55:53,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:55:53,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:55:53,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:55:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 14:55:57,809 INFO [train.py:1046] (1/4) Epoch 26, batch 5100, loss[loss=0.1807, simple_loss=0.255, pruned_loss=0.05319, over 23429.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2453, pruned_loss=0.04557, over 4700933.06 frames. ], batch size: 93, lr: 3.90e-03, grad_scale: 8.0 2023-10-02 14:55:59,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:56:00,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 14:56:00,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=919353.3333333334, ans=0.0 2023-10-02 14:56:02,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 14:56:02,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 14:56:03,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:56:06,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:56:06,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 14:56:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 14:56:09,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=919353.3333333334, ans=0.125 2023-10-02 14:56:11,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:56:11,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:56:15,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:56:20,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 14:56:20,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:56:22,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:56:22,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. 
Duration: 0.588875 2023-10-02 14:56:24,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:26,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:26,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 14:56:29,145 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 14:56:30,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:30,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 14:56:30,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 14:56:33,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:56:33,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=919486.6666666666, ans=0.125 2023-10-02 14:56:36,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=919486.6666666666, ans=0.2 2023-10-02 14:56:41,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=919553.3333333334, ans=0.95 2023-10-02 14:56:42,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 14:56:42,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=919553.3333333334, ans=0.2 2023-10-02 14:56:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 14:56:43,694 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 14:56:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 14:56:46,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 14:56:46,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 14:56:54,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 14:56:55,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:56:56,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:56:59,930 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.826e+02 2.092e+02 2.413e+02 3.639e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 14:57:00,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. 
Duration: 0.87 2023-10-02 14:57:02,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 14:57:04,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 14:57:08,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:57:08,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:57:08,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:57:09,099 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.31 vs. limit=15.0 2023-10-02 14:57:09,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:57:09,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:57:10,907 INFO [train.py:1046] (1/4) Epoch 26, batch 5150, loss[loss=0.1711, simple_loss=0.2591, pruned_loss=0.04152, over 23940.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2459, pruned_loss=0.04554, over 4717349.37 frames. ], batch size: 86, lr: 3.90e-03, grad_scale: 8.0 2023-10-02 14:57:10,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:57:11,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. 
Duration: 0.89 2023-10-02 14:57:11,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 14:57:11,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 14:57:12,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:57:12,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 14:57:14,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:14,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 14:57:15,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:57:17,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:57:22,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:57:23,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 14:57:24,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:24,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 14:57:26,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:57:26,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:57:26,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:57:28,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:57:28,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:57:28,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 14:57:31,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:57:32,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:57:32,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:57:34,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 14:57:36,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 14:57:40,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.88 vs. limit=22.5 2023-10-02 14:57:41,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:57:42,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 14:57:45,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:57:52,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:57:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:58,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:57:59,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:58:02,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 14:58:05,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:58:05,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:58:07,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 14:58:09,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:10,570 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.72 vs. limit=12.0 2023-10-02 14:58:11,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:58:12,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 14:58:17,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:58:18,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:58:21,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:58:21,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:58:21,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:58:23,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:58:23,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:58:23,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 14:58:25,773 INFO [train.py:1046] (1/4) Epoch 26, batch 5200, loss[loss=0.1681, simple_loss=0.2566, pruned_loss=0.03978, over 24539.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2465, pruned_loss=0.04527, over 4721839.11 frames. ], batch size: 71, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 14:58:25,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:58:27,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:58:27,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=920020.0, ans=0.125 2023-10-02 14:58:30,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:58:35,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 14:58:36,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:58:37,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:40,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:58:40,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:58:41,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:58:41,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 14:58:44,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:58:44,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:48,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 14:58:50,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:58:52,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:58:52,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 14:58:54,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 14:58:55,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 14:58:55,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:55,585 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 14:58:55,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 14:58:56,392 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=12.0 2023-10-02 14:58:58,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:58:58,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:58:58,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 14:58:59,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:59:01,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:59:04,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 14:59:05,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 14:59:05,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 14:59:06,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.59 vs. limit=15.0 2023-10-02 14:59:08,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=920220.0, ans=0.1 2023-10-02 14:59:09,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. 
Duration: 0.37 2023-10-02 14:59:09,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:59:15,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:59:15,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:16,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=920220.0, ans=0.125 2023-10-02 14:59:17,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 14:59:18,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:59:18,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:59:18,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:18,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:59:23,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:59:25,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 14:59:27,489 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.858e+02 2.023e+02 2.241e+02 5.045e+02, threshold=4.045e+02, percent-clipped=1.0 2023-10-02 14:59:28,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:59:30,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:59:30,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:33,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:33,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 14:59:35,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:59:35,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:59:37,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:37,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:59:38,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 14:59:39,854 INFO [train.py:1046] (1/4) Epoch 26, batch 5250, loss[loss=0.1647, simple_loss=0.2217, pruned_loss=0.05378, over 19122.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2451, pruned_loss=0.04494, over 4712974.60 frames. ], batch size: 388, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 14:59:41,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:59:44,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:59:44,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:59:44,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:59:44,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.92 vs. limit=15.0 2023-10-02 14:59:51,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:54,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:59:56,636 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.93 vs. limit=6.0 2023-10-02 14:59:57,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 14:59:58,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:00:00,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 15:00:00,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:00:02,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:00:03,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=920420.0, ans=0.0 2023-10-02 15:00:06,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=920420.0, ans=0.05 2023-10-02 15:00:22,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=920553.3333333334, ans=0.125 2023-10-02 15:00:26,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=920553.3333333334, ans=0.5 2023-10-02 15:00:31,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=920553.3333333334, ans=0.1 2023-10-02 15:00:48,981 INFO [train.py:1046] (1/4) Epoch 26, batch 5300, loss[loss=0.1806, simple_loss=0.2471, pruned_loss=0.057, over 23777.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2442, pruned_loss=0.04451, over 4731379.41 frames. ], batch size: 164, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 15:00:49,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.91 vs. 
limit=15.0 2023-10-02 15:00:54,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=920686.6666666666, ans=0.125 2023-10-02 15:01:03,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:01:03,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 15:01:03,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 15:01:03,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:03,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:03,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:03,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:03,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:03,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:03,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:03,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 15:01:04,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:01:04,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 15:01:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 15:01:04,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 15:01:04,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:01:04,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 15:01:04,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 15:01:04,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:05,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:05,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:01:05,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:01:05,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:01:05,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:01:05,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:05,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:05,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:01:06,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:06,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:01:06,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:06,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:01:06,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 15:01:06,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:01:06,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:06,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. 
Duration: 0.86 2023-10-02 15:01:06,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 15:01:07,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:01:07,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:07,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 15:01:07,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 15:01:07,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:01:08,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:01:08,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:01:08,370 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 15:01:08,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 15:01:08,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:01:08,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:01:08,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 15:01:08,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 15:01:08,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 15:01:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:01:15,203 INFO [train.py:1046] (1/4) Epoch 27, batch 0, loss[loss=0.1547, simple_loss=0.2295, pruned_loss=0.03995, over 21971.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2295, pruned_loss=0.03995, over 21971.00 frames. ], batch size: 48, lr: 3.83e-03, grad_scale: 32.0 2023-10-02 15:01:15,203 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 15:01:27,545 INFO [train.py:1078] (1/4) Epoch 27, validation: loss=0.313, simple_loss=0.2744, pruned_loss=0.1758, over 1125622.00 frames. 2023-10-02 15:01:27,545 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 15:01:30,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 15:01:31,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:01:33,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:01:37,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:01:37,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:01:37,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:39,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 15:01:40,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 15:01:41,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:43,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:44,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.59 vs. limit=10.0 2023-10-02 15:01:46,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:46,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:46,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:01:46,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:01:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. 
Duration: 0.79 2023-10-02 15:01:50,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:01:56,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:01:56,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:57,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 15:02:04,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:02:04,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:02:06,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:07,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=920900.0, ans=0.125 2023-10-02 15:02:07,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.57 vs. limit=15.0 2023-10-02 15:02:07,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.75 vs. limit=15.0 2023-10-02 15:02:09,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:02:11,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 2.074e+02 2.559e+02 3.176e+02 5.504e+02, threshold=5.117e+02, percent-clipped=16.0 2023-10-02 15:02:12,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:16,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 15:02:20,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 15:02:20,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:02:20,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:21,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=920966.6666666666, ans=0.0 2023-10-02 15:02:22,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:02:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:02:25,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 15:02:28,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:02:32,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:02:35,422 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 15:02:36,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:02:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:02:39,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:02:40,893 INFO [train.py:1046] (1/4) Epoch 27, batch 50, loss[loss=0.1726, simple_loss=0.2589, pruned_loss=0.04317, over 24651.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2479, pruned_loss=0.04805, over 1062267.13 frames. ], batch size: 73, lr: 3.83e-03, grad_scale: 32.0 2023-10-02 15:02:40,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 15:02:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:02:42,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:02:42,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=921100.0, ans=0.125 2023-10-02 15:02:43,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 15:02:44,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:02:46,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:02:47,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=921100.0, ans=0.125 2023-10-02 15:02:50,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=921100.0, ans=0.125 2023-10-02 15:02:51,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 15:02:51,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:57,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:03:01,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 15:03:02,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 15:03:03,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:03:03,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=921166.6666666666, ans=0.125 2023-10-02 15:03:05,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 15:03:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:03:05,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:03:07,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:03:07,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 15:03:07,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:03:15,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:03:17,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:03:17,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:03:18,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 15:03:21,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:03:21,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:03:21,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. 
Duration: 0.98 2023-10-02 15:03:22,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:03:25,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 15:03:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:03:33,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:03:35,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:03:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:03:36,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:03:39,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 15:03:40,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 15:03:43,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:03:43,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:03:44,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:03:44,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:03:44,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 15:03:44,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 15:03:46,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 15:03:47,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:03:48,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:03:48,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 15:03:48,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 15:03:50,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:03:51,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:03:53,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:03:53,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:03:54,369 INFO [train.py:1046] (1/4) Epoch 27, batch 100, loss[loss=0.1675, simple_loss=0.2385, pruned_loss=0.04827, over 23804.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2477, pruned_loss=0.04617, over 1880615.77 frames. ], batch size: 212, lr: 3.83e-03, grad_scale: 16.0 2023-10-02 15:03:55,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:03:58,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:04:01,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=921433.3333333334, ans=0.125 2023-10-02 15:04:02,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:04:03,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 15:04:03,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:04:06,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:04:08,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:04:08,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:04:08,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 15:04:08,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:04:09,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 15:04:14,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:04:14,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:14,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:04:18,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 15:04:19,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:20,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=921500.0, ans=0.125 2023-10-02 15:04:21,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:21,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:04:22,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 15:04:27,029 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 15:04:27,052 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 15:04:28,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:04:28,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:04:31,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:04:34,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:35,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:38,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=921633.3333333334, ans=0.125 2023-10-02 15:04:40,520 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.396e+02 1.778e+02 2.000e+02 2.218e+02 5.015e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-02 15:04:40,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:40,681 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 15:04:43,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 15:04:46,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:04:46,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:04:47,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:50,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:04:54,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:04:55,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:04:57,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:57,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:58,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:04:58,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:04:58,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:05:00,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. 
Duration: 0.91 2023-10-02 15:05:00,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 15:05:00,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:02,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:05:02,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:02,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:02,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:05:02,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:05:03,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:05:03,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:04,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:05,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=921700.0, ans=0.1 2023-10-02 15:05:06,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:05:06,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:05:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:05:08,226 INFO [train.py:1046] (1/4) Epoch 27, batch 150, loss[loss=0.2126, simple_loss=0.2808, pruned_loss=0.07218, over 19382.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2489, pruned_loss=0.04665, over 2508962.58 frames. ], batch size: 388, lr: 3.83e-03, grad_scale: 16.0 2023-10-02 15:05:09,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:12,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:05:12,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:12,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:15,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:15,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:18,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:05:18,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:05:23,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 15:05:23,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 15:05:23,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 15:05:25,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:05:25,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:05:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:05:28,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:05:28,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:29,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:29,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:31,386 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 15:05:34,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:05:34,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=921833.3333333334, ans=0.0 2023-10-02 15:05:35,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=921900.0, ans=0.0 2023-10-02 15:05:39,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:42,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:05:44,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 15:05:47,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:05:47,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:49,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:05:49,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=921900.0, ans=0.125 2023-10-02 15:05:50,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:05:52,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:52,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 15:05:54,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:54,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 15:05:57,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=921966.6666666666, ans=0.125 2023-10-02 15:06:00,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:00,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=921966.6666666666, ans=0.95 2023-10-02 15:06:01,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:03,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:06:03,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:06:06,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:06,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-02 15:06:06,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=922033.3333333334, ans=0.04949747468305833 2023-10-02 15:06:09,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:06:12,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:06:13,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:06:16,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:06:16,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 15:06:16,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:06:16,568 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 15:06:18,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=922033.3333333334, ans=0.2 2023-10-02 15:06:20,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:06:21,856 INFO [train.py:1046] (1/4) Epoch 27, batch 200, loss[loss=0.1867, simple_loss=0.259, pruned_loss=0.0572, over 23340.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.249, pruned_loss=0.0468, over 2996613.45 frames. 
], batch size: 93, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:06:26,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:06:26,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:06:27,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 15:06:28,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-10-02 15:06:28,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:06:30,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:33,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 15:06:34,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:06:34,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:37,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:06:39,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=922166.6666666666, ans=0.0 2023-10-02 15:06:40,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:06:40,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:06:40,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:42,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=922166.6666666666, ans=0.0 2023-10-02 15:06:46,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=922166.6666666666, ans=0.125 2023-10-02 15:06:54,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=922233.3333333334, ans=0.1 2023-10-02 15:06:59,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:07:01,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:07:02,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:07:02,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 15:07:03,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:07:03,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:07:05,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:06,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:07:08,368 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.469e+02 1.865e+02 2.065e+02 2.281e+02 3.557e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 15:07:08,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:07:08,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:07:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 15:07:09,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:07:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:14,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 15:07:18,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:07:24,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:24,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:07:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:34,362 INFO [train.py:1046] (1/4) Epoch 27, batch 250, loss[loss=0.1816, simple_loss=0.2644, pruned_loss=0.04937, over 24476.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2489, pruned_loss=0.04664, over 3383807.57 frames. ], batch size: 69, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:07:34,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 15:07:35,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:35,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:07:35,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:07:35,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:07:37,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. 
Duration: 0.99 2023-10-02 15:07:39,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:07:39,057 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 15:07:40,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:42,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:07:42,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:42,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:44,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:07:44,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:46,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:07:47,906 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:07:49,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 15:07:50,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=922500.0, ans=0.0 2023-10-02 15:07:55,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=922500.0, ans=0.2 2023-10-02 15:07:58,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:08:03,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:08:03,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:08:09,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:08:09,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:08:11,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:08:12,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:08:12,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:08:12,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:08:12,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:08:15,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:08:16,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 15:08:16,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:08:18,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:08:19,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=922633.3333333334, ans=0.0 2023-10-02 15:08:20,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:08:20,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:08:20,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:08:20,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=922633.3333333334, ans=0.125 2023-10-02 15:08:21,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:08:21,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:08:24,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:08:26,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:08:27,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:08:32,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:08:32,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=922700.0, ans=0.125 2023-10-02 15:08:36,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:08:37,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:08:43,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:08:43,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:08:47,745 INFO [train.py:1046] (1/4) Epoch 27, batch 300, loss[loss=0.1755, simple_loss=0.2528, pruned_loss=0.04908, over 23417.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2464, pruned_loss=0.04549, over 3679547.05 frames. ], batch size: 106, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:08:47,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 15:08:49,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 15:08:49,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:08:51,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 15:08:52,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:08:55,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:08:55,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 15:08:59,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:09:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:03,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:09:04,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 15:09:04,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:09:05,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:09:05,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. 
Duration: 0.85 2023-10-02 15:09:05,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:10,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:09:14,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:09:14,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 15:09:17,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 15:09:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:20,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:24,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:24,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 15:09:24,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:09:24,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:09:25,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:09:25,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:09:31,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:09:31,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 15:09:32,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:09:34,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:35,794 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.868e+02 2.076e+02 2.400e+02 4.267e+02, threshold=4.152e+02, percent-clipped=1.0 2023-10-02 15:09:35,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 15:09:37,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:41,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:09:43,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:09:43,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-10-02 15:09:43,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=922966.6666666666, ans=0.1 2023-10-02 15:09:44,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=922966.6666666666, ans=0.125 2023-10-02 15:09:47,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:47,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:09:47,646 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:09:49,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:50,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:09:50,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 15:09:52,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:09:52,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:09:54,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 15:09:55,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 15:09:55,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:09:56,120 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.37 vs. limit=22.5 2023-10-02 15:09:57,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:57,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:58,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:00,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=923033.3333333334, ans=0.0 2023-10-02 15:10:02,719 INFO [train.py:1046] (1/4) Epoch 27, batch 350, loss[loss=0.1685, simple_loss=0.2517, pruned_loss=0.04262, over 24475.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2444, pruned_loss=0.04525, over 3894578.29 frames. ], batch size: 63, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:10:02,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:02,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 15:10:03,470 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.90 vs. 
limit=15.0 2023-10-02 15:10:05,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:11,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:10:13,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:13,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:15,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 15:10:18,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:18,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 15:10:18,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=923166.6666666666, ans=0.125 2023-10-02 15:10:21,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:23,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 15:10:24,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:10:27,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-10-02 15:10:29,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:10:31,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:10:31,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:10:31,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=923233.3333333334, ans=0.125 2023-10-02 15:10:31,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=923233.3333333334, ans=0.2 2023-10-02 15:10:32,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=923233.3333333334, ans=0.125 2023-10-02 15:10:33,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:10:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:10:33,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:10:33,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:33,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 15:10:34,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=12.0 2023-10-02 15:10:35,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:10:35,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:40,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=923233.3333333334, ans=0.0 2023-10-02 15:10:43,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:10:43,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:10:43,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:10:45,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:46,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=923300.0, ans=0.125 2023-10-02 15:10:50,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 15:10:50,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:56,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 15:10:56,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:10:56,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:56,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=923300.0, ans=0.2 2023-10-02 15:10:57,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 15:11:00,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:00,993 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 15:11:02,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 15:11:02,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:05,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:11:05,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 15:11:07,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:11,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 15:11:11,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=923366.6666666666, ans=0.125 2023-10-02 15:11:12,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:14,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:11:16,748 INFO [train.py:1046] (1/4) Epoch 27, batch 400, loss[loss=0.1623, simple_loss=0.2444, pruned_loss=0.04005, over 24663.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2442, pruned_loss=0.04488, over 4081115.08 frames. ], batch size: 68, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:11:16,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:11:19,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:11:20,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:11:22,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 15:11:22,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:23,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:11:26,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:11:26,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:27,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=923433.3333333334, ans=0.125 2023-10-02 15:11:29,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:31,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:33,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 15:11:33,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 15:11:33,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:33,383 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:11:35,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 15:11:35,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:39,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 15:11:39,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:11:39,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 15:11:41,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:11:41,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:41,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:11:43,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:44,711 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 15:11:44,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 15:11:48,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:50,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:50,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 15:11:52,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-10-02 15:11:53,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=923566.6666666666, ans=0.125 2023-10-02 15:11:55,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:11:55,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:11:55,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=923566.6666666666, ans=0.125 2023-10-02 15:12:03,399 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.781e+02 1.952e+02 2.127e+02 3.140e+02, threshold=3.905e+02, percent-clipped=0.0 2023-10-02 15:12:03,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 15:12:06,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:12:07,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 15:12:07,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=923633.3333333334, ans=0.07 2023-10-02 15:12:10,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:12:12,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:12:13,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. 
Duration: 0.83 2023-10-02 15:12:17,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:12:19,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:12:20,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:12:22,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:22,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 15:12:25,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:12:27,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 15:12:29,445 INFO [train.py:1046] (1/4) Epoch 27, batch 450, loss[loss=0.162, simple_loss=0.2352, pruned_loss=0.0444, over 23686.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2447, pruned_loss=0.04493, over 4223904.96 frames. ], batch size: 135, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:12:29,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:12:29,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:12:32,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. 
Duration: 0.89 2023-10-02 15:12:32,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=923766.6666666666, ans=0.125 2023-10-02 15:12:34,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:12:34,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:12:36,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:12:37,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 15:12:38,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:12:39,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:12:39,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:12:40,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 15:12:40,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:12:41,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 15:12:42,799 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.89 vs. limit=10.0 2023-10-02 15:12:43,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:12:44,574 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.42 vs. limit=22.5 2023-10-02 15:12:52,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:52,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:12:54,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 15:12:55,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 15:12:59,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:13:00,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:13:02,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:05,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 15:13:07,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:13:08,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 15:13:10,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 15:13:11,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 15:13:11,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:13,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:13,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:13:14,695 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 15:13:16,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 15:13:16,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:13:17,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:13:18,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 15:13:19,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=923966.6666666666, ans=0.125 2023-10-02 15:13:20,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:13:20,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:13:21,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:13:21,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 15:13:24,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:13:26,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:13:27,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:13:29,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 15:13:29,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=924033.3333333334, ans=0.1 2023-10-02 15:13:31,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=924033.3333333334, ans=0.0 2023-10-02 15:13:34,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 15:13:34,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 15:13:35,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 15:13:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:13:38,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=924033.3333333334, ans=0.125 2023-10-02 15:13:41,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:13:42,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:13:44,262 INFO [train.py:1046] (1/4) Epoch 27, batch 500, loss[loss=0.1624, simple_loss=0.2451, pruned_loss=0.03981, over 23363.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.245, pruned_loss=0.04503, over 4334871.82 frames. ], batch size: 93, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:13:45,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:13:46,115 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 15:13:50,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 15:13:51,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:51,668 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 15:13:53,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 15:13:53,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:55,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:14:00,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 15:14:01,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:14:03,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:14:03,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:14:05,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:07,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.60 vs. limit=6.0 2023-10-02 15:14:14,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:14:14,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:14:14,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:14:16,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:16,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 15:14:16,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:14:17,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:14:18,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:14:18,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:14:18,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:20,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 15:14:21,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 15:14:24,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:14:25,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:14:28,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=924300.0, ans=0.09899494936611666 2023-10-02 15:14:30,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 15:14:31,511 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.886e+02 2.128e+02 2.373e+02 3.584e+02, threshold=4.256e+02, percent-clipped=0.0 2023-10-02 15:14:33,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:14:33,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:35,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=924300.0, ans=0.125 2023-10-02 15:14:38,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:14:39,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:44,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:45,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.30 vs. limit=15.0 2023-10-02 15:14:45,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=924366.6666666666, ans=0.0 2023-10-02 15:14:47,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 15:14:47,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:47,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:51,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 15:14:51,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:14:52,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:57,926 INFO [train.py:1046] (1/4) Epoch 27, batch 550, loss[loss=0.1714, simple_loss=0.2402, pruned_loss=0.05127, over 23531.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2464, pruned_loss=0.04573, over 4419642.53 frames. 
], batch size: 256, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:14:58,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 15:15:01,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 15:15:02,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:02,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 15:15:02,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:15:02,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:04,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:04,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:06,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:15:06,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:15:09,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:15:09,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. 
Duration: 0.81 2023-10-02 15:15:09,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:15:13,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:13,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:17,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:15:19,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:23,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 15:15:24,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 15:15:25,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:15:30,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:15:30,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:15:31,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:15:36,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:15:36,954 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 15:15:37,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.18 vs. limit=15.0 2023-10-02 15:15:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:38,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:15:41,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:15:43,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:15:43,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:15:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:44,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 15:15:47,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 15:15:47,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:15:47,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 15:15:48,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:15:48,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:15:51,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:15:54,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:15:54,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=924633.3333333334, ans=0.1 2023-10-02 15:15:55,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:55,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 15:15:57,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:15:57,489 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:15:58,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:15:59,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 15:16:00,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:01,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:16:01,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:16:06,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=924700.0, ans=0.0 2023-10-02 15:16:07,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 15:16:10,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 15:16:10,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:16:10,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:16:10,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:12,156 INFO [train.py:1046] (1/4) Epoch 27, batch 600, loss[loss=0.1725, simple_loss=0.2409, pruned_loss=0.05208, over 23649.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2469, pruned_loss=0.04617, over 4482741.68 frames. ], batch size: 135, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:16:18,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 15:16:20,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:16:22,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 15:16:23,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:16:26,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:16:29,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:30,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 15:16:31,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:16:39,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 15:16:43,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:16:43,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:43,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:16:49,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:16:49,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:16:49,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:53,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:16:57,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:57,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:16:57,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:59,384 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.876e+02 2.029e+02 2.331e+02 3.965e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-02 15:17:02,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 15:17:09,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:17:09,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:17:13,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 15:17:15,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 15:17:16,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 15:17:16,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:17:17,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:17:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 15:17:22,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:17:25,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:17:25,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:17:26,472 INFO [train.py:1046] (1/4) Epoch 27, batch 650, loss[loss=0.1555, simple_loss=0.2362, pruned_loss=0.03736, over 24315.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.244, pruned_loss=0.04553, over 4509660.37 frames. ], batch size: 61, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:17:26,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:30,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 15:17:32,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:17:32,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=925100.0, ans=0.125 2023-10-02 15:17:38,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:17:38,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:17:41,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:44,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 15:17:46,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:17:47,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:17:51,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:17:51,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:17:53,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:53,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 15:17:56,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:57,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:17:57,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:17:57,621 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 15:17:57,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:58,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:17:59,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=925233.3333333334, ans=0.125 2023-10-02 15:18:02,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:02,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:18:03,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:05,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:18:05,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. 
Duration: 0.87 2023-10-02 15:18:07,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:18:07,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:18:07,400 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:18:08,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:18:08,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:18:09,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:18:09,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 15:18:10,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.43 vs. limit=15.0 2023-10-02 15:18:12,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=925300.0, ans=0.025 2023-10-02 15:18:13,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 15:18:13,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:13,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:18:13,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:18:13,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:18:14,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:18:20,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:21,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:18:23,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:18:25,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:25,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:18:27,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:33,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:18:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:18:34,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:18:35,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:18:41,047 INFO [train.py:1046] (1/4) Epoch 27, batch 700, loss[loss=0.1732, simple_loss=0.2371, pruned_loss=0.05469, over 23544.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2433, pruned_loss=0.04499, over 4561113.66 frames. ], batch size: 285, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:18:41,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 15:18:42,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 15:18:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 15:18:47,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:48,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:18:50,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 15:18:55,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:18:58,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 15:18:59,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:19:00,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:19:01,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:19:05,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:19:08,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 15:19:08,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:19:10,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 15:19:13,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 15:19:15,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:19:15,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:19:17,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:19:21,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 15:19:21,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 15:19:25,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:19:26,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:19:26,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 15:19:27,923 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.868e+02 2.051e+02 2.317e+02 3.367e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 15:19:30,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:19:33,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:19:34,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:19:36,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.11 vs. limit=15.0 2023-10-02 15:19:38,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=925700.0, ans=0.125 2023-10-02 15:19:41,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 15:19:42,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 15:19:45,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 15:19:46,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=925700.0, ans=0.125 2023-10-02 15:19:47,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 15:19:48,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:19:50,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:19:50,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:19:51,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:19:51,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 15:19:54,429 INFO [train.py:1046] (1/4) Epoch 27, batch 750, loss[loss=0.1779, simple_loss=0.2442, pruned_loss=0.0558, over 22921.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2434, pruned_loss=0.0451, over 4590141.85 frames. ], batch size: 322, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:19:55,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. 
Duration: 0.55 2023-10-02 15:19:55,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 15:19:55,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 15:19:56,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 15:19:56,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 15:19:57,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:19:57,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 15:19:58,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:20:00,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:20:01,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:02,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:04,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:20:04,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:20:07,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:20:09,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:20:11,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:20:12,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:14,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:14,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 15:20:16,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:20:17,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:20:18,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:20:20,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:20:22,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 15:20:22,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:20:23,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 15:20:24,871 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 15:20:24,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 15:20:24,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:20:24,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:20:26,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:20:33,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:20:33,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:20:33,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:20:35,302 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:20:36,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:37,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:20:37,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=925966.6666666666, ans=0.2 2023-10-02 15:20:39,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 15:20:40,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:20:41,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.89 vs. limit=15.0 2023-10-02 15:20:42,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 15:20:42,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:20:46,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:20:47,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 15:20:47,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:20:51,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:20:53,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:20:53,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:20:55,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:20:58,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 15:20:58,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:20:58,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:02,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:02,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:04,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:04,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:21:08,213 INFO [train.py:1046] (1/4) Epoch 27, batch 800, loss[loss=0.1652, simple_loss=0.2549, pruned_loss=0.03776, over 23964.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2448, pruned_loss=0.0452, over 4617647.07 frames. ], batch size: 80, lr: 3.82e-03, grad_scale: 32.0 2023-10-02 15:21:08,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=926100.0, ans=0.0 2023-10-02 15:21:11,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 15:21:11,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:14,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:21:14,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:14,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:15,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:17,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:19,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=926100.0, ans=0.125 2023-10-02 15:21:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:21,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:21:24,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 15:21:25,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:27,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:21:27,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:21:27,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:21:27,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 15:21:27,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:27,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=926166.6666666666, ans=0.1 2023-10-02 15:21:28,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 15:21:31,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:32,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:34,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:34,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:21:37,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:21:37,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:41,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=926233.3333333334, ans=0.1 2023-10-02 15:21:43,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:21:45,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:21:45,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 15:21:45,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=926233.3333333334, ans=0.0 2023-10-02 15:21:47,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 15:21:47,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 15:21:48,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:21:48,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:49,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 15:21:55,128 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.828e+02 2.029e+02 2.331e+02 3.011e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-02 15:21:56,504 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 15:21:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 15:21:57,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:22:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:22:04,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:22:07,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:22:09,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 15:22:09,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:22:11,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 15:22:18,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:22:21,347 INFO [train.py:1046] (1/4) Epoch 27, batch 850, loss[loss=0.1527, simple_loss=0.2349, pruned_loss=0.03526, over 24659.00 frames. 
], tot_loss[loss=0.1687, simple_loss=0.2461, pruned_loss=0.04561, over 4641787.39 frames. ], batch size: 65, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:22:21,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=926433.3333333334, ans=0.015 2023-10-02 15:22:21,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=926433.3333333334, ans=0.125 2023-10-02 15:22:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:22:22,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 15:22:23,699 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.16 vs. limit=15.0 2023-10-02 15:22:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:22:24,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:22:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 15:22:25,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:25,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=926433.3333333334, ans=0.2 2023-10-02 15:22:26,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:22:28,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:22:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:22:31,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:22:31,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 15:22:31,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 15:22:31,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 15:22:32,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:22:32,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:22:34,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.74 vs. limit=22.5 2023-10-02 15:22:35,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:22:35,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:22:35,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 15:22:39,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:39,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:22:40,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 15:22:44,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 15:22:47,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.70 vs. limit=10.0 2023-10-02 15:22:47,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:49,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 15:22:53,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 15:22:55,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 15:22:56,515 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 15:22:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:22:56,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:22:56,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:22:58,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=926566.6666666666, ans=0.2 2023-10-02 15:23:00,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:02,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:02,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 15:23:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:23:04,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:23:06,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:23:06,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:23:08,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:23:09,164 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:23:10,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 15:23:10,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 15:23:14,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:23:14,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:23:16,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:23:16,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:23:16,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:23:20,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:24,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:23:25,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:23:26,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:23:26,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:23:34,946 INFO [train.py:1046] (1/4) Epoch 27, batch 900, loss[loss=0.1745, simple_loss=0.2561, pruned_loss=0.04649, over 24413.00 frames. 
], tot_loss[loss=0.1685, simple_loss=0.2463, pruned_loss=0.04534, over 4675228.25 frames. ], batch size: 77, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:23:35,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:23:35,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:23:36,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 15:23:37,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:23:37,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:23:39,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 15:23:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:23:48,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:23:48,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 15:23:51,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:23:52,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. 
Duration: 0.95 2023-10-02 15:23:53,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 15:23:55,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:23:55,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:23:56,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:23:56,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:24:02,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=926833.3333333334, ans=0.1 2023-10-02 15:24:05,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:05,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:24:06,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:24:07,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:24:10,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=926900.0, ans=0.0 2023-10-02 15:24:13,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. 
Duration: 0.47 2023-10-02 15:24:13,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=926900.0, ans=0.0 2023-10-02 15:24:14,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:24:17,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=926966.6666666666, ans=0.125 2023-10-02 15:24:19,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:24:20,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:24:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 15:24:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 15:24:24,177 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.889e+02 2.142e+02 2.586e+02 4.212e+02, threshold=4.284e+02, percent-clipped=1.0 2023-10-02 15:24:27,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:24:27,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:24:27,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:24:34,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:24:34,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:24:36,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 15:24:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:24:39,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 15:24:40,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:24:40,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:43,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:24:43,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:24:46,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 15:24:47,552 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 15:24:49,475 INFO [train.py:1046] (1/4) Epoch 27, batch 950, loss[loss=0.1718, simple_loss=0.2558, pruned_loss=0.04389, over 24657.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.247, pruned_loss=0.04563, over 4678574.75 frames. 
], batch size: 68, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:24:49,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:24:49,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 15:24:51,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:54,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 15:24:59,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:00,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:02,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:02,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:25:02,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=927100.0, ans=0.125 2023-10-02 15:25:05,011 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 15:25:07,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=927166.6666666666, ans=0.2 2023-10-02 15:25:09,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:25:10,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:25:10,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:10,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:25:10,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 15:25:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:25:13,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:14,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 15:25:15,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:25:19,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:19,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:25:19,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:25:20,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-10-02 15:25:24,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 15:25:24,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=927233.3333333334, ans=0.125 2023-10-02 15:25:25,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:25:25,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:25:31,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:25:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:34,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 15:25:36,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 15:25:36,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:25:37,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:25:38,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:38,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 15:25:41,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 15:25:43,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:25:44,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:25:45,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:45,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 15:25:45,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:45,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:25:47,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 15:25:51,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:25:52,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:52,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=927366.6666666666, ans=0.0 2023-10-02 15:25:55,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:25:57,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 15:25:57,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 15:26:01,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:26:04,262 INFO [train.py:1046] (1/4) Epoch 27, batch 1000, loss[loss=0.1648, simple_loss=0.2471, pruned_loss=0.04125, over 24653.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2454, pruned_loss=0.04554, over 4667643.43 frames. ], batch size: 65, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:26:07,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 15:26:07,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:07,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=927433.3333333334, ans=0.1 2023-10-02 15:26:11,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:26:12,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 15:26:12,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 15:26:15,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:26:15,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:26:16,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:20,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 15:26:24,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 15:26:25,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.80 vs. limit=10.0 2023-10-02 15:26:26,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 15:26:26,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:26:28,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=927500.0, ans=0.1 2023-10-02 15:26:29,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 15:26:29,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 15:26:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 15:26:32,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:26:33,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:33,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=927566.6666666666, ans=0.125 2023-10-02 15:26:38,756 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.47 vs. limit=8.0 2023-10-02 15:26:40,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:42,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:26:43,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:44,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:44,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 15:26:44,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:26:44,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:26:46,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:46,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. 
Duration: 0.982 2023-10-02 15:26:49,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 15:26:49,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=927633.3333333334, ans=0.015 2023-10-02 15:26:51,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 15:26:53,526 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.843e+02 1.984e+02 2.193e+02 2.939e+02, threshold=3.969e+02, percent-clipped=0.0 2023-10-02 15:26:54,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 15:26:56,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:27:01,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:02,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:27:02,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:02,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=927700.0, ans=0.125 2023-10-02 15:27:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:27:05,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-10-02 15:27:05,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:27:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 15:27:06,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 15:27:09,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:27:09,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:27:09,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=927700.0, ans=0.1 2023-10-02 15:27:12,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:27:13,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:27:15,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:27:18,233 INFO [train.py:1046] (1/4) Epoch 27, batch 1050, loss[loss=0.1588, simple_loss=0.2503, pruned_loss=0.03369, over 24645.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2448, pruned_loss=0.04483, over 4684676.78 frames. ], batch size: 73, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:27:18,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 15:27:19,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:27:21,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:27:21,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:24,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:27:25,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:27:28,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:27:30,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:27:30,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:27:31,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:27:31,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:27:33,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 15:27:33,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 15:27:33,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 15:27:36,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:27:36,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 15:27:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:27:41,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:43,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:27:43,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:27:46,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 15:27:46,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 15:27:48,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:27:52,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 15:27:56,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. 
Duration: 0.73 2023-10-02 15:27:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:00,213 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.45 vs. limit=15.0 2023-10-02 15:28:00,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:28:02,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:28:04,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:28:04,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:28:06,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:28:07,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=927966.6666666666, ans=0.125 2023-10-02 15:28:08,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=927966.6666666666, ans=0.125 2023-10-02 15:28:10,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. 
Duration: 0.86 2023-10-02 15:28:10,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=927966.6666666666, ans=0.2 2023-10-02 15:28:11,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.13 vs. limit=15.0 2023-10-02 15:28:12,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 15:28:12,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 15:28:12,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:28:14,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:28:15,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 15:28:18,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:28:20,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:28:20,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:28:20,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:28:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 15:28:24,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:24,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 15:28:26,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:28:26,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 15:28:26,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 15:28:28,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:28:29,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=928033.3333333334, ans=0.125 2023-10-02 15:28:31,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:28:32,694 INFO [train.py:1046] (1/4) Epoch 27, batch 1100, loss[loss=0.1761, simple_loss=0.2566, pruned_loss=0.04781, over 24063.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2439, pruned_loss=0.04468, over 4675255.35 frames. 
], batch size: 80, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:28:32,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=928100.0, ans=0.125 2023-10-02 15:28:35,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=928100.0, ans=0.5 2023-10-02 15:28:36,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:28:39,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:28:41,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:28:42,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:28:42,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 15:28:42,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=928100.0, ans=0.0 2023-10-02 15:28:43,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:28:44,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.71 vs. limit=12.0 2023-10-02 15:28:45,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 15:28:46,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:28:50,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:28:50,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 15:28:51,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:28:53,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:28:53,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:28:56,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:28:59,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:29:01,891 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.78 vs. limit=22.5 2023-10-02 15:29:02,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:29:06,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 15:29:06,963 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. 
Duration: 0.896 2023-10-02 15:29:07,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:08,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=928233.3333333334, ans=0.0 2023-10-02 15:29:09,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:29:09,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:29:11,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 15:29:11,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:29:11,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:29:12,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:29:12,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:12,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 15:29:19,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 15:29:20,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 15:29:21,250 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.794e+02 1.965e+02 2.259e+02 3.177e+02, threshold=3.930e+02, percent-clipped=0.0 2023-10-02 15:29:21,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:29:25,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.00 vs. limit=15.0 2023-10-02 15:29:27,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:29:31,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 15:29:31,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:29:33,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:29:36,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:29:37,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 15:29:39,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 15:29:39,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:29:40,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 15:29:40,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:29:42,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 15:29:43,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:29:43,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:29:43,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:29:46,244 INFO [train.py:1046] (1/4) Epoch 27, batch 1150, loss[loss=0.1653, simple_loss=0.2397, pruned_loss=0.04546, over 23397.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2448, pruned_loss=0.04512, over 4682871.01 frames. ], batch size: 119, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:29:47,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:29:48,307 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.69 vs. limit=15.0 2023-10-02 15:29:51,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:29:52,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:29:54,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:29:54,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 15:29:54,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:29:57,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 15:29:58,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:29:58,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:30:04,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 15:30:06,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:30:06,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=928500.0, ans=0.125 2023-10-02 15:30:09,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:30:09,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:30:10,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 15:30:10,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:30:10,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:30:15,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 15:30:15,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:30:16,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:30:24,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=928566.6666666666, ans=0.1 2023-10-02 15:30:27,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:34,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:34,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 15:30:34,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:34,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:30:40,518 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 15:30:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:48,734 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 15:30:51,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:30:53,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:30:53,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:30:53,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:30:57,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:01,339 INFO [train.py:1046] (1/4) Epoch 27, batch 1200, loss[loss=0.1691, simple_loss=0.2442, pruned_loss=0.04697, over 23378.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2457, pruned_loss=0.04533, over 4694800.57 frames. ], batch size: 285, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:31:02,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:31:02,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 15:31:04,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:04,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:04,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:31:08,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:31:10,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:31:10,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=928766.6666666666, ans=0.125 2023-10-02 15:31:10,857 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.54 vs. limit=22.5 2023-10-02 15:31:11,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:11,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:31:14,306 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 15:31:17,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 15:31:18,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 15:31:21,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:31:24,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:26,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:31:26,193 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 15:31:27,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:36,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:31:36,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:31:36,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 15:31:36,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:31:37,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=928900.0, ans=0.125 2023-10-02 15:31:40,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 15:31:42,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. 
Duration: 0.88 2023-10-02 15:31:42,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:42,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=928900.0, ans=0.125 2023-10-02 15:31:43,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:31:45,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:31:46,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:31:48,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.904e+02 2.120e+02 2.564e+02 3.915e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-02 15:31:49,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:49,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:31:49,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:31:50,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 15:31:50,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:31:51,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 15:31:51,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:31:52,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=928966.6666666666, ans=0.125 2023-10-02 15:31:53,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=928966.6666666666, ans=0.0 2023-10-02 15:31:54,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:54,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:31:57,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:31:59,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:32:02,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 15:32:06,938 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 15:32:08,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:32:09,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 15:32:10,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=929033.3333333334, ans=0.0 2023-10-02 15:32:11,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:32:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:32:12,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=15.0 2023-10-02 15:32:13,883 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.77 vs. limit=10.0 2023-10-02 15:32:14,400 INFO [train.py:1046] (1/4) Epoch 27, batch 1250, loss[loss=0.1759, simple_loss=0.2603, pruned_loss=0.04572, over 24036.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2465, pruned_loss=0.0456, over 4692293.31 frames. ], batch size: 80, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:32:14,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 15:32:18,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:32:20,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:21,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 15:32:22,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 15:32:24,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=929100.0, ans=0.125 2023-10-02 15:32:25,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:32:29,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:32:31,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:32,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:32:32,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:32:35,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:32:38,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 15:32:38,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:32:38,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:32:40,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:32:42,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:32:43,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:32:45,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:32:49,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 15:32:50,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:32:53,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:32:53,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 15:32:55,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:55,142 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 15:32:55,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:55,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:55,791 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.45 vs. limit=15.0 2023-10-02 15:32:58,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.87 vs. 
limit=22.5 2023-10-02 15:32:59,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:33:01,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=929300.0, ans=0.125 2023-10-02 15:33:01,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=929300.0, ans=0.0 2023-10-02 15:33:04,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:33:04,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:33:04,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 15:33:06,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 15:33:06,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 15:33:09,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:33:10,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 15:33:10,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:33:13,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-10-02 15:33:13,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:33:14,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 15:33:14,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:33:15,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:33:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:33:17,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:33:19,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 15:33:19,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-10-02 15:33:22,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:33:23,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:33:24,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 15:33:26,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:33:27,267 INFO [train.py:1046] (1/4) Epoch 27, batch 1300, loss[loss=0.1723, simple_loss=0.2583, pruned_loss=0.04311, over 24006.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2461, pruned_loss=0.04535, over 4700540.84 frames. ], batch size: 80, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:33:29,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:33:29,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 15:33:35,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:33:36,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:33:38,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:33:39,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:33:41,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:33:42,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 15:33:46,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 15:33:47,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:33:49,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 15:33:51,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=929500.0, ans=0.125 2023-10-02 15:33:53,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:33:56,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:33:58,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:33:59,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:34:01,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:01,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:34:03,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:34:03,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-10-02 15:34:04,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.19 vs. limit=22.5 2023-10-02 15:34:09,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:34:09,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:34:09,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 15:34:10,343 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.64 vs. limit=15.0 2023-10-02 15:34:11,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:34:12,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:34:15,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:34:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 15:34:17,772 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.852e+02 2.104e+02 2.381e+02 3.588e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 15:34:17,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:34:17,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 15:34:19,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:34:20,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:34:20,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:34:24,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 15:34:24,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=929633.3333333334, ans=0.0 2023-10-02 15:34:25,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 15:34:26,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 15:34:31,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:34:33,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 15:34:34,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 15:34:37,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=929700.0, ans=0.125 2023-10-02 15:34:41,790 INFO [train.py:1046] (1/4) Epoch 27, batch 1350, loss[loss=0.1723, simple_loss=0.237, pruned_loss=0.05381, over 23852.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2449, pruned_loss=0.04517, over 4705664.50 frames. ], batch size: 179, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:34:41,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 15:34:43,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=929766.6666666666, ans=0.07 2023-10-02 15:34:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:34:47,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:34:50,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:50,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:34:53,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:34:53,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:34:57,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 15:35:00,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 15:35:01,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.86 vs. limit=15.0 2023-10-02 15:35:02,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:35:02,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:35:05,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 15:35:07,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:35:08,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:35:08,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 15:35:08,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=929833.3333333334, ans=0.1 2023-10-02 15:35:09,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 15:35:10,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=929900.0, ans=0.0 2023-10-02 15:35:11,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-10-02 15:35:12,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:12,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 15:35:22,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:31,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:31,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:31,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 15:35:35,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:37,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 15:35:38,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:35:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:35:41,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:35:42,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-10-02 15:35:45,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:35:49,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 15:35:51,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 15:35:55,185 INFO [train.py:1046] (1/4) Epoch 27, batch 1400, loss[loss=0.1483, simple_loss=0.1924, pruned_loss=0.05211, over 19349.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2432, pruned_loss=0.04487, over 4706450.92 frames. ], batch size: 388, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:35:55,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 15:35:57,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:59,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:36:00,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:36:07,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 15:36:07,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 15:36:13,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.11 vs. 
limit=15.0 2023-10-02 15:36:16,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:36:18,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:36:19,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:36:21,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:36:26,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:36:26,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 15:36:34,338 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.31 vs. limit=15.0 2023-10-02 15:36:36,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:36,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 15:36:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:36:42,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 15:36:44,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:36:44,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:36:45,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:36:45,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:36:46,666 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.831e+02 2.061e+02 2.226e+02 3.360e+02, threshold=4.122e+02, percent-clipped=0.0 2023-10-02 15:36:46,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:36:48,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 15:36:48,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:36:51,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:51,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=930300.0, ans=0.125 2023-10-02 15:36:52,975 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.39 vs. 
limit=15.0 2023-10-02 15:36:55,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:36:57,383 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.69 vs. limit=10.0 2023-10-02 15:37:01,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 15:37:01,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=930366.6666666666, ans=0.0 2023-10-02 15:37:02,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:37:03,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:37:05,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 15:37:06,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:08,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=930433.3333333334, ans=0.0 2023-10-02 15:37:09,952 INFO [train.py:1046] (1/4) Epoch 27, batch 1450, loss[loss=0.1763, simple_loss=0.2531, pruned_loss=0.04973, over 23282.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2436, pruned_loss=0.04498, over 4711745.98 frames. ], batch size: 105, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:37:10,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 15:37:14,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:37:16,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:37:16,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:16,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 15:37:21,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:22,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:37:22,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:37:23,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 15:37:25,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:37:26,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 15:37:26,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:26,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 15:37:26,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 15:37:28,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:37:28,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:37:29,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 15:37:29,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:30,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:37:31,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:31,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=930500.0, ans=0.0 2023-10-02 15:37:34,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:39,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:37:39,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:37:42,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:37:42,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:43,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:43,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:37:43,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:45,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:37:49,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 15:37:51,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:37:53,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=930633.3333333334, ans=0.2 2023-10-02 15:37:54,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 15:37:56,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:37:57,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:37:58,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:38:00,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 15:38:01,260 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.11 vs. limit=15.0 2023-10-02 15:38:05,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:05,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 15:38:07,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 15:38:09,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:12,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:38:13,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:38:15,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 15:38:18,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 15:38:18,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 15:38:19,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 15:38:20,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:38:23,503 INFO [train.py:1046] (1/4) Epoch 27, batch 1500, loss[loss=0.1779, simple_loss=0.2603, pruned_loss=0.0478, over 24609.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2443, pruned_loss=0.04491, over 4717749.46 frames. ], batch size: 68, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:38:29,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 15:38:29,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:38:29,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:38:30,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:31,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:38:31,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:38:33,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 15:38:35,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:38:35,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 15:38:35,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:38:36,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:38:38,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:38:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:38:44,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:38:44,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 15:38:45,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:38:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:38:47,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:51,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 15:38:55,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 15:38:56,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:38:56,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 15:38:58,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:38:59,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:39:00,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:39:01,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:01,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 15:39:03,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:39:03,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:39:03,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 15:39:04,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:39:11,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:39:11,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. 
Duration: 0.94 2023-10-02 15:39:13,896 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.907e+02 2.123e+02 2.554e+02 3.367e+02, threshold=4.246e+02, percent-clipped=0.0 2023-10-02 15:39:15,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:39:17,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:39:21,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=931033.3333333334, ans=0.1 2023-10-02 15:39:22,131 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 15:39:22,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:22,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 15:39:23,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:39:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:39:26,364 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 15:39:27,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:39:30,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. 
Duration: 0.65 2023-10-02 15:39:33,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:36,352 INFO [train.py:1046] (1/4) Epoch 27, batch 1550, loss[loss=0.1544, simple_loss=0.2244, pruned_loss=0.04217, over 23540.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2451, pruned_loss=0.04525, over 4722375.59 frames. ], batch size: 134, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:39:36,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:39:36,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:38,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:39:38,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:38,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:39:41,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 15:39:41,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 15:39:41,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:39:43,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. 
Duration: 0.84 2023-10-02 15:39:43,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 15:39:44,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:46,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:46,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:39:46,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:39:48,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:49,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:50,835 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 15:39:52,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:39:52,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:39:53,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:39:54,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 15:39:54,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 15:39:56,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:56,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 15:39:56,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=931166.6666666666, ans=0.0 2023-10-02 15:39:57,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 15:39:57,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 15:39:59,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:00,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:03,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:40:05,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 15:40:05,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-02 15:40:07,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=931233.3333333334, ans=0.1 2023-10-02 15:40:11,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=931233.3333333334, ans=0.025 2023-10-02 15:40:14,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:15,794 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:40:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:40:18,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:40:18,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:40:19,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.03 vs. limit=15.0 2023-10-02 15:40:20,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 15:40:26,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:40:28,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:29,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:40:31,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:40:32,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:32,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 15:40:33,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:40:35,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:40:35,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:35,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 15:40:35,296 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 15:40:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:40:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 15:40:49,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:40:50,809 INFO [train.py:1046] (1/4) Epoch 27, batch 1600, loss[loss=0.1705, simple_loss=0.2557, pruned_loss=0.0427, over 24046.00 frames. 
], tot_loss[loss=0.1685, simple_loss=0.2464, pruned_loss=0.04528, over 4715489.76 frames. ], batch size: 80, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:40:50,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:51,423 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=12.0 2023-10-02 15:40:52,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 15:40:52,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:40:53,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:40:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:40:53,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:40:54,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:40:59,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:40:59,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 15:41:00,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. 
Duration: 0.92 2023-10-02 15:41:01,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 15:41:03,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:41:04,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 15:41:05,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:41:09,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:41:14,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:41:18,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 15:41:18,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=931566.6666666666, ans=0.125 2023-10-02 15:41:19,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:41:21,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 15:41:21,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:21,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. 
Duration: 0.96 2023-10-02 15:41:27,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 15:41:34,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:41:35,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 15:41:35,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:41:35,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:41:35,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:41:38,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 15:41:39,052 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:41:41,438 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.847e+02 2.065e+02 2.421e+02 3.334e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 15:41:41,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 15:41:43,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:41:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 15:41:44,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:45,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:41:46,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:41:46,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=931633.3333333334, ans=0.1 2023-10-02 15:41:47,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:41:49,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:41:54,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:56,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:41:57,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=931700.0, ans=0.125 2023-10-02 15:41:58,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 15:41:58,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 15:41:59,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 15:42:03,088 INFO [train.py:1046] (1/4) Epoch 27, batch 1650, loss[loss=0.157, simple_loss=0.2503, pruned_loss=0.03187, over 24444.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2469, pruned_loss=0.0456, over 4712240.60 frames. ], batch size: 69, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:42:04,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:05,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:42:07,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:42:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 15:42:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 15:42:07,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 15:42:09,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 15:42:12,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:42:12,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:42:14,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:42:14,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:42:17,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:18,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 15:42:19,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:42:19,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:42:19,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:42:20,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:42:21,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 15:42:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 15:42:27,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 15:42:29,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=931833.3333333334, ans=0.125 2023-10-02 15:42:29,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=931833.3333333334, ans=0.2 2023-10-02 15:42:30,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:42:37,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 15:42:37,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:39,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=931900.0, ans=0.0 2023-10-02 15:42:41,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 15:42:41,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=931900.0, ans=0.2 2023-10-02 15:42:43,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:42:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:42:46,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=931966.6666666666, ans=0.2 2023-10-02 15:42:47,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 15:42:47,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:42:49,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:42:49,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:52,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=931966.6666666666, ans=0.125 2023-10-02 15:42:53,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:53,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:53,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:42:53,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:42:55,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:42:56,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:42:58,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 15:42:59,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 15:43:01,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:43:01,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 15:43:02,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 15:43:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 15:43:02,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:04,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:43:05,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:43:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:43:05,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 15:43:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:43:11,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 15:43:12,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:43:15,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 15:43:18,087 INFO [train.py:1046] (1/4) Epoch 27, batch 1700, loss[loss=0.1865, simple_loss=0.2703, pruned_loss=0.05131, over 24387.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.247, pruned_loss=0.04538, over 4716758.32 frames. ], batch size: 77, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:43:18,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:43:18,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:43:18,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 15:43:18,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:43:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:43:18,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:43:19,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=932100.0, ans=0.125 2023-10-02 15:43:21,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 15:43:21,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:43:22,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 15:43:25,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:43:27,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=932100.0, ans=0.0 2023-10-02 15:43:27,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.31 vs. limit=15.0 2023-10-02 15:43:33,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:43:36,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:43:43,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:43:43,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:43:45,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:43:45,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:43:47,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. 
Duration: 0.94 2023-10-02 15:43:49,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:43:49,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:50,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:43:51,312 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.22 vs. limit=15.0 2023-10-02 15:43:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:43:53,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 15:43:54,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 15:43:56,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:59,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 15:43:59,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:44:06,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:07,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:44:09,561 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.921e+02 2.104e+02 2.385e+02 3.447e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 15:44:09,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:44:11,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=932300.0, ans=0.0 2023-10-02 15:44:12,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:44:12,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 15:44:12,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:44:14,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:14,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 15:44:14,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=932300.0, ans=0.125 2023-10-02 15:44:15,215 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.55 vs. limit=22.5 2023-10-02 15:44:15,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:44:15,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:15,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:15,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:18,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:18,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:44:19,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:19,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:44:19,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:25,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:44:25,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 15:44:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:28,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:44:30,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=932433.3333333334, ans=0.2 2023-10-02 15:44:31,724 INFO [train.py:1046] (1/4) Epoch 27, batch 1750, loss[loss=0.1407, simple_loss=0.22, pruned_loss=0.03069, over 18924.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2449, pruned_loss=0.04479, over 4714083.34 frames. ], batch size: 41, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:44:31,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 15:44:34,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=932433.3333333334, ans=0.125 2023-10-02 15:44:37,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:38,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:44:38,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 15:44:40,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:44,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:44:44,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:44:48,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 15:44:49,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:51,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 15:44:51,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:52,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:44:54,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=932500.0, ans=0.125 2023-10-02 15:44:55,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:44:55,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 15:44:58,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:44:58,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 15:45:03,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=932566.6666666666, ans=0.125 2023-10-02 15:45:05,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 15:45:08,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:08,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:45:14,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:14,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:45:16,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:45:17,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:20,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:45:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:45:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 15:45:23,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:45:24,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 15:45:26,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 15:45:28,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:45:29,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:45:32,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:45:33,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:45:33,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:33,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=932700.0, ans=0.5 2023-10-02 15:45:35,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:45:36,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=932700.0, ans=0.0 2023-10-02 15:45:37,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:45:38,185 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:45:39,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 15:45:40,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=932700.0, ans=0.0 2023-10-02 15:45:40,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=932700.0, ans=0.125 2023-10-02 15:45:41,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:45:43,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 15:45:43,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:44,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:45:44,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:45:44,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:45:46,223 INFO [train.py:1046] (1/4) Epoch 27, batch 1800, loss[loss=0.1798, simple_loss=0.2531, pruned_loss=0.05322, over 23779.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2444, pruned_loss=0.04468, over 4701723.11 frames. ], batch size: 164, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:45:46,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:45:46,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 15:45:47,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=932766.6666666666, ans=0.2 2023-10-02 15:45:47,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=932766.6666666666, ans=0.125 2023-10-02 15:45:50,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:45:51,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:53,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:45:55,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:59,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 15:45:59,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:46:02,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:04,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:04,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:46:04,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:46:07,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:46:07,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 15:46:08,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:12,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:13,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=932833.3333333334, ans=0.0 2023-10-02 15:46:15,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.93 vs. limit=15.0 2023-10-02 15:46:15,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 15:46:16,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=932900.0, ans=0.125 2023-10-02 15:46:18,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 15:46:18,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 15:46:20,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:46:21,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:21,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:46:21,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:46:28,279 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 15:46:29,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:46:31,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:31,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=932966.6666666666, ans=0.125 2023-10-02 15:46:33,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 15:46:34,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 15:46:34,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:46:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 15:46:37,034 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.417e+02 1.793e+02 1.976e+02 2.310e+02 4.121e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-02 15:46:37,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:46:41,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 15:46:42,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.16 vs. limit=15.0 2023-10-02 15:46:47,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:46:49,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 15:46:50,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:46:50,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:50,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:46:50,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. 
Duration: 0.44 2023-10-02 15:46:51,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=933033.3333333334, ans=0.09899494936611666 2023-10-02 15:46:53,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:46:53,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:46:56,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 15:46:56,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:57,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:46:57,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:46:57,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:58,963 INFO [train.py:1046] (1/4) Epoch 27, batch 1850, loss[loss=0.1734, simple_loss=0.2517, pruned_loss=0.04755, over 23183.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2441, pruned_loss=0.04487, over 4704079.78 frames. ], batch size: 105, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:47:00,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:47:00,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 15:47:03,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:47:03,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:47:05,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:47:05,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=933100.0, ans=0.0 2023-10-02 15:47:06,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:47:13,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:47:13,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 15:47:17,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 15:47:18,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=933166.6666666666, ans=0.125 2023-10-02 15:47:19,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 15:47:22,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:47:22,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. 
Duration: 0.91 2023-10-02 15:47:22,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 15:47:23,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=933166.6666666666, ans=15.0 2023-10-02 15:47:28,873 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.14 vs. limit=22.5 2023-10-02 15:47:32,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:47:33,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 15:47:34,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=933233.3333333334, ans=0.125 2023-10-02 15:47:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:47:35,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:47:39,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 15:47:40,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:47:41,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 15:47:41,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:47:45,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:47:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:47:49,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:47:53,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:47:54,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:47:54,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:47:56,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:47:57,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:47:58,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.14 vs. limit=12.0 2023-10-02 15:47:59,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 15:47:59,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:48:03,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:48:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:48:03,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 15:48:03,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 15:48:06,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 15:48:06,087 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 15:48:08,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:48:09,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:48:09,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:48:09,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:09,500 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 15:48:09,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 15:48:10,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:12,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:48:14,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:48:15,791 INFO [train.py:1046] (1/4) Epoch 27, batch 1900, loss[loss=0.1726, simple_loss=0.248, pruned_loss=0.04861, over 23904.00 frames. ], tot_loss[loss=0.168, simple_loss=0.245, pruned_loss=0.04555, over 4690051.74 frames. ], batch size: 195, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:48:15,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:48:15,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 15:48:16,542 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.46 vs. limit=15.0 2023-10-02 15:48:19,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:19,164 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 15:48:19,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:48:20,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:48:26,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:48:27,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:48:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 15:48:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 15:48:30,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=933500.0, ans=10.0 2023-10-02 15:48:31,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:48:31,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:48:31,664 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 15:48:31,694 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 15:48:35,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 15:48:37,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:48:41,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. 
Duration: 0.65 2023-10-02 15:48:43,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 15:48:54,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 15:48:57,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 15:48:57,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:58,482 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 15:48:58,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 15:48:58,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 15:49:00,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 15:49:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:05,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 15:49:06,699 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.841e+02 1.943e+02 2.107e+02 2.840e+02, threshold=3.886e+02, percent-clipped=0.0 2023-10-02 15:49:06,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 15:49:10,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:49:10,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 15:49:12,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:49:15,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 15:49:15,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:49:21,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=933700.0, ans=0.0 2023-10-02 15:49:22,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:49:22,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:49:22,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:49:23,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:49:25,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:49:25,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-10-02 15:49:26,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:49:29,327 INFO [train.py:1046] (1/4) Epoch 27, batch 1950, loss[loss=0.1499, simple_loss=0.2259, pruned_loss=0.03692, over 24323.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2464, pruned_loss=0.04556, over 4701009.47 frames. ], batch size: 56, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:49:30,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:49:30,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:49:33,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:49:33,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:49:33,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:49:34,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:49:38,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:49:39,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:49:40,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:49:40,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:49:40,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=933766.6666666666, ans=0.1 2023-10-02 15:49:41,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 15:49:43,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:49:43,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:44,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:46,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:49:46,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:49:46,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:48,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:49:51,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:49:51,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 15:49:51,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:49:53,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:56,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:57,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:49:57,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:49:57,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:49:57,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 15:49:58,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.05 vs. limit=6.0 2023-10-02 15:49:59,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:49:59,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:50:00,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:50:03,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:50:04,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:50:08,307 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.78 vs. limit=15.0 2023-10-02 15:50:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:50:11,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:50:11,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:50:12,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 15:50:12,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:50:16,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:50:17,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:50:17,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 15:50:26,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:28,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:29,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:31,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:34,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:50:36,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:36,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 15:50:36,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:50:36,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:50:37,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 15:50:38,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=934033.3333333334, ans=0.0 2023-10-02 15:50:39,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 15:50:41,904 INFO [train.py:1046] (1/4) Epoch 27, batch 2000, loss[loss=0.1448, simple_loss=0.2237, pruned_loss=0.03291, over 20999.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2474, pruned_loss=0.04576, over 4704560.44 frames. ], batch size: 46, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:50:42,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:50:44,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:50:44,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:50:45,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:50:46,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:49,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 15:50:50,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:50:53,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:50:55,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 15:50:57,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 15:50:57,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:51:00,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:51:01,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 15:51:03,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:05,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 15:51:05,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:51:07,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 15:51:07,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:51:11,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:51:11,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. 
Duration: 0.636375 2023-10-02 15:51:13,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:13,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:51:13,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:51:15,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 15:51:15,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=934233.3333333334, ans=0.125 2023-10-02 15:51:16,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=934233.3333333334, ans=0.125 2023-10-02 15:51:17,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 15:51:17,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:51:17,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:18,589 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.15 vs. limit=6.0 2023-10-02 15:51:24,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 15:51:25,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:51:25,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:51:26,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:51:28,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:51:28,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:28,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:51:28,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:30,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:33,410 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.846e+02 2.049e+02 2.214e+02 2.926e+02, threshold=4.097e+02, percent-clipped=0.0 2023-10-02 15:51:33,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:51:33,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-10-02 15:51:37,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:51:38,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:43,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:51:47,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=934366.6666666666, ans=0.125 2023-10-02 15:51:48,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:51,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:51:51,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:51,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=934366.6666666666, ans=0.125 2023-10-02 15:51:52,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:51:52,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 15:51:54,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:54,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=934433.3333333334, ans=0.0 2023-10-02 15:51:55,654 INFO [train.py:1046] (1/4) Epoch 27, batch 2050, loss[loss=0.1495, simple_loss=0.208, pruned_loss=0.04552, over 22714.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2459, pruned_loss=0.04524, over 4706198.41 frames. ], batch size: 322, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:51:55,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:59,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:52:00,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:52:04,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:52:07,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:52:07,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:52:07,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:52:08,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-10-02 15:52:08,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:52:10,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:52:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:52:20,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:52:20,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:52:21,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 15:52:24,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:52:25,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 15:52:27,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:52:27,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=934566.6666666666, ans=0.125 2023-10-02 15:52:28,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:52:30,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 15:52:31,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:52:32,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:52:33,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:52:34,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:52:34,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:52:34,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=934566.6666666666, ans=0.035 2023-10-02 15:52:37,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:52:40,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:52:41,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:52:43,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:52:47,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 15:52:48,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=934633.3333333334, ans=0.0 2023-10-02 15:52:50,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=934633.3333333334, ans=0.2 2023-10-02 15:52:52,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:52:53,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 15:52:54,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=934700.0, ans=0.125 2023-10-02 15:52:54,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=934700.0, ans=0.95 2023-10-02 15:52:58,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:52:58,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:53:02,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=934700.0, ans=6.0 2023-10-02 15:53:02,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:53:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 15:53:07,153 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. 
Duration: 0.952 2023-10-02 15:53:07,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:08,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:53:09,820 INFO [train.py:1046] (1/4) Epoch 27, batch 2100, loss[loss=0.1622, simple_loss=0.2308, pruned_loss=0.04679, over 23453.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2438, pruned_loss=0.04463, over 4697162.89 frames. ], batch size: 285, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:53:09,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:53:11,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:53:11,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 15:53:11,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 15:53:11,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=934766.6666666666, ans=0.0 2023-10-02 15:53:14,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:53:16,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=934766.6666666666, ans=0.2 2023-10-02 15:53:17,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 15:53:19,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:53:20,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:20,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:53:20,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 15:53:20,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:53:22,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 15:53:22,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 15:53:25,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:26,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:53:26,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 15:53:26,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:53:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. 
Duration: 0.68 2023-10-02 15:53:31,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:53:34,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=934833.3333333334, ans=0.5 2023-10-02 15:53:35,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:53:35,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:53:39,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:53:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 15:53:40,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:40,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:53:42,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 15:53:43,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:43,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 15:53:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. 
Duration: 0.96 2023-10-02 15:53:44,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 15:53:47,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:53:49,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:53:51,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:53:52,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:53:54,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:54,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:54,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 15:53:54,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:54,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:54,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=934966.6666666666, ans=0.125 2023-10-02 15:53:55,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:53:55,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 15:53:57,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 15:53:57,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 15:54:00,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:54:00,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=934966.6666666666, ans=0.125 2023-10-02 15:54:02,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.820e+02 2.011e+02 2.406e+02 3.767e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-02 15:54:02,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:54:02,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 15:54:03,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=934966.6666666666, ans=0.0 2023-10-02 15:54:05,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=934966.6666666666, ans=0.125 2023-10-02 15:54:08,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:10,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 15:54:12,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:54:12,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:54:12,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 15:54:12,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:54:13,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:54:17,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:54:17,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:19,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 15:54:20,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 15:54:20,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:23,078 INFO [train.py:1046] (1/4) Epoch 27, batch 2150, loss[loss=0.161, simple_loss=0.2503, pruned_loss=0.0359, over 24531.00 frames. 
], tot_loss[loss=0.1664, simple_loss=0.2438, pruned_loss=0.0445, over 4703906.36 frames. ], batch size: 71, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:54:23,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:54:23,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:54:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:54:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:54:28,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:54:30,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:32,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:32,496 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:54:34,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:54:34,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:34,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 15:54:37,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:39,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:54:39,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:54:39,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=935166.6666666666, ans=10.0 2023-10-02 15:54:40,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=935166.6666666666, ans=0.0 2023-10-02 15:54:43,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:43,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 15:54:47,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:54:49,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:54:51,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:52,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 15:54:52,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:52,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:54:54,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:54,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:54:55,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:56,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 15:54:58,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:54:58,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:00,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:00,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:55:02,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:55:04,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:55:04,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:55:07,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:07,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 15:55:07,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=935300.0, ans=0.0 2023-10-02 15:55:08,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:55:09,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:55:11,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:11,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:55:12,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:55:14,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:15,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:15,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. 
Duration: 0.71 2023-10-02 15:55:15,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 15:55:15,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=935300.0, ans=0.2 2023-10-02 15:55:16,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:55:16,892 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 15:55:16,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:16,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:55:18,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 15:55:18,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:55:18,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 15:55:18,305 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 15:55:18,305 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 15:55:20,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 15:55:21,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 15:55:22,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:55:24,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:55:24,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:26,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:55:28,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:28,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:35,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:55:35,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 15:55:36,907 INFO [train.py:1046] (1/4) Epoch 27, batch 2200, loss[loss=0.1676, simple_loss=0.2397, pruned_loss=0.04769, over 23874.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2438, pruned_loss=0.04456, over 4713130.60 frames. ], batch size: 195, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:55:40,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 15:55:44,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=935433.3333333334, ans=0.1 2023-10-02 15:55:45,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:46,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:55:46,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:46,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:55:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:50,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:50,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 15:55:55,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 15:55:57,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:56:01,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 15:56:03,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 15:56:05,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:56:05,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:56:09,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:56:09,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 15:56:13,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:56:14,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:16,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:56:17,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=935566.6666666666, ans=0.1 2023-10-02 15:56:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:56:21,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:56:23,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:56:25,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:56:28,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 15:56:28,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:29,899 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.796e+02 1.925e+02 2.148e+02 3.063e+02, threshold=3.850e+02, percent-clipped=0.0 2023-10-02 15:56:30,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 15:56:32,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:32,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:56:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:35,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:56:35,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:56:35,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:35,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:36,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 15:56:38,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:56:39,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 15:56:42,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:56:43,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:56:44,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.08 vs. limit=22.5 2023-10-02 15:56:45,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:56:45,187 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 15:56:47,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:56:49,231 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 15:56:49,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:56:49,363 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 15:56:50,518 INFO [train.py:1046] (1/4) Epoch 27, batch 2250, loss[loss=0.1688, simple_loss=0.2445, pruned_loss=0.04657, over 23628.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2434, pruned_loss=0.04444, over 4715578.29 frames. 
], batch size: 256, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:56:51,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:53,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 15:56:55,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:56,525 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 15:56:59,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:57:01,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:57:01,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=935766.6666666666, ans=0.0 2023-10-02 15:57:06,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:57:08,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:57:10,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:12,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:57:13,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 15:57:14,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 15:57:14,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:57:14,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:57:16,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 15:57:17,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:57:17,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:19,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:57:23,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:57:24,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 15:57:24,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:57:28,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 15:57:28,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 15:57:29,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:57:35,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:57:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:57:37,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:57:38,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:57:40,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:57:41,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:57:44,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:57:44,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=935966.6666666666, ans=0.0 2023-10-02 15:57:47,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:57:50,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:57:50,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 15:57:51,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:57:54,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=936033.3333333334, ans=0.125 2023-10-02 15:57:57,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:58:00,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:58:00,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 15:58:00,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:01,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=936033.3333333334, ans=0.0 2023-10-02 15:58:02,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:58:04,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 15:58:05,944 INFO [train.py:1046] (1/4) Epoch 27, batch 2300, loss[loss=0.152, simple_loss=0.2352, pruned_loss=0.03442, over 24457.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2442, pruned_loss=0.04475, over 4717923.03 frames. ], batch size: 63, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:58:07,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 15:58:07,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:09,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=936100.0, ans=0.125 2023-10-02 15:58:12,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:14,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:58:15,550 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 15:58:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:21,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=936166.6666666666, ans=0.07 2023-10-02 15:58:22,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:58:22,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:58:22,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:58:23,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:23,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-10-02 15:58:25,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:58:28,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:58:30,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:58:35,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:58:38,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:58:41,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:58:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:58:45,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:47,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:58:51,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:53,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.70 vs. 
limit=22.5 2023-10-02 15:58:53,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:58:55,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:58:55,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:58:55,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 15:58:58,050 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.878e+02 2.081e+02 2.351e+02 3.500e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-02 15:58:58,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:58:58,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:58:59,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:58:59,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:58:59,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:59:02,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 15:59:02,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 15:59:03,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 15:59:03,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:59:03,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:04,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 15:59:07,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.71 vs. limit=22.5 2023-10-02 15:59:09,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:59:12,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:59:15,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:59:16,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:59:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 15:59:16,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=936366.6666666666, ans=0.125 2023-10-02 15:59:17,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:59:17,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:59:19,045 INFO [train.py:1046] (1/4) Epoch 27, batch 2350, loss[loss=0.1803, simple_loss=0.2508, pruned_loss=0.05489, over 23337.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2459, pruned_loss=0.04566, over 4703428.22 frames. ], batch size: 105, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:59:20,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:59:20,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 15:59:24,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:59:24,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 15:59:30,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 15:59:34,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:37,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 15:59:37,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:59:37,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:59:37,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:59:38,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 15:59:40,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=12.0 2023-10-02 15:59:42,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:59:43,202 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.42 vs. limit=15.0 2023-10-02 15:59:46,117 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.14 vs. limit=22.5 2023-10-02 15:59:46,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=936566.6666666666, ans=0.0 2023-10-02 15:59:48,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 15:59:50,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 15:59:53,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:59:53,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:59:55,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:59:56,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 15:59:57,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:59:59,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:59:59,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:00:01,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:00:04,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:00:05,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=936633.3333333334, ans=0.2 2023-10-02 16:00:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 16:00:07,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 16:00:09,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:00:10,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:00:12,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 16:00:13,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:00:16,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 16:00:16,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:00:20,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 16:00:20,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=936700.0, ans=0.125 2023-10-02 16:00:23,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 16:00:23,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=936700.0, ans=0.125 2023-10-02 16:00:24,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:00:24,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 16:00:24,754 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 16:00:24,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 16:00:27,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 16:00:28,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:00:32,036 INFO [train.py:1046] (1/4) Epoch 27, batch 2400, loss[loss=0.1763, simple_loss=0.2671, pruned_loss=0.04273, over 24572.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2451, pruned_loss=0.04484, over 4716695.08 frames. ], batch size: 71, lr: 3.79e-03, grad_scale: 32.0 2023-10-02 16:00:33,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:00:38,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:00:38,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:00:38,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 16:00:38,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 16:00:45,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 16:00:45,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:00:46,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=936833.3333333334, ans=0.125 2023-10-02 16:00:48,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 16:00:48,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:00:48,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:00:50,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 16:00:51,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.78 vs. limit=22.5 2023-10-02 16:00:51,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.98 vs. limit=15.0 2023-10-02 16:00:55,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:00:57,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 16:01:03,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 16:01:06,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=936900.0, ans=0.0 2023-10-02 16:01:08,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 16:01:11,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:01:13,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:17,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:01:17,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 16:01:18,080 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.46 vs. limit=15.0 2023-10-02 16:01:18,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:01:25,752 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.884e+02 2.076e+02 2.344e+02 3.327e+02, threshold=4.151e+02, percent-clipped=0.0 2023-10-02 16:01:25,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:27,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:01:28,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=936966.6666666666, ans=0.1 2023-10-02 16:01:29,576 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.52 vs. limit=10.0 2023-10-02 16:01:30,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:01:30,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:01:30,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:01:32,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:01:32,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:33,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:01:33,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:01:36,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:01:36,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 16:01:36,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 16:01:38,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 16:01:40,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:01:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:40,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 16:01:41,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 16:01:41,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 16:01:41,445 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 16:01:42,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 16:01:42,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:01:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:44,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:01:45,658 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 16:01:46,945 INFO [train.py:1046] (1/4) Epoch 27, batch 2450, loss[loss=0.1658, simple_loss=0.2409, pruned_loss=0.0454, over 23722.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2436, pruned_loss=0.04423, over 4723311.14 frames. ], batch size: 149, lr: 3.79e-03, grad_scale: 32.0 2023-10-02 16:01:47,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:47,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:01:47,772 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.99 vs. limit=15.0 2023-10-02 16:01:49,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:01:51,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:01:53,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:01:53,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:01:55,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 16:01:59,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:01:59,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:03,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:02:03,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:02:04,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:02:04,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 16:02:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:09,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:02:10,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:02:13,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:02:13,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:15,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:15,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:02:16,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 16:02:18,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:02:26,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:27,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:27,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:02:28,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:02:28,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:30,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:02:31,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 16:02:34,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:34,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 16:02:34,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=937300.0, ans=0.125 2023-10-02 16:02:37,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:02:37,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:02:38,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.11 vs. limit=15.0 2023-10-02 16:02:43,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:02:44,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 16:02:46,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:02:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:02:48,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 16:02:48,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:02:49,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 16:02:49,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=22.5 2023-10-02 16:02:53,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:02:54,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.06 vs. limit=8.0 2023-10-02 16:02:55,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:55,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:02:59,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 16:03:00,703 INFO [train.py:1046] (1/4) Epoch 27, batch 2500, loss[loss=0.1498, simple_loss=0.227, pruned_loss=0.03631, over 21543.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2435, pruned_loss=0.04376, over 4723105.87 frames. ], batch size: 47, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:03:00,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:03:04,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=937433.3333333334, ans=0.125 2023-10-02 16:03:05,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:03:13,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=937433.3333333334, ans=0.125 2023-10-02 16:03:14,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:03:16,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:03:17,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:03:17,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 16:03:19,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=937500.0, ans=0.0 2023-10-02 16:03:25,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:03:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:03:25,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=937500.0, ans=0.125 2023-10-02 16:03:28,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:03:28,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 16:03:28,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 16:03:30,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:30,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:03:31,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 16:03:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:31,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 16:03:33,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:36,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:03:37,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:03:39,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:03:40,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 16:03:40,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 16:03:42,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.48 vs. limit=15.0 2023-10-02 16:03:43,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:46,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:51,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:51,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=937633.3333333334, ans=0.125 2023-10-02 16:03:53,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:03:55,143 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.821e+02 2.040e+02 2.416e+02 3.469e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 16:03:58,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:04:01,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 16:04:01,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:04:01,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. 
Duration: 0.42725 2023-10-02 16:04:04,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:04:04,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:04:05,825 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 16:04:05,826 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 16:04:05,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 16:04:09,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:04:10,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 16:04:10,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 16:04:11,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:04:13,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 16:04:14,541 INFO [train.py:1046] (1/4) Epoch 27, batch 2550, loss[loss=0.1761, simple_loss=0.2511, pruned_loss=0.05053, over 23772.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2437, pruned_loss=0.04413, over 4724407.97 frames. ], batch size: 179, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:04:16,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. 
Duration: 0.98 2023-10-02 16:04:19,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:04:21,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:04:21,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:04:22,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:04:24,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 16:04:24,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:04:27,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 16:04:28,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:04:30,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:33,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:04:34,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 16:04:34,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 16:04:35,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:04:35,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:04:38,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:04:38,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 16:04:40,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:04:40,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:40,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 16:04:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:04:57,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:04:57,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:57,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:04:58,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 16:05:03,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:05:07,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:05:07,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:05:07,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:05:08,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 16:05:08,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:05:13,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:05:14,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:05:19,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:05:19,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 16:05:19,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:05:20,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:05:20,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:05:21,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=938033.3333333334, ans=0.2 2023-10-02 16:05:22,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:05:22,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:05:28,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:05:29,770 INFO [train.py:1046] (1/4) Epoch 27, batch 2600, loss[loss=0.1743, simple_loss=0.2485, pruned_loss=0.05004, over 23816.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2444, pruned_loss=0.04416, over 4730841.73 frames. ], batch size: 195, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:05:31,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:05:32,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 16:05:35,906 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 16:05:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:05:37,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 16:05:37,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. 
Duration: 0.92 2023-10-02 16:05:37,328 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 16:05:40,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:05:40,471 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 16:05:41,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 16:05:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 16:05:44,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:05:44,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=938166.6666666666, ans=0.04949747468305833 2023-10-02 16:05:46,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 16:05:47,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 16:05:49,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:05:49,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 16:05:51,888 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 16:05:51,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. 
Duration: 0.55 2023-10-02 16:05:55,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=938166.6666666666, ans=0.2 2023-10-02 16:05:59,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:01,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:06:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 16:06:02,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:06:07,128 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 16:06:10,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=938233.3333333334, ans=0.125 2023-10-02 16:06:13,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:13,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:14,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 16:06:15,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:06:15,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:06:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 16:06:18,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:06:19,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:06:21,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:24,355 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.903e+02 2.122e+02 2.565e+02 3.470e+02, threshold=4.244e+02, percent-clipped=0.0 2023-10-02 16:06:24,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 16:06:24,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:25,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:06:29,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:06:29,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:06:29,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-10-02 16:06:31,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:34,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:06:34,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:06:40,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 16:06:41,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:44,156 INFO [train.py:1046] (1/4) Epoch 27, batch 2650, loss[loss=0.1528, simple_loss=0.2423, pruned_loss=0.03164, over 24463.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2448, pruned_loss=0.04442, over 4740141.39 frames. ], batch size: 69, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:06:44,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:06:47,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 16:06:47,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:48,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:06:50,332 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. 
Duration: 0.768 2023-10-02 16:06:50,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:06:51,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:53,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:06:54,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:06:57,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:57,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 16:06:57,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:06:59,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:07:02,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 16:07:04,004 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 16:07:06,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.53 vs. limit=10.0 2023-10-02 16:07:06,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:07:10,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 16:07:10,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:10,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 16:07:14,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:14,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:07:14,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:14,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:19,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 16:07:19,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 16:07:20,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=938566.6666666666, ans=0.05 2023-10-02 16:07:21,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:07:26,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-10-02 16:07:26,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:26,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:26,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:07:27,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:07:27,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:30,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:07:33,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:07:33,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:07:35,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:07:35,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:07:38,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:38,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 16:07:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:39,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:07:41,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:07:43,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:44,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:07:46,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:46,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 16:07:48,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:50,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:52,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:53,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:07:53,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 16:07:53,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:07:56,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:07:56,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 16:07:59,335 INFO [train.py:1046] (1/4) Epoch 27, batch 2700, loss[loss=0.1744, simple_loss=0.2553, pruned_loss=0.04674, over 24676.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2456, pruned_loss=0.04451, over 4743153.14 frames. ], batch size: 65, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:07:59,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:08:00,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 16:08:03,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:08:03,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:03,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:05,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:08:05,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:08:05,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:08:05,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 16:08:05,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 16:08:06,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:08:09,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:08:11,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:08:11,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:08:14,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:08:17,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 16:08:17,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:08:17,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=938833.3333333334, ans=0.09899494936611666 2023-10-02 16:08:21,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 16:08:21,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:08:23,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-10-02 16:08:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:08:27,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:08:27,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:08:27,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:08:29,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.90 vs. limit=15.0 2023-10-02 16:08:31,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:08:35,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:08:35,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:08:35,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 16:08:39,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:40,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:08:47,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:08:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:08:50,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=938966.6666666666, ans=0.125 2023-10-02 16:08:51,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:08:51,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:08:53,642 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.852e+02 1.992e+02 2.201e+02 2.706e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 16:08:56,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:56,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:08:57,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:08:57,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:08:59,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:09:00,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:09:02,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:09:03,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:09:03,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:09:07,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 16:09:08,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:08,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=939033.3333333334, ans=0.125 2023-10-02 16:09:11,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:09:11,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-10-02 16:09:13,018 INFO [train.py:1046] (1/4) Epoch 27, batch 2750, loss[loss=0.1587, simple_loss=0.2196, pruned_loss=0.04888, over 23419.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2445, pruned_loss=0.04474, over 4732229.14 frames. ], batch size: 285, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:09:14,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 16:09:14,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:16,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=939100.0, ans=0.1 2023-10-02 16:09:17,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:18,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:09:20,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=939100.0, ans=0.125 2023-10-02 16:09:21,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:21,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:09:21,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:25,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:09:25,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:09:27,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:09:27,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:27,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 16:09:27,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:09:27,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 16:09:34,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:09:34,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:34,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:09:35,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:09:35,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:09:36,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:09:36,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:36,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:42,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:09:42,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:09:43,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:09:45,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:45,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:09:49,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:52,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:09:52,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:09:58,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:58,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:09:58,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:10:04,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:10:05,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:10:05,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 16:10:08,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 16:10:16,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:10:19,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:10:19,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 16:10:20,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:10:22,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:10:22,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=939366.6666666666, ans=0.125 2023-10-02 16:10:23,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 16:10:23,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:10:26,195 INFO [train.py:1046] (1/4) Epoch 27, batch 2800, loss[loss=0.1699, simple_loss=0.2624, pruned_loss=0.03865, over 24454.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2439, pruned_loss=0.0444, over 4724065.51 frames. ], batch size: 69, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:10:27,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 16:10:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:27,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=939433.3333333334, ans=0.0 2023-10-02 16:10:28,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:10:29,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 16:10:29,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:10:29,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:31,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:10:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 16:10:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 16:10:35,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:35,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:10:35,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:10:38,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:10:40,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 16:10:43,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 16:10:44,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 16:10:46,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:10:46,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=939500.0, ans=0.0 2023-10-02 16:10:46,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.67 vs. limit=15.0 2023-10-02 16:10:47,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:10:47,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:10:50,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:10:51,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:51,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:10:51,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:10:57,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:10:59,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:11:00,959 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:11:01,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=939566.6666666666, ans=0.2 2023-10-02 16:11:02,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:02,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:11:03,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:08,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:11:08,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 16:11:09,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:09,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:11:09,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:11:10,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=939633.3333333334, ans=0.0 2023-10-02 16:11:16,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:11:16,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:18,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:11:21,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:11:21,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:21,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:11:22,775 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.891e+02 2.118e+02 2.532e+02 5.316e+02, threshold=4.237e+02, percent-clipped=2.0 2023-10-02 16:11:22,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:11:23,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.73 vs. limit=10.0 2023-10-02 16:11:24,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:11:25,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:11:26,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. 
Duration: 0.92 2023-10-02 16:11:26,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:11:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:11:28,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:11:31,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 16:11:32,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:32,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:11:32,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:11:34,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=939700.0, ans=0.125 2023-10-02 16:11:35,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 16:11:39,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=939766.6666666666, ans=0.125 2023-10-02 16:11:40,359 INFO [train.py:1046] (1/4) Epoch 27, batch 2850, loss[loss=0.1683, simple_loss=0.2431, pruned_loss=0.04674, over 23873.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2436, pruned_loss=0.04455, over 4728961.67 frames. 
], batch size: 195, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:11:41,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:11:41,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:11:41,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:11:42,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=939766.6666666666, ans=0.0 2023-10-02 16:11:44,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:11:46,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:11:48,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:11:48,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:48,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=939766.6666666666, ans=0.2 2023-10-02 16:11:49,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:51,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:11:52,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:11:53,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 16:11:59,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 16:11:59,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:00,639 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.45 vs. limit=15.0 2023-10-02 16:12:01,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 16:12:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:04,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 16:12:05,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 16:12:06,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:17,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:12:18,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=939900.0, ans=0.0 2023-10-02 16:12:19,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:12:19,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:12:20,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:12:20,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:12:20,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:12:22,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=939900.0, ans=0.1 2023-10-02 16:12:23,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:12:23,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 16:12:24,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:12:24,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:12:26,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:12:26,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:29,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:12:29,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:12:30,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:32,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:12:34,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=939966.6666666666, ans=0.0 2023-10-02 16:12:35,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:12:35,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:36,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.93 vs. limit=15.0 2023-10-02 16:12:36,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:38,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 16:12:40,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-10-02 16:12:43,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:12:44,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 16:12:44,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 16:12:47,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:12:47,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:12:47,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 16:12:49,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:12:49,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:12:50,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:12:50,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:12:50,486 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. 
Duration: 0.896 2023-10-02 16:12:50,526 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 16:12:50,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:12:50,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=940033.3333333334, ans=0.0 2023-10-02 16:12:51,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:53,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=940100.0, ans=0.0 2023-10-02 16:12:54,590 INFO [train.py:1046] (1/4) Epoch 27, batch 2900, loss[loss=0.1543, simple_loss=0.2416, pruned_loss=0.03351, over 24471.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2435, pruned_loss=0.04415, over 4728921.41 frames. ], batch size: 66, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:12:56,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:12:56,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:12:56,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:12:59,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 16:13:03,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:13:03,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 16:13:04,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 16:13:06,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:13:06,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:13:08,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:13:10,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:13:11,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=940166.6666666666, ans=0.0 2023-10-02 16:13:15,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:13:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:13:17,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:13:17,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=940166.6666666666, ans=0.2 2023-10-02 16:13:18,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-02 16:13:18,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:13:19,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:13:21,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 16:13:22,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 16:13:25,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:13:25,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 16:13:25,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:13:27,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:13:27,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:13:31,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:13:31,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:13:35,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.23 vs. 
limit=15.0 2023-10-02 16:13:36,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:13:39,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:13:39,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=940300.0, ans=0.125 2023-10-02 16:13:43,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 16:13:43,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 16:13:43,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:13:45,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:13:48,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 16:13:49,473 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-10-02 16:13:49,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:13:53,106 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.791e+02 2.001e+02 2.222e+02 3.379e+02, threshold=4.002e+02, percent-clipped=0.0 2023-10-02 16:13:54,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:13:56,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=940366.6666666666, ans=0.125 2023-10-02 16:14:02,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:14:02,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:14:03,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 16:14:06,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:06,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 16:14:07,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:14:07,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:14:08,802 INFO [train.py:1046] (1/4) Epoch 27, batch 2950, loss[loss=0.1798, simple_loss=0.2576, pruned_loss=0.05102, over 24381.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2452, pruned_loss=0.04487, over 4726240.81 frames. ], batch size: 77, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:14:13,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:14:16,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. 
Duration: 0.71 2023-10-02 16:14:17,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:14:17,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:17,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:14:19,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:14:21,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 16:14:21,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 16:14:22,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:14:22,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:14:22,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=940500.0, ans=0.125 2023-10-02 16:14:28,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:14:29,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:14:30,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.66 vs. 
limit=15.0 2023-10-02 16:14:32,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:14:32,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:14:37,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:14:37,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:14:40,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:41,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:41,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:14:42,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 16:14:47,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 16:14:47,547 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 16:14:48,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:14:50,432 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-10-02 16:14:52,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 16:14:52,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:14:52,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:14:52,363 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 16:14:53,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:14:56,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 16:14:56,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:14:57,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=940633.3333333334, ans=0.125 2023-10-02 16:14:58,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:15:00,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:15:02,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:15:02,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:15:02,525 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 16:15:03,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:15:03,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 16:15:03,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=940633.3333333334, ans=0.0 2023-10-02 16:15:09,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:09,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:15:11,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 16:15:11,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:15:12,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 16:15:15,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:15:17,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:15:17,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 16:15:18,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:20,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:15:21,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:15:22,834 INFO [train.py:1046] (1/4) Epoch 27, batch 3000, loss[loss=0.1587, simple_loss=0.2493, pruned_loss=0.03408, over 24435.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2459, pruned_loss=0.04479, over 4726222.15 frames. ], batch size: 69, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:15:22,835 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 16:15:34,585 INFO [train.py:1078] (1/4) Epoch 27, validation: loss=0.3322, simple_loss=0.2706, pruned_loss=0.197, over 1125622.00 frames. 2023-10-02 16:15:34,585 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 16:15:34,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:34,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:15:34,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:15:34,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:15:36,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 16:15:37,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:37,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 16:15:40,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:42,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:15:42,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=940766.6666666666, ans=0.07 2023-10-02 16:15:43,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:15:46,688 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 16:15:46,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 16:15:48,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:15:49,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:15:49,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 16:15:49,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:15:52,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.85 vs. limit=6.0 2023-10-02 16:15:57,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:16:00,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=940833.3333333334, ans=0.125 2023-10-02 16:16:05,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=940900.0, ans=0.04949747468305833 2023-10-02 16:16:06,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:16:06,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=940900.0, ans=0.1 2023-10-02 16:16:10,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 16:16:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:16:15,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:16:15,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:16:15,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:16:15,357 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:16:17,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:16:17,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 16:16:20,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 16:16:20,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:16:21,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:16:23,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:16:23,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:16:24,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:24,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:16:29,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:16:29,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:16:29,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:16:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:16:31,980 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.934e+02 2.111e+02 2.505e+02 3.384e+02, threshold=4.221e+02, percent-clipped=0.0 2023-10-02 16:16:33,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 16:16:35,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:16:35,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:16:35,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:16:38,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:38,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=941033.3333333334, ans=0.0 2023-10-02 16:16:39,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:39,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 16:16:39,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-10-02 16:16:39,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:16:41,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 16:16:42,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:16:45,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 16:16:46,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:16:48,193 INFO [train.py:1046] (1/4) Epoch 27, batch 3050, loss[loss=0.1926, simple_loss=0.2758, pruned_loss=0.05464, over 24321.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2473, pruned_loss=0.04581, over 4727697.69 frames. ], batch size: 74, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:16:48,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:16:49,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 16:16:49,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 16:16:49,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:16:51,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:16:52,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:52,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:16:52,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:16:52,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:16:53,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 16:16:57,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:17:00,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:00,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:17:03,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=941166.6666666666, ans=0.125 2023-10-02 16:17:04,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:06,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 16:17:13,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-02 16:17:14,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 16:17:15,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:16,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:17:19,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:19,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:19,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=941233.3333333334, ans=0.0 2023-10-02 16:17:22,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:17:23,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:17:23,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:23,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:17:23,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=941233.3333333334, ans=0.09899494936611666 2023-10-02 16:17:25,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:25,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=941233.3333333334, ans=0.05 2023-10-02 16:17:26,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:28,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:30,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 16:17:30,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:30,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:17:33,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:17:33,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:17:34,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:17:35,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:35,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=941300.0, ans=0.5 2023-10-02 16:17:39,732 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.39 vs. limit=15.0 2023-10-02 16:17:40,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:42,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:48,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:17:49,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:51,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:17:52,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:17:52,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 16:17:53,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 16:17:55,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:17:55,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:56,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 16:17:58,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:02,757 INFO [train.py:1046] (1/4) Epoch 27, batch 3100, loss[loss=0.1588, simple_loss=0.2353, pruned_loss=0.04114, over 24327.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2471, pruned_loss=0.04576, over 4728207.97 frames. ], batch size: 56, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:18:04,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:05,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:18:07,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:18:09,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 16:18:13,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-02 16:18:13,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 16:18:14,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:18:14,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=941433.3333333334, ans=0.125 2023-10-02 16:18:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:18:17,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 16:18:23,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=941500.0, ans=0.0 2023-10-02 16:18:24,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:28,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 16:18:34,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:18:34,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:34,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:18:35,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:18:36,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 16:18:38,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:18:38,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 16:18:38,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:18:38,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 16:18:41,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:18:44,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:18:45,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 16:18:47,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 16:18:48,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:18:48,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:50,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:18:50,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:51,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:18:53,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=941633.3333333334, ans=0.0 2023-10-02 16:18:53,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=941633.3333333334, ans=0.125 2023-10-02 16:18:54,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:18:54,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:54,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:18:55,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:18:55,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:18:55,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:18:59,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:19:00,437 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.900e+02 2.040e+02 2.286e+02 3.067e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 16:19:00,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 16:19:03,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:19:03,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 16:19:04,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:04,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:04,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 16:19:08,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=941700.0, ans=0.1 2023-10-02 16:19:16,595 INFO [train.py:1046] (1/4) Epoch 27, batch 3150, loss[loss=0.1642, simple_loss=0.257, pruned_loss=0.03575, over 24650.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2452, pruned_loss=0.04504, over 4724475.70 frames. 
], batch size: 73, lr: 3.78e-03, grad_scale: 8.0 2023-10-02 16:19:16,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 16:19:17,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.67 vs. limit=22.5 2023-10-02 16:19:18,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:18,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:20,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:19:20,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:19:20,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 16:19:22,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:19:23,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 16:19:24,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:26,844 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. 
Duration: 0.831 2023-10-02 16:19:30,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 16:19:30,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:19:31,482 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 16:19:32,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 16:19:34,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 16:19:36,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 16:19:36,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 16:19:36,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:36,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:19:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:38,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 16:19:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:19:41,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:41,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:19:44,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:19:47,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 16:19:47,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:19:50,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:19:50,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:19:51,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 16:19:53,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 16:19:54,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:19:55,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:19:55,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-02 16:19:57,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:57,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:20:00,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:20:00,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:20:00,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 16:20:02,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:20:02,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:04,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:20:04,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:20:05,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 16:20:05,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:20:05,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=941966.6666666666, ans=0.125 2023-10-02 16:20:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 16:20:08,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:09,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 16:20:09,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 16:20:11,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:20:11,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:11,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 16:20:12,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 16:20:12,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:20:15,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:20:16,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:20:16,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:20:22,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:20:22,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=942033.3333333334, ans=0.0 2023-10-02 16:20:23,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:25,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 16:20:25,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=942033.3333333334, ans=0.0 2023-10-02 16:20:30,375 INFO [train.py:1046] (1/4) Epoch 27, batch 3200, loss[loss=0.1524, simple_loss=0.233, pruned_loss=0.03588, over 24459.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2436, pruned_loss=0.04447, over 4721688.80 frames. ], batch size: 63, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:20:30,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:20:30,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 16:20:33,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:20:33,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=942100.0, ans=0.125 2023-10-02 16:20:35,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:20:35,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 16:20:36,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:40,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:20:44,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:49,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=942166.6666666666, ans=0.125 2023-10-02 16:20:53,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:21:02,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 16:21:03,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:21:05,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 16:21:06,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-02 16:21:09,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:21:10,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:21:12,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:21:15,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 16:21:16,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 16:21:19,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 16:21:20,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 16:21:25,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:21:26,755 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.873e+02 2.063e+02 2.304e+02 3.218e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-02 16:21:30,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:21:31,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:21:31,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:21:31,732 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 16:21:31,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:21:35,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:21:39,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 16:21:39,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 16:21:39,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 16:21:40,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 16:21:41,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=942366.6666666666, ans=15.0 2023-10-02 16:21:43,736 INFO [train.py:1046] (1/4) Epoch 27, batch 3250, loss[loss=0.1643, simple_loss=0.2387, pruned_loss=0.04492, over 23522.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2432, pruned_loss=0.04415, over 4727436.45 frames. ], batch size: 134, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:21:43,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:21:45,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 16:21:45,342 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 16:21:46,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:21:46,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:21:48,096 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 16:21:53,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:21:54,529 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.82 vs. limit=22.5 2023-10-02 16:21:56,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:21:59,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=942500.0, ans=0.0 2023-10-02 16:22:03,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:03,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 16:22:05,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:05,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:22:05,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:22:07,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:22:08,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:22:09,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:10,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:22:11,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:11,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:11,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:11,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:22:15,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:16,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:22:17,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:22:17,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:20,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:20,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:22:20,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:22:22,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=942566.6666666666, ans=10.0 2023-10-02 16:22:24,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 16:22:25,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=942566.6666666666, ans=0.1 2023-10-02 16:22:26,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:22:26,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:22:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:27,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 16:22:33,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:22:35,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=942633.3333333334, ans=0.125 2023-10-02 16:22:35,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=942633.3333333334, ans=0.125 2023-10-02 16:22:39,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:22:41,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:41,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 16:22:41,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:22:41,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:22:41,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:43,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=942700.0, ans=0.0 2023-10-02 16:22:44,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. 
Duration: 0.9 2023-10-02 16:22:44,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 16:22:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:22:46,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:47,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:49,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 16:22:49,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:49,348 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:22:51,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=942700.0, ans=0.125 2023-10-02 16:22:53,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:22:53,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:22:53,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 16:22:54,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:22:55,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:22:55,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 16:22:57,202 INFO [train.py:1046] (1/4) Epoch 27, batch 3300, loss[loss=0.1768, simple_loss=0.2445, pruned_loss=0.05461, over 23891.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2441, pruned_loss=0.04431, over 4724464.05 frames. ], batch size: 180, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:22:57,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:22:58,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 16:23:00,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 16:23:01,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 16:23:01,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:06,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:23:06,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:23:06,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:23:07,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:23:07,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:23:09,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=942766.6666666666, ans=0.2 2023-10-02 16:23:10,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:12,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:23:15,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 16:23:16,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:23:17,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:18,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:19,932 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 16:23:20,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:23:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 16:23:22,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:23:22,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:23:22,767 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 16:23:26,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:26,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:23:27,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=942900.0, ans=0.0 2023-10-02 16:23:29,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:29,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 16:23:31,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 16:23:31,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:32,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:23:34,469 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. 
Duration: 0.768 2023-10-02 16:23:35,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 16:23:37,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:23:38,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 16:23:41,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:23:44,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:23:45,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:23:47,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:23:48,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:48,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:48,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:23:50,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:23:50,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:23:51,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:23:51,899 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 16:23:53,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 16:23:54,394 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.925e+02 2.169e+02 2.567e+02 3.728e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-02 16:23:54,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:23:55,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:23:55,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:23:57,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:57,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:23:58,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:23:58,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:23:58,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 16:24:00,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:24:03,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:24:06,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 16:24:06,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:07,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:08,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:24:08,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:24:10,175 INFO [train.py:1046] (1/4) Epoch 27, batch 3350, loss[loss=0.17, simple_loss=0.2492, pruned_loss=0.0454, over 23531.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2446, pruned_loss=0.04405, over 4727474.91 frames. ], batch size: 106, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:24:10,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:11,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:24:11,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:24:14,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:24:16,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:18,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:24:21,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:23,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:24:23,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:25,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:24:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 16:24:26,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=943166.6666666666, ans=0.125 2023-10-02 16:24:27,936 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 16:24:27,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:30,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-02 16:24:30,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 16:24:32,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:24:32,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:24:32,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=943166.6666666666, ans=0.09899494936611666 2023-10-02 16:24:35,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:35,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 16:24:35,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:35,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:24:38,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:40,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:41,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:43,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:24:45,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:24:47,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:49,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:24:50,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.15 vs. limit=22.5 2023-10-02 16:24:52,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:24:52,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:56,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:57,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:00,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 16:25:00,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 16:25:00,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=943300.0, ans=0.2 2023-10-02 16:25:02,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 16:25:02,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:25:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 16:25:03,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:04,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:25:11,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:11,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 16:25:12,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:25:14,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:25:15,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:25:18,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:25:20,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 16:25:21,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:25:21,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:25:23,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:23,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 16:25:24,641 INFO [train.py:1046] (1/4) Epoch 27, batch 3400, loss[loss=0.2247, simple_loss=0.2896, pruned_loss=0.07989, over 19535.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2452, pruned_loss=0.0445, over 4718855.06 frames. ], batch size: 389, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:25:24,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:24,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 16:25:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:25:26,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:25:27,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 16:25:28,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:25:28,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 16:25:32,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.22 vs. limit=15.0 2023-10-02 16:25:32,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 16:25:32,838 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 16:25:32,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:25:37,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:25:37,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:25:37,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:25:39,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=943500.0, ans=0.125 2023-10-02 16:25:40,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 16:25:43,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=943500.0, ans=0.125 2023-10-02 16:25:45,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:25:49,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 16:25:52,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:25:55,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:25:55,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:56,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:26:01,367 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.04 vs. limit=22.5 2023-10-02 16:26:02,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:26:05,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 16:26:12,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:26:14,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:26:14,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 16:26:15,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:26:15,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:26:15,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:26:15,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:26:20,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:26:21,336 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.815e+02 1.942e+02 2.172e+02 3.090e+02, threshold=3.885e+02, percent-clipped=0.0 2023-10-02 16:26:22,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:26:22,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:26:27,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:26:30,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 16:26:34,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 16:26:36,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.90 vs. limit=10.0 2023-10-02 16:26:37,421 INFO [train.py:1046] (1/4) Epoch 27, batch 3450, loss[loss=0.152, simple_loss=0.2278, pruned_loss=0.03813, over 24638.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2453, pruned_loss=0.04517, over 4710955.89 frames. ], batch size: 60, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:26:39,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 16:26:42,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 16:26:42,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:26:44,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:26:44,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 16:26:44,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=943766.6666666666, ans=0.0 2023-10-02 16:26:45,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:26:50,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:26:56,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 16:26:56,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:26:56,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:26:57,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:26:59,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:27:04,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 16:27:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 16:27:10,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:27:11,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:27:12,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:18,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 16:27:18,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:27:21,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:27:21,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:27:23,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:27:25,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:27:26,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 16:27:26,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:27:27,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:27:30,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:27:33,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 16:27:36,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:27:41,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:27:42,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:27:42,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=944033.3333333334, ans=0.125 2023-10-02 16:27:46,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:27:48,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.74 vs. limit=15.0 2023-10-02 16:27:50,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:50,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:27:50,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:27:51,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:27:53,143 INFO [train.py:1046] (1/4) Epoch 27, batch 3500, loss[loss=0.147, simple_loss=0.2262, pruned_loss=0.03386, over 24350.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2441, pruned_loss=0.04478, over 4707492.20 frames. ], batch size: 56, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:27:53,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:27:56,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:27:57,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. 
Duration: 0.68 2023-10-02 16:27:59,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:28:00,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=944100.0, ans=0.1 2023-10-02 16:28:01,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:28:05,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:28:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 16:28:09,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:28:11,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:28:12,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:28:12,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:13,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:28:14,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:14,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 16:28:15,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 16:28:18,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:18,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:28:19,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:28:21,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=944233.3333333334, ans=0.0 2023-10-02 16:28:22,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:22,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 16:28:22,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:28:25,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:28:27,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:28:28,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:29,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 16:28:29,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:28:32,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 16:28:34,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 16:28:34,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 16:28:35,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:28:37,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:38,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:38,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:28:41,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:28:42,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:28:46,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 16:28:46,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=944300.0, ans=0.125 2023-10-02 16:28:48,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 16:28:48,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 16:28:48,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:28:50,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:28:52,058 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.840e+02 2.099e+02 2.420e+02 3.438e+02, threshold=4.198e+02, percent-clipped=0.0 2023-10-02 16:28:52,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:28:53,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:55,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 16:28:55,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:28:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:57,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. 
Duration: 0.96 2023-10-02 16:28:59,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 16:29:01,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:02,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:29:02,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:02,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:05,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:29:07,055 INFO [train.py:1046] (1/4) Epoch 27, batch 3550, loss[loss=0.157, simple_loss=0.2355, pruned_loss=0.03926, over 24466.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2422, pruned_loss=0.04406, over 4702563.33 frames. ], batch size: 58, lr: 3.78e-03, grad_scale: 8.0 2023-10-02 16:29:08,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=944433.3333333334, ans=0.0 2023-10-02 16:29:13,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:15,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 16:29:17,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 16:29:19,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:29:20,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:22,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:29:22,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:29:25,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:29:25,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:29:26,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:29:26,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:29:31,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=944500.0, ans=0.0 2023-10-02 16:29:32,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:29:32,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 16:29:34,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=944500.0, ans=0.5 2023-10-02 16:29:35,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:29:35,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:35,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:29:35,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 16:29:35,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:36,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:38,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:29:42,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:42,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:29:43,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.04 vs. 
limit=6.0 2023-10-02 16:29:44,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:46,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 16:29:46,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:29:47,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 16:29:48,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:29:50,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:29:50,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:29:54,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 16:29:55,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:02,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:02,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 16:30:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:30:06,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:30:07,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 16:30:14,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 16:30:14,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:30:15,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:30:18,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:18,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:30:21,528 INFO [train.py:1046] (1/4) Epoch 27, batch 3600, loss[loss=0.1446, simple_loss=0.2275, pruned_loss=0.03082, over 24371.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.242, pruned_loss=0.04416, over 4690374.98 frames. ], batch size: 61, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:30:23,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:30:25,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:30:26,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:30:26,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:30:27,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:27,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 16:30:31,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:30:31,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:32,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=944766.6666666666, ans=0.125 2023-10-02 16:30:35,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:30:38,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:30:38,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:30:39,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:30:39,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. 
Duration: 0.69 2023-10-02 16:30:40,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:30:43,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:44,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:30:45,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:45,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=944833.3333333334, ans=0.1 2023-10-02 16:30:46,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=944833.3333333334, ans=0.125 2023-10-02 16:30:48,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:30:49,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:30:50,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 16:30:56,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:30:57,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 16:30:58,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 16:31:00,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:31:02,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=944900.0, ans=0.0 2023-10-02 16:31:06,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:07,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=944966.6666666666, ans=0.2 2023-10-02 16:31:08,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:14,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:31:14,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:31:14,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 16:31:15,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 16:31:17,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 16:31:19,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:31:19,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:31:20,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 16:31:22,203 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.857e+02 2.015e+02 2.444e+02 4.517e+02, threshold=4.030e+02, percent-clipped=2.0 2023-10-02 16:31:22,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:31:22,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:31:22,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:31:23,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 16:31:25,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 16:31:28,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 16:31:36,510 INFO [train.py:1046] (1/4) Epoch 27, batch 3650, loss[loss=0.1558, simple_loss=0.2283, pruned_loss=0.04164, over 23638.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2435, pruned_loss=0.04467, over 4696233.16 frames. 
], batch size: 135, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:31:36,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 16:31:36,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:31:38,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=945100.0, ans=0.0 2023-10-02 16:31:39,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 16:31:42,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 16:31:43,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=945100.0, ans=0.125 2023-10-02 16:31:46,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:31:46,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:31:46,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:31:49,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:31:51,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:31:52,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-10-02 16:31:52,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:31:54,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:31:54,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 16:31:56,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:31:56,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:31:56,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:31:57,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:31:59,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=945166.6666666666, ans=0.125 2023-10-02 16:32:00,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 16:32:00,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 16:32:02,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:32:02,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=945166.6666666666, ans=0.2 2023-10-02 16:32:03,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 16:32:06,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:32:06,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:32:11,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:32:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:32:13,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:32:15,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:32:16,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:32:18,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:32:21,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:32:22,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:32:22,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:32:25,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:32:25,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:32:25,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:32:33,422 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 16:32:36,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:32:36,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:32:37,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:32:37,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:39,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:32:41,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:41,864 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.10 vs. 
limit=12.0 2023-10-02 16:32:42,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 16:32:42,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:43,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=945366.6666666666, ans=0.125 2023-10-02 16:32:45,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:32:48,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:32:49,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:32:50,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.33 vs. limit=22.5 2023-10-02 16:32:50,910 INFO [train.py:1046] (1/4) Epoch 27, batch 3700, loss[loss=0.1625, simple_loss=0.24, pruned_loss=0.04255, over 23657.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2443, pruned_loss=0.04492, over 4701888.62 frames. ], batch size: 232, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:32:52,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:52,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 16:32:52,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:32:53,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:32:53,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=945433.3333333334, ans=0.0 2023-10-02 16:32:55,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:32:58,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:33:01,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:01,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:03,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:33:03,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:33:03,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.85 vs. limit=10.0 2023-10-02 16:33:04,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:33:04,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:06,087 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-10-02 16:33:07,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=945500.0, ans=0.125 2023-10-02 16:33:15,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:33:15,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:33:16,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=945500.0, ans=0.125 2023-10-02 16:33:17,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:33:19,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 16:33:19,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:33:22,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:23,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 16:33:23,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=945566.6666666666, ans=0.1 2023-10-02 16:33:24,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:33:26,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:33:28,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:28,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:33:28,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=945566.6666666666, ans=0.125 2023-10-02 16:33:29,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:33:35,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:33:35,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 16:33:35,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:37,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 16:33:40,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:33:41,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:33:42,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:33:43,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=945633.3333333334, ans=0.0 2023-10-02 16:33:44,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 16:33:45,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:33:45,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:33:45,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:33:46,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.26 vs. limit=15.0 2023-10-02 16:33:47,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:50,529 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.788e+02 1.952e+02 2.175e+02 3.581e+02, threshold=3.904e+02, percent-clipped=0.0 2023-10-02 16:33:50,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:33:50,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 16:33:52,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. 
Duration: 0.83 2023-10-02 16:33:53,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:33:53,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:33:54,881 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:33:56,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:33:57,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:33:58,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:34:00,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:34:02,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:03,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 16:34:04,687 INFO [train.py:1046] (1/4) Epoch 27, batch 3750, loss[loss=0.1498, simple_loss=0.2263, pruned_loss=0.03665, over 24468.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2452, pruned_loss=0.04484, over 4720298.12 frames. ], batch size: 58, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:34:04,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. 
Duration: 0.4181875 2023-10-02 16:34:08,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:34:09,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 16:34:10,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:34:12,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:34:13,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:34:13,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=945766.6666666666, ans=0.125 2023-10-02 16:34:14,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:34:15,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.60 vs. limit=15.0 2023-10-02 16:34:17,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:34:22,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:34:22,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 16:34:25,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:34:28,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:34:28,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=945833.3333333334, ans=0.0 2023-10-02 16:34:29,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 16:34:29,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:34:30,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=945833.3333333334, ans=15.0 2023-10-02 16:34:31,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:34:31,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:34:35,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 16:34:38,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 16:34:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:34:41,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:34:43,417 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.57 vs. limit=15.0 2023-10-02 16:34:44,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:34:49,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:50,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:34:53,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 16:34:55,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=945966.6666666666, ans=0.07 2023-10-02 16:34:56,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:56,851 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-10-02 16:34:59,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:34:59,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 16:35:01,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.09 vs. limit=22.5 2023-10-02 16:35:02,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:35:06,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:35:08,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:35:10,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:35:12,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:35:14,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:35:16,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.73 vs. limit=15.0 2023-10-02 16:35:18,931 INFO [train.py:1046] (1/4) Epoch 27, batch 3800, loss[loss=0.1748, simple_loss=0.2591, pruned_loss=0.04525, over 24451.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2452, pruned_loss=0.04482, over 4716723.71 frames. ], batch size: 69, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:35:20,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 16:35:22,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=946100.0, ans=0.025 2023-10-02 16:35:24,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:26,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:35:26,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 16:35:28,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:35:30,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:35:31,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:35:31,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 16:35:31,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:33,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:35:34,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:35:36,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 16:35:36,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:36,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 16:35:39,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 16:35:40,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:35:41,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=946166.6666666666, ans=0.125 2023-10-02 16:35:41,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=12.0 2023-10-02 16:35:43,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:35:46,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:35:46,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:35:48,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.01 vs. limit=10.0 2023-10-02 16:35:49,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 16:35:49,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:50,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:52,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:53,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=946233.3333333334, ans=0.2 2023-10-02 16:35:55,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:35:55,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 16:35:55,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=946233.3333333334, ans=0.1 2023-10-02 16:35:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:36:02,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:36:10,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:36:11,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 16:36:13,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. 
Duration: 0.84 2023-10-02 16:36:15,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:36:16,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:36:17,637 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.874e+02 2.041e+02 2.512e+02 4.276e+02, threshold=4.082e+02, percent-clipped=2.0 2023-10-02 16:36:17,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:19,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 16:36:23,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 16:36:23,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 16:36:23,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:25,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:36:29,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:36:29,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:36:32,742 INFO [train.py:1046] (1/4) Epoch 27, batch 3850, loss[loss=0.1675, simple_loss=0.2567, pruned_loss=0.03921, over 24648.00 frames. 
], tot_loss[loss=0.1667, simple_loss=0.2439, pruned_loss=0.0448, over 4704745.77 frames. ], batch size: 68, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:36:32,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:36:34,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 16:36:35,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:36:37,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:37,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=946433.3333333334, ans=0.0 2023-10-02 16:36:37,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=946433.3333333334, ans=0.1 2023-10-02 16:36:40,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:36:41,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=946433.3333333334, ans=0.0 2023-10-02 16:36:43,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:36:45,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:36:46,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-02 16:36:51,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:36:53,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=946500.0, ans=0.125 2023-10-02 16:36:54,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:54,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-10-02 16:36:55,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:36:55,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:36:58,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:36:58,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:37:00,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:00,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:37:01,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 16:37:04,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:05,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:05,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:37:06,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 16:37:06,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 16:37:07,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:37:08,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:10,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:11,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:11,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 16:37:15,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 16:37:15,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:37:18,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 16:37:19,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.24 vs. limit=15.0 2023-10-02 16:37:20,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:37:21,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=946633.3333333334, ans=0.125 2023-10-02 16:37:25,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:26,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:28,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=946633.3333333334, ans=0.125 2023-10-02 16:37:30,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:31,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 16:37:32,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 16:37:35,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:37:35,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:40,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:37:40,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:37:40,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:41,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:41,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:37:41,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 16:37:43,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:37:43,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 16:37:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:45,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:47,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 16:37:47,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:48,825 INFO [train.py:1046] (1/4) Epoch 27, batch 3900, loss[loss=0.1564, simple_loss=0.2317, pruned_loss=0.0405, over 23629.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2424, pruned_loss=0.04444, over 4698172.49 frames. ], batch size: 232, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:37:48,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:37:48,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:48,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:49,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:37:50,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 16:37:50,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:52,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=946766.6666666666, ans=0.2 2023-10-02 16:37:53,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.23 vs. limit=15.0 2023-10-02 16:37:54,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:37:54,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:37:54,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:37:56,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:37:57,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:37:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:59,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:38:00,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 16:38:00,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:38:01,039 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.08 vs. limit=15.0 2023-10-02 16:38:01,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 16:38:01,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:38:03,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. 
Duration: 0.97 2023-10-02 16:38:04,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 16:38:09,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:38:09,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:38:10,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:38:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:15,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:38:15,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=946833.3333333334, ans=0.2 2023-10-02 16:38:18,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:38:20,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:38:21,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:38:22,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:38:28,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:38:28,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:38:29,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=946900.0, ans=0.0 2023-10-02 16:38:37,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:38:38,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:38:47,389 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.883e+02 2.029e+02 2.294e+02 3.792e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 16:38:47,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:38:47,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=947033.3333333334, ans=0.0 2023-10-02 16:38:50,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:52,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 16:38:52,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 16:38:52,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 16:38:53,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.61 vs. limit=15.0 2023-10-02 16:38:54,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 16:38:56,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:38:58,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 16:39:02,574 INFO [train.py:1046] (1/4) Epoch 27, batch 3950, loss[loss=0.1422, simple_loss=0.226, pruned_loss=0.02925, over 24285.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2417, pruned_loss=0.04423, over 4700900.07 frames. ], batch size: 61, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:39:02,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:39:04,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 16:39:04,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:39:07,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:39:09,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:39:14,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=947100.0, ans=0.125 2023-10-02 16:39:15,340 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 16:39:15,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:39:15,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 16:39:16,719 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 16:39:16,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:39:19,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:39:21,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:39:21,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:39:23,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=947166.6666666666, ans=0.125 2023-10-02 16:39:24,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 16:39:28,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:39:28,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:39:29,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:39:31,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:39:31,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:39:36,325 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.95 vs. limit=15.0 2023-10-02 16:39:38,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=947233.3333333334, ans=0.125 2023-10-02 16:39:40,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:39:40,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:39:42,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=947233.3333333334, ans=0.0 2023-10-02 16:39:45,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 16:39:51,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 16:39:51,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. 
Duration: 0.85 2023-10-02 16:39:51,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:39:53,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:39:54,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=947300.0, ans=0.05 2023-10-02 16:39:55,120 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.07 vs. limit=22.5 2023-10-02 16:40:02,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:40:02,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:40:03,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:40:03,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:40:03,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 16:40:09,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:40:09,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:40:13,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. 
Duration: 0.96 2023-10-02 16:40:16,474 INFO [train.py:1046] (1/4) Epoch 27, batch 4000, loss[loss=0.1588, simple_loss=0.2496, pruned_loss=0.03397, over 24447.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.243, pruned_loss=0.04431, over 4708782.40 frames. ], batch size: 69, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:40:21,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=947433.3333333334, ans=0.125 2023-10-02 16:40:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:31,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:37,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:40:37,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:40:38,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:38,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 16:40:40,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:40:40,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 16:40:40,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 16:40:40,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 16:40:41,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:40:44,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:40:44,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:40:44,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:40:45,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:40:45,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:40:47,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:40:48,945 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 16:40:49,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:40:50,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:40:51,924 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-10-02 16:40:53,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:40:53,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:41:00,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 16:41:00,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:41:03,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:41:04,000 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 16:41:06,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:41:06,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 16:41:06,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:41:08,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:41:08,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:41:11,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 16:41:11,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:41:11,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:41:14,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 16:41:14,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:41:15,413 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.843e+02 2.082e+02 2.345e+02 3.565e+02, threshold=4.163e+02, percent-clipped=0.0 2023-10-02 16:41:15,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 16:41:19,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:41:22,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 16:41:24,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:41:24,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:41:24,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=947700.0, ans=0.07 2023-10-02 16:41:25,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 16:41:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:41:28,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=947700.0, ans=0.125 2023-10-02 16:41:30,557 INFO [train.py:1046] (1/4) Epoch 27, batch 4050, loss[loss=0.2222, simple_loss=0.2822, pruned_loss=0.08111, over 19467.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2444, pruned_loss=0.0448, over 4705470.60 frames. ], batch size: 388, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:41:32,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:41:35,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:41:35,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 16:41:36,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:41:36,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:41:38,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:41:38,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=947766.6666666666, ans=0.2 2023-10-02 16:41:39,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-10-02 16:41:41,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:41:41,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=947766.6666666666, ans=0.2 2023-10-02 16:41:44,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:41:45,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=947833.3333333334, ans=0.125 2023-10-02 16:41:46,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:41:46,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:41:49,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:41:49,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:41:52,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:41:55,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:41:58,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. 
Duration: 0.4333125 2023-10-02 16:42:01,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 16:42:01,657 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 16:42:03,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:42:05,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=947900.0, ans=0.125 2023-10-02 16:42:05,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=947900.0, ans=0.1 2023-10-02 16:42:10,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 16:42:12,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:42:12,985 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-10-02 16:42:14,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:42:16,612 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:42:17,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:42:17,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:42:17,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:42:21,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:42:26,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 16:42:26,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:42:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:42:29,142 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.45 vs. limit=22.5 2023-10-02 16:42:29,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 16:42:33,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:42:39,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 16:42:42,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:42:42,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:42:42,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-10-02 16:42:42,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 16:42:42,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:42:45,020 INFO [train.py:1046] (1/4) Epoch 27, batch 4100, loss[loss=0.1784, simple_loss=0.2494, pruned_loss=0.05368, over 23620.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2452, pruned_loss=0.04514, over 4697751.34 frames. ], batch size: 232, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:42:45,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:42:45,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:45,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:42:46,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=948100.0, ans=0.125 2023-10-02 16:42:51,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 16:42:54,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 16:42:56,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 16:42:57,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-10-02 16:42:57,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:42:57,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:59,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:59,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:43:00,758 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 16:43:02,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=948166.6666666666, ans=0.0 2023-10-02 16:43:04,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:43:04,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=948166.6666666666, ans=0.2 2023-10-02 16:43:05,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:43:05,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:43:06,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:43:08,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.55 vs. 
limit=15.0 2023-10-02 16:43:10,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:43:11,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:43:11,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:43:11,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 16:43:13,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:43:13,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:43:13,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:43:13,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:43:13,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 16:43:17,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:18,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 16:43:18,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:43:18,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=948233.3333333334, ans=0.1 2023-10-02 16:43:21,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:43:21,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 16:43:22,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:43:22,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:43:22,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:43:24,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 16:43:25,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:43:27,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:43:28,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 16:43:28,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:43:29,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 16:43:33,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:39,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:43:42,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:43:42,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:43:44,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=948366.6666666666, ans=0.2 2023-10-02 16:43:45,429 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.836e+02 1.976e+02 2.197e+02 2.879e+02, threshold=3.952e+02, percent-clipped=0.0 2023-10-02 16:43:48,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:43:48,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:48,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=948366.6666666666, ans=0.0 2023-10-02 16:43:48,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=948366.6666666666, ans=0.95 2023-10-02 16:43:51,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 16:43:53,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:43:54,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=948366.6666666666, ans=0.125 2023-10-02 16:43:54,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=948366.6666666666, ans=0.125 2023-10-02 16:43:56,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:43:58,456 INFO [train.py:1046] (1/4) Epoch 27, batch 4150, loss[loss=0.1809, simple_loss=0.2453, pruned_loss=0.05823, over 23819.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2461, pruned_loss=0.04603, over 4692927.87 frames. ], batch size: 195, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:43:58,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:43:59,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:43:59,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:44:03,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 16:44:03,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:44:04,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. 
Duration: 0.49 2023-10-02 16:44:04,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 16:44:04,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 16:44:07,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:44:13,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:44:13,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:44:15,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=15.0 2023-10-02 16:44:17,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:44:19,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:44:19,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:44:21,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:44:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 16:44:22,639 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.37 vs. limit=12.0 2023-10-02 16:44:23,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:44:26,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:44:29,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:44:29,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=948566.6666666666, ans=0.2 2023-10-02 16:44:31,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 16:44:34,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 16:44:34,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:44:36,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 16:44:36,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:44:36,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:44:37,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:44:37,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:44:43,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 16:44:46,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:44:48,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:44:49,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 16:44:49,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:44:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 16:44:53,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:44:53,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=948633.3333333334, ans=0.125 2023-10-02 16:44:54,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:44:56,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:44:57,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. 
Duration: 0.74 2023-10-02 16:44:57,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:44:57,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:44:58,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:45:00,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 16:45:00,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:45:02,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:45:02,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:45:02,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 16:45:02,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:45:02,696 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:45:03,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:45:05,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:45:06,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:45:06,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 16:45:06,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:45:11,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:45:12,423 INFO [train.py:1046] (1/4) Epoch 27, batch 4200, loss[loss=0.1604, simple_loss=0.2529, pruned_loss=0.03394, over 24341.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2455, pruned_loss=0.04559, over 4692958.78 frames. ], batch size: 74, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:45:12,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 16:45:13,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:45:14,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=948766.6666666666, ans=0.1 2023-10-02 16:45:14,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.09 vs. limit=15.0 2023-10-02 16:45:15,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:45:18,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 16:45:19,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:45:19,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:45:20,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 16:45:23,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 16:45:23,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:25,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:45:26,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:45:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:45:33,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:45:33,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:33,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 16:45:33,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 16:45:34,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:35,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.51 vs. limit=15.0 2023-10-02 16:45:36,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:45:36,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:45:36,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:45:39,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 16:45:39,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:43,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=948900.0, ans=0.2 2023-10-02 16:45:45,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:45:45,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:45:48,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 16:45:49,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:45:50,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:45:50,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 16:45:51,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:45:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:45:55,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:45:57,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:46:02,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.72 vs. limit=22.5 2023-10-02 16:46:04,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:46:08,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-02 16:46:09,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=948966.6666666666, ans=0.1 2023-10-02 16:46:10,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:46:12,984 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.962e+02 2.274e+02 2.740e+02 4.088e+02, threshold=4.548e+02, percent-clipped=1.0 2023-10-02 16:46:14,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:46:14,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:15,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 16:46:17,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=949033.3333333334, ans=0.09899494936611666 2023-10-02 16:46:22,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:46:25,307 INFO [train.py:1046] (1/4) Epoch 27, batch 4250, loss[loss=0.1753, simple_loss=0.2563, pruned_loss=0.04716, over 23550.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2438, pruned_loss=0.04498, over 4685997.19 frames. ], batch size: 94, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:46:26,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:46:26,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 16:46:28,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=949100.0, ans=0.2 2023-10-02 16:46:30,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:35,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:46:35,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 16:46:36,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:46:39,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:42,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:46:45,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:45,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:46:46,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:46:48,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:46:49,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:46:50,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:46:52,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:54,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:46:56,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:46:56,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 16:46:57,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=949233.3333333334, ans=0.2 2023-10-02 16:47:02,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 16:47:02,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:47:02,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:02,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:47:04,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:47:04,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:47:05,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.77 vs. limit=22.5 2023-10-02 16:47:05,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:47:08,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 16:47:09,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:47:13,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.56 vs. limit=22.5 2023-10-02 16:47:14,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:47:15,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:15,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=949300.0, ans=0.125 2023-10-02 16:47:15,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=949300.0, ans=0.07 2023-10-02 16:47:17,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 16:47:17,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:47:17,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. 
Duration: 0.98 2023-10-02 16:47:18,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:47:19,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=949300.0, ans=0.1 2023-10-02 16:47:20,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:47:20,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=949300.0, ans=0.125 2023-10-02 16:47:21,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:21,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:47:24,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 16:47:25,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:47:27,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:47:31,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:35,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 16:47:36,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:47:38,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:47:39,727 INFO [train.py:1046] (1/4) Epoch 27, batch 4300, loss[loss=0.1794, simple_loss=0.2542, pruned_loss=0.05229, over 23398.00 frames. ], tot_loss[loss=0.166, simple_loss=0.243, pruned_loss=0.04448, over 4696821.22 frames. ], batch size: 105, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:47:39,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:47:41,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:47:41,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 16:47:41,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:41,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=949433.3333333334, ans=0.0 2023-10-02 16:47:46,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:47:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:47:50,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:47:58,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:58,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 16:47:59,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:48:01,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:48:01,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:48:02,688 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 16:48:06,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:48:08,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:48:11,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 16:48:11,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:48:11,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 16:48:13,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-02 16:48:17,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:48:18,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:48:18,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:48:20,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:48:20,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=949566.6666666666, ans=0.0 2023-10-02 16:48:21,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:48:23,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:48:23,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 16:48:24,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 16:48:27,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:48:28,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:28,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-02 16:48:28,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:28,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:48:28,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 16:48:28,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 16:48:29,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 16:48:30,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:48:31,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 16:48:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 16:48:34,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:48:34,790 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 16:48:36,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:48:38,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:48:38,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:48:40,906 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.825e+02 1.972e+02 2.208e+02 2.993e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-02 16:48:42,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 16:48:42,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:48:44,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:44,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:48:44,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:48:45,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:48:47,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:48:49,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:48:51,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:48:51,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:48:54,101 INFO [train.py:1046] (1/4) Epoch 27, batch 4350, loss[loss=0.1768, simple_loss=0.249, pruned_loss=0.0523, over 23378.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2438, pruned_loss=0.0447, over 4707090.01 frames. ], batch size: 119, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:48:56,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 16:48:57,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:49:01,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:02,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:49:04,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=949766.6666666666, ans=0.5 2023-10-02 16:49:05,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:49:05,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:49:12,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 16:49:15,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=949833.3333333334, ans=0.0 2023-10-02 16:49:16,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:49:19,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:49:19,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:49:23,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:49:25,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:49:25,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:49:30,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 16:49:30,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:32,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:32,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=949900.0, ans=0.0 2023-10-02 16:49:35,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:49:37,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 16:49:42,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:49:44,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:49:46,381 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 16:49:48,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:49:48,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:49:49,725 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 16:49:49,792 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 16:49:49,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:49:49,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:49,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:49:51,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:49:52,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:49:52,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:49:55,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 16:49:55,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:55,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:49:55,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:56,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 16:49:58,089 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 16:49:58,095 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 16:49:58,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 16:50:00,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:50:02,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 16:50:02,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:02,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:50:03,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=950033.3333333334, ans=0.1 2023-10-02 16:50:04,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 16:50:06,816 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 16:50:06,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:08,105 INFO [train.py:1046] (1/4) Epoch 27, batch 4400, loss[loss=0.1419, simple_loss=0.2189, pruned_loss=0.03243, over 24333.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2445, pruned_loss=0.04466, over 4713528.19 frames. ], batch size: 56, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:50:11,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:50:11,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:13,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:50:15,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-10-02 16:50:15,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 16:50:16,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 16:50:16,642 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 16:50:18,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:50:18,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:50:18,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=950100.0, ans=0.0 2023-10-02 16:50:19,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=950100.0, ans=0.125 2023-10-02 16:50:20,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 16:50:22,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:22,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 16:50:24,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:50:24,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 16:50:25,002 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 16:50:28,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 16:50:28,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 16:50:29,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 16:50:30,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:30,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:50:31,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:50:31,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:50:34,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 16:50:34,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 16:50:36,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 16:50:36,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=950233.3333333334, ans=0.125 2023-10-02 16:50:38,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:50:38,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:40,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:42,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:42,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 16:50:42,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 16:50:47,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:51,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:50:54,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 16:50:58,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:51:01,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:51:02,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:51:04,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 16:51:04,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:51:04,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:51:04,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:51:05,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:51:08,772 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.837e+02 2.009e+02 2.278e+02 3.254e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 16:51:10,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 16:51:11,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 16:51:13,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 16:51:14,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:14,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-10-02 16:51:14,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:51:19,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:51:20,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 16:51:22,141 INFO [train.py:1046] (1/4) Epoch 27, batch 4450, loss[loss=0.2029, simple_loss=0.2722, pruned_loss=0.06682, over 19830.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2456, pruned_loss=0.04501, over 4701586.45 frames. ], batch size: 390, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:51:23,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:51:26,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:26,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:51:32,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:51:32,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 16:51:33,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=950433.3333333334, ans=0.125 2023-10-02 16:51:33,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=950433.3333333334, ans=0.125 2023-10-02 16:51:34,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:36,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:51:36,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=950500.0, ans=0.2 2023-10-02 16:51:39,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:51:40,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:40,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 16:51:40,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:51:42,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:42,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:51:42,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:51:47,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:51:47,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=950500.0, ans=0.1 2023-10-02 16:51:50,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=950566.6666666666, ans=0.2 2023-10-02 16:51:53,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:51:53,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:51:53,851 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.28 vs. limit=15.0 2023-10-02 16:51:54,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:51:55,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:57,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:52:01,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:52:03,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. 
Duration: 0.85 2023-10-02 16:52:03,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 16:52:03,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:52:05,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:52:05,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=950633.3333333334, ans=0.2 2023-10-02 16:52:06,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 16:52:10,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:52:13,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:52:13,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 16:52:15,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:15,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:52:15,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:52:15,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:52:17,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:52:21,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:52:21,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 16:52:23,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:52:24,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:52:24,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=950700.0, ans=0.0 2023-10-02 16:52:25,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:52:28,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:28,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:52:31,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:52:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 16:52:35,453 INFO [train.py:1046] (1/4) Epoch 27, batch 4500, loss[loss=0.1383, simple_loss=0.2166, pruned_loss=0.03003, over 22341.00 frames. 
], tot_loss[loss=0.1688, simple_loss=0.2463, pruned_loss=0.04564, over 4702405.45 frames. ], batch size: 49, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:52:35,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:52:36,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.82 vs. limit=15.0 2023-10-02 16:52:38,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:52:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 16:52:39,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 16:52:41,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:52:43,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=950766.6666666666, ans=0.125 2023-10-02 16:52:45,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:46,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:52:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:52:48,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 16:52:48,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:52:49,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:52:59,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:01,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:53:04,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:53:05,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:53:05,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:53:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:53:14,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:53:19,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:53:22,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 16:53:23,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.01 vs. limit=15.0 2023-10-02 16:53:23,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 16:53:23,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:25,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:53:28,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:53:28,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:53:30,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:53:30,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 16:53:30,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:53:30,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:34,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 16:53:34,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:53:37,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:37,905 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:53:38,922 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.864e+02 2.019e+02 2.246e+02 3.268e+02, threshold=4.037e+02, percent-clipped=0.0 2023-10-02 16:53:39,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:53:39,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:53:42,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 16:53:45,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 16:53:45,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 16:53:48,171 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:53:49,798 INFO [train.py:1046] (1/4) Epoch 27, batch 4550, loss[loss=0.1509, simple_loss=0.2324, pruned_loss=0.03465, over 20683.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2455, pruned_loss=0.0453, over 4703042.11 frames. 
], batch size: 45, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:53:49,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 16:53:51,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 16:53:53,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:53:54,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:56,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:58,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:00,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=951100.0, ans=0.125 2023-10-02 16:54:01,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:54:02,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=951100.0, ans=0.1 2023-10-02 16:54:03,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:54:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 16:54:05,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:54:05,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:08,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:08,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:54:13,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:54:16,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 16:54:17,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 16:54:18,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:54:19,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 16:54:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 16:54:24,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:54:26,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. 
Duration: 0.74 2023-10-02 16:54:28,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:54:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:31,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:31,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=951233.3333333334, ans=0.0 2023-10-02 16:54:32,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:54:33,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 16:54:36,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:54:39,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:39,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:54:40,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:42,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 16:54:42,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-10-02 16:54:42,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:54:43,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 16:54:45,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 16:54:45,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:46,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:46,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:54:48,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:49,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:54:50,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:54:52,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 16:54:53,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:54:54,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-02 16:54:54,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 16:54:54,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:54:54,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 16:54:57,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:54:57,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:54:58,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:54:58,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:59,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:55:01,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:55:02,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:55:03,698 INFO [train.py:1046] (1/4) Epoch 27, batch 4600, loss[loss=0.1517, simple_loss=0.2412, pruned_loss=0.03115, over 24301.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.244, pruned_loss=0.0452, over 4700463.53 frames. 
], batch size: 74, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:55:05,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:06,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:55:09,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:55:09,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:55:10,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:11,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 16:55:13,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:55:16,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:55:19,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:20,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:28,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. 
Duration: 0.98 2023-10-02 16:55:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:31,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:32,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=951566.6666666666, ans=0.125 2023-10-02 16:55:33,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:55:34,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:39,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 16:55:39,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:55:39,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:55:44,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:45,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:55:47,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:55:50,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-10-02 16:55:52,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:55:56,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:55:56,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:55:59,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:55:59,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 16:56:00,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:00,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 16:56:00,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:00,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:03,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:03,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:56:04,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 16:56:05,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 16:56:06,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 16:56:06,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 16:56:06,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:07,497 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.372e+02 1.867e+02 2.097e+02 2.357e+02 3.231e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 16:56:07,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:56:07,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:10,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:18,270 INFO [train.py:1046] (1/4) Epoch 27, batch 4650, loss[loss=0.1629, simple_loss=0.2514, pruned_loss=0.0372, over 24656.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2432, pruned_loss=0.04464, over 4702708.78 frames. ], batch size: 73, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:56:20,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 16:56:21,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=951766.6666666666, ans=0.0 2023-10-02 16:56:24,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:56:24,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:25,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:56:25,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:26,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:56:27,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:27,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=951766.6666666666, ans=0.0 2023-10-02 16:56:29,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 16:56:34,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:56:35,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 16:56:35,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 16:56:37,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 16:56:37,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:56:37,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 16:56:38,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 16:56:38,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:38,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:56:41,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:56:41,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:56:43,339 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 16:56:45,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:56:47,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 16:56:50,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:56:50,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:56:51,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 16:56:52,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:56:58,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:57:01,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:05,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.65 vs. limit=6.0 2023-10-02 16:57:05,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.56 vs. limit=15.0 2023-10-02 16:57:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:07,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:57:07,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:08,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 16:57:11,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 16:57:12,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 16:57:14,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 16:57:14,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 16:57:16,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:17,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=952033.3333333334, ans=0.125 2023-10-02 16:57:24,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:57:24,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:57:24,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 16:57:24,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:24,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:57:25,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 16:57:27,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:57:28,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:57:28,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:57:28,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:57:31,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:31,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:57:31,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:57:32,743 INFO [train.py:1046] (1/4) Epoch 27, batch 4700, loss[loss=0.1771, simple_loss=0.2539, pruned_loss=0.05014, over 23948.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2437, pruned_loss=0.0447, over 4697981.01 frames. ], batch size: 86, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:57:32,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 16:57:34,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:57:36,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. 
Duration: 0.72 2023-10-02 16:57:38,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.44 vs. limit=6.0 2023-10-02 16:57:43,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:44,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:44,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:57:46,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:57:47,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:57:52,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 16:57:52,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 16:57:54,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:55,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:57:55,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 16:57:59,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:58:02,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=952233.3333333334, ans=0.125 2023-10-02 16:58:06,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:58:07,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:58:09,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:58:14,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 16:58:16,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:58:18,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:18,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=952300.0, ans=0.05 2023-10-02 16:58:21,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 16:58:24,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:58:28,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 16:58:28,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 16:58:31,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:31,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:58:31,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=952366.6666666666, ans=0.125 2023-10-02 16:58:33,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:58:34,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:58:34,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 16:58:36,115 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.791e+02 1.934e+02 2.224e+02 4.121e+02, threshold=3.867e+02, percent-clipped=0.0 2023-10-02 16:58:36,206 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 16:58:37,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:58:40,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-10-02 16:58:41,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 16:58:41,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:41,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 16:58:41,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:41,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=952366.6666666666, ans=0.1 2023-10-02 16:58:45,407 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-02 16:58:45,877 INFO [train.py:1046] (1/4) Epoch 27, batch 4750, loss[loss=0.1704, simple_loss=0.2509, pruned_loss=0.04491, over 24482.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2444, pruned_loss=0.0448, over 4702696.10 frames. ], batch size: 66, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:58:45,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 16:58:47,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:58:48,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:58:54,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:58:54,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 16:58:55,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 16:58:55,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:01,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 16:59:02,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:59:02,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:59:04,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:08,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 16:59:12,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:59:14,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 16:59:15,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:18,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:59:18,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 16:59:18,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:59:19,826 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 16:59:19,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 16:59:25,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 16:59:28,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:30,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:59:33,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:59:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 16:59:33,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:59:35,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:59:37,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:59:40,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-10-02 16:59:40,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 16:59:40,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:59:40,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:59:40,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:42,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:59:42,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 16:59:42,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.72 vs. limit=15.0 2023-10-02 16:59:44,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 16:59:46,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=952700.0, ans=0.0 2023-10-02 16:59:48,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:59:50,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:59:50,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-10-02 16:59:51,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:53,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:59:54,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:59:56,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:59:56,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:59:59,417 INFO [train.py:1046] (1/4) Epoch 27, batch 4800, loss[loss=0.1702, simple_loss=0.25, pruned_loss=0.04519, over 24462.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2454, pruned_loss=0.04486, over 4708048.77 frames. ], batch size: 66, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 16:59:59,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:00,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 17:00:00,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 17:00:03,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 17:00:04,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 17:00:04,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:05,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=952766.6666666666, ans=0.0 2023-10-02 17:00:06,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 17:00:10,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:10,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:16,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:00:17,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.72 vs. limit=15.0 2023-10-02 17:00:17,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:17,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:18,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 17:00:18,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=952833.3333333334, ans=0.0 2023-10-02 17:00:19,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:00:19,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:00:22,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:00:25,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:00:26,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:28,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:00:28,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:30,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 17:00:30,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:32,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:33,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:36,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:00:36,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:36,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:00:37,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 17:00:39,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:42,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 17:00:42,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 17:00:44,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:44,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:00:44,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:00:44,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:00:44,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 17:00:44,844 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-10-02 17:00:46,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:00:46,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:00:51,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:52,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:00:55,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:00:57,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=953033.3333333334, ans=0.125 2023-10-02 17:00:58,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 17:00:58,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:58,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:00:58,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=953033.3333333334, ans=0.125 2023-10-02 17:01:00,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:01:00,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:01:01,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=953033.3333333334, ans=0.1 2023-10-02 17:01:02,772 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.815e+02 2.074e+02 2.431e+02 3.640e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 17:01:04,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:01:04,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:01:05,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:01:06,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:01:06,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:01:10,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:01:12,201 INFO [train.py:1046] (1/4) Epoch 27, batch 4850, loss[loss=0.1558, simple_loss=0.2375, pruned_loss=0.03709, over 24662.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2457, pruned_loss=0.04479, over 4717788.69 frames. ], batch size: 65, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:01:12,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:12,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:01:12,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=953100.0, ans=0.125 2023-10-02 17:01:14,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 17:01:14,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=953100.0, ans=0.125 2023-10-02 17:01:15,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 17:01:15,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:01:15,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:01:16,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:01:16,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:01:19,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:01:24,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 17:01:27,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:32,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:01:32,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:01:33,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:36,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:37,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:01:38,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:01:39,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 17:01:43,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:01:45,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 17:01:46,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:01:46,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:01:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 17:01:49,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:01:49,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:01:52,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:01:52,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 17:01:52,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 17:01:54,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:02:02,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:02:03,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 17:02:05,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 17:02:05,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:02:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:02:07,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 17:02:07,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:02:07,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 17:02:09,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:09,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:02:09,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 17:02:18,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:02:24,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:02:24,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:02:26,971 INFO [train.py:1046] (1/4) Epoch 27, batch 4900, loss[loss=0.1555, simple_loss=0.2479, pruned_loss=0.03155, over 24309.00 frames. 
], tot_loss[loss=0.1668, simple_loss=0.2447, pruned_loss=0.04444, over 4714104.17 frames. ], batch size: 74, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:02:29,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 17:02:29,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:02:32,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=953433.3333333334, ans=0.125 2023-10-02 17:02:35,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:02:36,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:02:38,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 17:02:43,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 17:02:46,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 17:02:48,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 17:02:48,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 17:02:49,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:49,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:02:49,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:02:49,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:02:49,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 17:02:54,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 17:02:56,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:02:56,805 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.38 vs. limit=22.5 2023-10-02 17:02:57,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:02:57,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:02:58,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:03:00,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 17:03:01,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:01,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 17:03:03,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:03:04,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:03:04,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 17:03:04,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 17:03:08,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 17:03:10,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:03:12,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:03:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:03:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:14,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-02 17:03:14,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:03:14,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 17:03:17,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:18,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:03:21,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:03:23,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=953633.3333333334, ans=0.125 2023-10-02 17:03:24,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 17:03:24,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:03:26,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 17:03:26,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-10-02 17:03:29,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=953700.0, ans=0.125 2023-10-02 17:03:30,817 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.829e+02 2.030e+02 2.368e+02 3.684e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 17:03:34,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:03:34,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:03:35,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 17:03:36,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:03:36,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:03:38,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:41,070 INFO [train.py:1046] (1/4) Epoch 27, batch 4950, loss[loss=0.1702, simple_loss=0.2401, pruned_loss=0.05011, over 23862.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2436, pruned_loss=0.04432, over 4712679.54 frames. ], batch size: 212, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:03:41,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:03:41,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 17:03:41,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:03:41,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=953766.6666666666, ans=0.0 2023-10-02 17:03:42,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 17:03:42,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:03:45,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:03:45,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:03:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 17:03:48,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 17:03:49,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:03:49,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 17:03:49,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:49,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 17:03:50,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:03:50,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:03:53,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:53,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:03:55,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:03:56,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:03:57,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:57,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:04:01,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:04:01,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=953833.3333333334, ans=0.2 2023-10-02 17:04:02,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.01 vs. 
limit=12.0 2023-10-02 17:04:06,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:07,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:04:09,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:10,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:12,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:04:12,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 17:04:13,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 17:04:16,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:17,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:04:17,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:04:17,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:04:17,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:04:19,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:04:23,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:04:24,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:04:25,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:04:28,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:28,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:29,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 17:04:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:04:31,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:04:32,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=953966.6666666666, ans=0.2 2023-10-02 17:04:34,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:04:35,281 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-10-02 17:04:37,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:04:37,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:04:38,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:38,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:04:38,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:04:41,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:04:41,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:04:41,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:04:43,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 17:04:47,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 17:04:52,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 17:04:52,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:04:55,560 INFO [train.py:1046] (1/4) Epoch 27, batch 5000, loss[loss=0.1622, simple_loss=0.2373, pruned_loss=0.04355, over 23434.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2428, pruned_loss=0.04425, over 4712244.16 frames. ], batch size: 134, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:04:58,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:58,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:05:00,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 17:05:01,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 17:05:04,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:05:05,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 17:05:05,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:05:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 17:05:06,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=954100.0, ans=0.1 2023-10-02 17:05:07,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 17:05:08,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:09,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:05:09,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 17:05:09,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:05:11,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:05:12,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 17:05:13,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 17:05:14,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:05:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 17:05:15,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 17:05:16,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:16,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:05:16,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 17:05:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 17:05:18,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 17:05:18,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:19,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:21,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 17:05:21,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:05:22,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:24,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:05:26,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 17:05:27,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 17:05:27,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:05:29,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:05:33,786 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 17:05:37,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:05:38,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:38,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:05:41,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 17:05:41,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:41,622 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:05:42,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:05:42,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:05:44,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 17:05:44,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:05:48,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:05:49,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:05:56,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 17:05:59,918 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.877e+02 2.124e+02 2.613e+02 3.895e+02, threshold=4.248e+02, percent-clipped=0.0 2023-10-02 17:06:00,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:09,333 INFO [train.py:1046] (1/4) Epoch 27, batch 5050, loss[loss=0.1708, simple_loss=0.253, pruned_loss=0.04436, over 23283.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.244, pruned_loss=0.04454, over 4721851.72 frames. ], batch size: 93, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:06:09,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:06:10,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:10,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 17:06:10,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:06:10,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:06:12,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:06:12,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:16,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:16,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 17:06:16,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:06:16,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=954433.3333333334, ans=0.2 2023-10-02 17:06:16,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=954433.3333333334, ans=0.125 2023-10-02 17:06:19,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:06:20,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:06:21,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. 
Duration: 0.94 2023-10-02 17:06:23,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:06:23,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:06:25,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:06:26,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:06:26,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:06:35,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 17:06:37,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:06:37,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:06:38,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 17:06:38,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:06:40,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:06:41,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:06:41,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 17:06:41,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 17:06:43,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:44,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=954566.6666666666, ans=0.125 2023-10-02 17:06:45,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:06:49,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:49,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 17:06:51,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:06:54,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 17:06:54,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=954633.3333333334, ans=0.0 2023-10-02 17:06:55,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 17:06:55,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:06:57,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:06:57,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:06:59,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:07:00,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:07:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:02,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=954633.3333333334, ans=10.0 2023-10-02 17:07:03,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:07:03,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:07:03,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-02 17:07:03,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=954633.3333333334, ans=0.125 2023-10-02 17:07:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:07:05,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:07:08,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:07:08,332 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 17:07:08,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:07:11,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:07:11,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:11,135 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 17:07:14,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:07:14,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 17:07:14,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:07:17,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:07:18,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 17:07:20,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 17:07:23,382 INFO [train.py:1046] (1/4) Epoch 27, batch 5100, loss[loss=0.1474, simple_loss=0.2298, pruned_loss=0.03247, over 24441.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2444, pruned_loss=0.04447, over 4733402.57 frames. ], batch size: 63, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:07:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:23,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:07:23,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:07:26,271 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 17:07:29,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:07:32,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-10-02 17:07:32,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 17:07:33,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:34,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:07:37,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:07:37,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 17:07:39,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 17:07:42,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:07:42,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:07:45,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:47,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 17:07:48,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:07:49,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:07:49,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 17:07:53,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:54,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:54,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 17:07:56,026 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 17:07:57,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:59,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 17:07:59,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 17:08:03,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:08:11,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:14,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 17:08:14,112 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. 
Duration: 0.512 2023-10-02 17:08:15,293 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 17:08:16,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 17:08:16,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:08:18,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 17:08:22,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 17:08:23,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 17:08:25,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:08:28,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.816e+02 1.951e+02 2.163e+02 3.190e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-02 17:08:28,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 17:08:30,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=955033.3333333334, ans=0.125 2023-10-02 17:08:31,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:08:31,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. 
Duration: 0.86 2023-10-02 17:08:34,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=955033.3333333334, ans=0.04949747468305833 2023-10-02 17:08:36,954 INFO [train.py:1046] (1/4) Epoch 27, batch 5150, loss[loss=0.1522, simple_loss=0.2233, pruned_loss=0.04059, over 21508.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2451, pruned_loss=0.04474, over 4719208.98 frames. ], batch size: 47, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:08:37,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:08:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:08:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:08:38,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:08:38,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:08:40,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:08:40,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 17:08:40,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 17:08:41,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.27 vs. 
limit=15.0 2023-10-02 17:08:41,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 17:08:41,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:08:41,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 17:08:43,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:43,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 17:08:44,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:08:47,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:08:50,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:08:52,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 17:08:53,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:53,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:08:56,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 17:08:56,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:08:56,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:08:56,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:08:56,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:08:58,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 17:08:59,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:09:01,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:09:01,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:09:02,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 17:09:04,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:09:08,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=955233.3333333334, ans=0.125 2023-10-02 17:09:09,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 17:09:11,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 17:09:14,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:09:19,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:09:21,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:09:24,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:09:26,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:09:28,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 17:09:32,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:09:32,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:09:32,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:09:36,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.69 vs. 
limit=15.0 2023-10-02 17:09:36,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:09:38,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:09:38,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 17:09:41,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:09:44,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:09:46,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:09:46,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:09:46,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=955366.6666666666, ans=0.125 2023-10-02 17:09:46,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=955366.6666666666, ans=0.125 2023-10-02 17:09:47,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:09:47,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 17:09:47,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:09:49,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:09:51,795 INFO [train.py:1046] (1/4) Epoch 27, batch 5200, loss[loss=0.1569, simple_loss=0.2471, pruned_loss=0.03334, over 24689.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2463, pruned_loss=0.04506, over 4715836.32 frames. ], batch size: 73, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:09:51,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:09:53,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:09:55,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:09:59,367 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:10:02,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 17:10:02,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:10:02,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:05,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:10:06,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:10:06,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:06,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=955500.0, ans=0.125 2023-10-02 17:10:08,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 17:10:10,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:10:10,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:14,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 17:10:14,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=955500.0, ans=0.0 2023-10-02 17:10:14,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=955500.0, ans=0.09899494936611666 2023-10-02 17:10:16,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:10:18,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:10:18,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. 
Duration: 0.55 2023-10-02 17:10:18,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 17:10:21,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 17:10:22,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:22,939 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 17:10:22,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:25,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:25,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:10:26,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 17:10:26,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:10:28,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=955566.6666666666, ans=0.2 2023-10-02 17:10:29,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:10:31,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=955566.6666666666, ans=0.125 2023-10-02 17:10:33,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 17:10:34,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 17:10:34,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 17:10:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 17:10:40,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:10:40,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=955633.3333333334, ans=0.1 2023-10-02 17:10:43,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:10:43,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:10:44,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 17:10:44,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:46,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-02 17:10:46,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:47,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:10:50,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:10:53,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:10:56,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:56,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.31 vs. limit=22.5 2023-10-02 17:10:57,478 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.823e+02 2.038e+02 2.325e+02 3.987e+02, threshold=4.077e+02, percent-clipped=1.0 2023-10-02 17:10:57,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:10:57,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:01,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:11:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-10-02 17:11:01,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=955700.0, ans=0.125 2023-10-02 17:11:03,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:11:05,099 INFO [train.py:1046] (1/4) Epoch 27, batch 5250, loss[loss=0.1705, simple_loss=0.2604, pruned_loss=0.04029, over 24688.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.245, pruned_loss=0.04477, over 4723616.00 frames. ], batch size: 73, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:11:05,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:11:05,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:06,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:11:06,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:11:09,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:11:12,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:11:13,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:11:13,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 17:11:17,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=955766.6666666666, ans=0.1 2023-10-02 17:11:18,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:11:19,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=955833.3333333334, ans=0.1 2023-10-02 17:11:19,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=955833.3333333334, ans=0.125 2023-10-02 17:11:20,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:11:21,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:11:24,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:11:26,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 17:11:26,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:11:26,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:28,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=955833.3333333334, ans=0.2 2023-10-02 17:12:13,984 INFO [train.py:1046] (1/4) Epoch 27, batch 5300, loss[loss=0.143, simple_loss=0.1946, pruned_loss=0.04576, over 19385.00 frames. 
], tot_loss[loss=0.166, simple_loss=0.2433, pruned_loss=0.04438, over 4710807.02 frames. ], batch size: 389, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:12:15,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=956100.0, ans=0.125 2023-10-02 17:12:20,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.29 vs. limit=15.0 2023-10-02 17:12:25,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=956100.0, ans=0.125 2023-10-02 17:12:28,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:12:28,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 17:12:28,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 17:12:28,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:28,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:28,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:28,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:28,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:12:28,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:12:28,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:12:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:12:29,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 17:12:29,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 17:12:29,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 17:12:29,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:12:29,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 17:12:29,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 17:12:29,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:30,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:12:30,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:12:30,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:12:30,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:12:30,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:12:30,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:30,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:30,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:12:30,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:30,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:12:30,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:30,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:12:31,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-10-02 17:12:31,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:12:31,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:31,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 17:12:31,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 17:12:31,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:12:31,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:12:31,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 17:12:32,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 17:12:32,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:12:32,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:12:33,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:12:33,085 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-10-02 17:12:33,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 17:12:33,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:12:33,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:33,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 17:12:33,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 17:12:33,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 17:12:33,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:12:40,113 INFO [train.py:1046] (1/4) Epoch 28, batch 0, loss[loss=0.1805, simple_loss=0.264, pruned_loss=0.04846, over 24017.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.264, pruned_loss=0.04846, over 24017.00 frames. ], batch size: 80, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:12:40,113 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 17:12:52,161 INFO [train.py:1078] (1/4) Epoch 28, validation: loss=0.3134, simple_loss=0.267, pruned_loss=0.1799, over 1125622.00 frames. 2023-10-02 17:12:52,161 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 17:12:54,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-10-02 17:12:56,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:12:57,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:13:03,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:03,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:13:03,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:05,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 17:13:07,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 17:13:08,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:09,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:12,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:13,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 17:13:13,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:13:15,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 17:13:15,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=956246.6666666666, ans=0.1 2023-10-02 17:13:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:13:24,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:13:24,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:26,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 17:13:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:13:30,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:13:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:13:36,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:13:38,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=956380.0, ans=0.125 2023-10-02 17:13:39,043 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.24 vs. limit=15.0 2023-10-02 17:13:39,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:13:41,020 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.877e+02 2.089e+02 2.395e+02 3.641e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 17:13:45,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 17:13:46,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 17:13:48,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:13:48,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:13:49,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:13:49,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:51,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. 
Duration: 0.85 2023-10-02 17:13:54,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:13:56,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:14:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:14:01,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.96 vs. limit=15.0 2023-10-02 17:14:03,408 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 17:14:04,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:14:06,217 INFO [train.py:1046] (1/4) Epoch 28, batch 50, loss[loss=0.1646, simple_loss=0.2503, pruned_loss=0.03946, over 24465.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2466, pruned_loss=0.04516, over 1061542.67 frames. ], batch size: 69, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:14:06,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:14:09,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:14:09,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 17:14:09,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 17:14:09,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:14:11,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=956513.3333333334, ans=0.0 2023-10-02 17:14:12,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:14:13,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:14:16,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:14:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 17:14:19,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:25,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:14:27,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 17:14:30,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 17:14:31,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:14:33,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 17:14:33,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:34,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:14:35,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:14:37,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:14:37,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:40,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=956646.6666666666, ans=0.1 2023-10-02 17:14:46,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:14:46,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:14:46,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:14:46,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 17:14:49,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:14:49,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 17:14:49,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 17:14:50,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:14:51,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.22 vs. limit=6.0 2023-10-02 17:14:52,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 17:14:59,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:14:59,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:15:01,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:03,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:15:03,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:15:04,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 17:15:04,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 17:15:06,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:15:07,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:15:08,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:15:08,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:15:09,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 17:15:09,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 17:15:10,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 17:15:12,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:12,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:15:13,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 17:15:13,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 17:15:15,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:15,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 17:15:17,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:15:17,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:15:19,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=956846.6666666666, ans=0.125 2023-10-02 17:15:20,555 INFO [train.py:1046] (1/4) Epoch 28, batch 100, loss[loss=0.1691, simple_loss=0.249, pruned_loss=0.04459, over 24293.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2464, pruned_loss=0.04498, over 1876517.69 frames. ], batch size: 61, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:15:20,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:15:22,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.02 vs. limit=6.0 2023-10-02 17:15:24,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:15:26,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:15:30,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 17:15:30,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:33,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 17:15:34,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:15:34,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:15:34,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:15:34,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:15:36,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 17:15:36,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:15:37,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:37,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:15:37,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:15:41,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 17:15:42,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:44,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:15:45,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:15:47,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:15:50,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 17:15:50,322 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 17:15:51,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.66 vs. limit=6.0 2023-10-02 17:15:51,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:15:51,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:15:55,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:15:57,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:59,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:01,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.51 vs. limit=10.0 2023-10-02 17:16:04,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:16:05,655 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 17:16:08,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 17:16:09,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=957046.6666666666, ans=0.1 2023-10-02 17:16:10,932 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.862e+02 2.053e+02 2.349e+02 3.571e+02, threshold=4.105e+02, percent-clipped=0.0 2023-10-02 17:16:12,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:16:12,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:16:13,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:18,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:21,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:16:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:16:25,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:25,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:16:27,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:27,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:16:28,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:29,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 17:16:29,986 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 17:16:30,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:31,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:16:33,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:33,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:33,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 17:16:33,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:16:33,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 17:16:33,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:34,550 INFO [train.py:1046] (1/4) Epoch 28, batch 150, loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04158, over 23604.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.245, pruned_loss=0.04416, over 2514265.49 frames. ], batch size: 149, lr: 3.69e-03, grad_scale: 8.0 2023-10-02 17:16:34,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:36,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:36,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:16:36,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:16:40,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:42,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:16:42,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:16:43,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:44,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:16:45,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:47,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:16:47,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:51,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.68 vs. limit=15.0 2023-10-02 17:16:53,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 17:16:53,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 17:16:53,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 17:16:56,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:16:56,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:16:56,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:16:57,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:57,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:16:59,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:59,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:17:00,506 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 17:17:02,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:17:09,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:17:11,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:17:11,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 17:17:15,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:17:15,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:17:15,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:17:17,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:17:18,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:17:18,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:17:19,223 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.35 vs. limit=15.0 2023-10-02 17:17:20,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:20,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 17:17:24,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=957380.0, ans=0.125 2023-10-02 17:17:25,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:27,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:17:27,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:17:27,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:17:30,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:31,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-02 17:17:33,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:17:36,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:17:39,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:17:42,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:17:42,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 17:17:42,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:17:42,158 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 17:17:46,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:17:47,777 INFO [train.py:1046] (1/4) Epoch 28, batch 200, loss[loss=0.156, simple_loss=0.2336, pruned_loss=0.03915, over 24249.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2453, pruned_loss=0.04443, over 3011767.12 frames. ], batch size: 56, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:17:47,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:17:47,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 17:17:48,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=957513.3333333334, ans=0.125 2023-10-02 17:17:50,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=957513.3333333334, ans=0.0 2023-10-02 17:17:52,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 17:17:52,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:17:53,092 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=12.0 2023-10-02 17:17:53,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:17:56,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 17:17:57,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:17:59,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:00,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:02,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:18:02,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:18:02,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:09,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=957580.0, ans=0.125 2023-10-02 17:18:24,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:18:25,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:18:26,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:18:27,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:18:29,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:18:29,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:18:31,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:33,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:18:33,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:18:33,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:18:34,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 17:18:36,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:18:36,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:37,243 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.956e+02 2.202e+02 2.604e+02 4.152e+02, threshold=4.404e+02, percent-clipped=1.0 2023-10-02 17:18:40,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:18:44,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:18:52,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:52,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:18:56,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=957780.0, ans=0.125 2023-10-02 17:18:57,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:00,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. 
Duration: 0.59 2023-10-02 17:19:00,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:19:00,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:19:00,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:19:01,477 INFO [train.py:1046] (1/4) Epoch 28, batch 250, loss[loss=0.1601, simple_loss=0.2417, pruned_loss=0.03922, over 18950.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2452, pruned_loss=0.04462, over 3372113.44 frames. ], batch size: 41, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:19:01,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:19:02,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 17:19:04,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:19:04,341 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 17:19:05,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:07,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:19:08,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:19:09,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:19:11,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:19:11,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:12,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:19:17,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:19:17,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=957913.3333333334, ans=0.125 2023-10-02 17:19:25,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.61 vs. limit=15.0 2023-10-02 17:19:26,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:19:28,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:19:28,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:19:38,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 17:19:38,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:19:38,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=957980.0, ans=0.125 2023-10-02 17:19:39,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:19:39,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:19:41,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:19:41,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:19:41,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:19:44,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:19:46,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 17:19:46,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:19:47,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:19:48,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 17:19:48,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:19:48,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:19:50,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:19:50,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:19:50,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=958046.6666666666, ans=0.125 2023-10-02 17:19:51,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:19:52,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:19:52,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:19:57,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:20:01,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:20:04,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 17:20:06,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=958113.3333333334, ans=0.125 2023-10-02 17:20:09,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:20:12,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:20:14,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 17:20:15,512 INFO [train.py:1046] (1/4) Epoch 28, batch 300, loss[loss=0.1522, simple_loss=0.2164, pruned_loss=0.04405, over 23555.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2438, pruned_loss=0.04481, over 3657746.18 frames. ], batch size: 256, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:20:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:20:17,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:20:18,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 17:20:18,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:20:20,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:20:20,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-10-02 17:20:20,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=958180.0, ans=0.125 2023-10-02 17:20:25,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:20:25,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:20:28,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:20:28,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 17:20:30,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:20:32,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:20:32,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 17:20:32,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:20:35,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:20:41,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:20:43,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. 
Duration: 0.88 2023-10-02 17:20:43,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=958313.3333333334, ans=0.125 2023-10-02 17:20:46,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 17:20:46,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:20:47,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:20:49,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:20:49,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 17:20:49,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:20:50,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:20:52,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=958313.3333333334, ans=0.1 2023-10-02 17:20:53,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:20:53,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:20:55,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.29 vs. limit=15.0 2023-10-02 17:20:56,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:20:56,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 17:20:58,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:21:01,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:02,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 17:21:03,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:05,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.790e+02 1.923e+02 2.144e+02 2.937e+02, threshold=3.846e+02, percent-clipped=0.0 2023-10-02 17:21:09,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:21:10,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:21:10,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-10-02 17:21:15,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:15,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:21:18,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:18,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:21:20,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 17:21:21,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:21:21,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:22,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 17:21:24,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:24,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:25,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:21:25,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:21:27,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:29,222 INFO [train.py:1046] (1/4) Epoch 28, batch 350, loss[loss=0.1611, simple_loss=0.2473, pruned_loss=0.03746, over 24436.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2433, pruned_loss=0.04413, over 3901399.12 frames. ], batch size: 69, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:21:31,061 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.40 vs. limit=15.0 2023-10-02 17:21:31,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:21:31,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 17:21:33,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=958513.3333333334, ans=0.125 2023-10-02 17:21:34,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:40,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:21:42,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:43,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:45,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. 
Duration: 0.97 2023-10-02 17:21:46,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:21:46,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 17:21:49,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:49,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 17:21:50,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:54,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 17:21:56,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:21:56,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.93 vs. limit=22.5 2023-10-02 17:21:57,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:58,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:22:00,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:00,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:22:00,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:22:00,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=958646.6666666666, ans=0.0 2023-10-02 17:22:01,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:01,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:22:03,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:22:03,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:22:09,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:10,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:22:12,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=958646.6666666666, ans=15.0 2023-10-02 17:22:12,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:22:12,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:22:15,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 17:22:15,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:22:18,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=958713.3333333334, ans=0.125 2023-10-02 17:22:20,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:20,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:22,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:22:22,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 17:22:25,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:26,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 17:22:26,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 17:22:26,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:31,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 17:22:31,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 17:22:33,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:34,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:22:37,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:37,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:37,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:40,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:42,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=958846.6666666666, ans=0.0 2023-10-02 17:22:43,233 INFO [train.py:1046] (1/4) Epoch 28, batch 400, loss[loss=0.1433, simple_loss=0.2242, pruned_loss=0.0312, over 24599.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2422, pruned_loss=0.04385, over 4064129.25 frames. ], batch size: 60, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:22:43,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:22:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 17:22:46,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 17:22:46,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:47,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:49,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:22:49,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:22:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:52,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=958846.6666666666, ans=0.125 2023-10-02 17:22:53,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:22:56,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 17:22:58,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 17:22:58,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:01,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. 
Duration: 0.97 2023-10-02 17:23:01,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:23:01,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=958913.3333333334, ans=0.125 2023-10-02 17:23:04,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:23:04,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:04,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 17:23:04,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=958913.3333333334, ans=0.0 2023-10-02 17:23:05,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:23:05,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:23:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:05,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:23:08,615 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 17:23:08,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. 
Duration: 0.94 2023-10-02 17:23:13,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:13,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:23:14,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 17:23:15,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=958980.0, ans=0.0 2023-10-02 17:23:16,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 17:23:18,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:23:21,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:23:27,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 17:23:27,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=959046.6666666666, ans=0.125 2023-10-02 17:23:30,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 17:23:31,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=959046.6666666666, ans=0.125 2023-10-02 17:23:31,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=959046.6666666666, ans=0.2 2023-10-02 17:23:32,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 17:23:34,765 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.847e+02 2.073e+02 2.548e+02 3.934e+02, threshold=4.147e+02, percent-clipped=1.0 2023-10-02 17:23:34,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:34,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:23:34,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 17:23:39,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:23:43,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:23:43,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:23:46,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:23:46,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=959113.3333333334, ans=0.1 2023-10-02 17:23:47,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 17:23:48,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:23:48,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 17:23:50,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:23:50,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:23:51,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 17:23:53,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:23:54,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:23:54,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:23:57,428 INFO [train.py:1046] (1/4) Epoch 28, batch 450, loss[loss=0.1537, simple_loss=0.2395, pruned_loss=0.03399, over 24078.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2429, pruned_loss=0.04418, over 4196050.03 frames. 
], batch size: 86, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:23:57,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 17:23:57,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:23:57,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:59,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:23:59,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 17:24:00,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:24:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:24:03,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:24:11,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 17:24:13,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=959246.6666666666, ans=0.0 2023-10-02 17:24:15,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 17:24:15,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 17:24:19,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:24:20,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=959246.6666666666, ans=0.035 2023-10-02 17:24:22,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:23,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:24:28,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:24:29,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:24:31,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 17:24:31,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 17:24:34,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. 
Duration: 0.96 2023-10-02 17:24:34,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:24:34,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:24:36,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:24:37,872 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 17:24:37,880 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 17:24:37,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:39,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:24:40,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 17:24:40,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=959380.0, ans=0.1 2023-10-02 17:24:45,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:24:45,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:24:46,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 17:24:46,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 17:24:49,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:24:52,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:24:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:24:53,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 17:24:55,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=959446.6666666666, ans=0.125 2023-10-02 17:24:56,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:24:56,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 17:24:57,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 17:24:59,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:25:02,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:25:02,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=959446.6666666666, ans=0.125 2023-10-02 17:25:04,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:25:05,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:25:05,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 17:25:06,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.44 vs. limit=22.5 2023-10-02 17:25:10,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:25:10,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=959513.3333333334, ans=0.0 2023-10-02 17:25:11,652 INFO [train.py:1046] (1/4) Epoch 28, batch 500, loss[loss=0.1501, simple_loss=0.2284, pruned_loss=0.03594, over 24621.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2439, pruned_loss=0.04446, over 4303289.40 frames. ], batch size: 60, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:25:11,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:25:11,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:25:11,754 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. 
Duration: 0.896 2023-10-02 17:25:13,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 17:25:13,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:25:16,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:25:19,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:25:20,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:25:22,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:25:22,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:25:23,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:31,495 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.18 vs. limit=15.0 2023-10-02 17:25:34,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:34,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 17:25:35,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:25:35,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:37,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 17:25:37,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:25:38,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:25:40,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:25:40,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:25:41,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:41,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 17:25:47,664 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 17:25:49,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:25:50,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:25:51,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:52,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:52,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:25:54,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 17:25:56,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:25:57,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:01,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=959713.3333333334, ans=0.07 2023-10-02 17:26:03,513 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.887e+02 2.063e+02 2.289e+02 3.276e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-02 17:26:03,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:06,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:26:10,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.59 vs. 
limit=15.0 2023-10-02 17:26:11,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:26:15,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 17:26:15,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:15,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:26:18,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 17:26:20,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:26:22,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:25,371 INFO [train.py:1046] (1/4) Epoch 28, batch 550, loss[loss=0.15, simple_loss=0.2401, pruned_loss=0.02998, over 24529.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2448, pruned_loss=0.04442, over 4394799.72 frames. ], batch size: 66, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:26:25,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 17:26:26,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 17:26:28,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:26:28,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 17:26:29,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:26:29,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:29,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:31,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:31,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:26:32,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:26:34,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:35,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 17:26:35,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:26:40,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:26:40,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:26:43,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:26:45,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 17:26:50,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 17:26:51,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:26:54,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=959980.0, ans=0.0 2023-10-02 17:26:55,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:26:55,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:26:56,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:27:02,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:03,353 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 17:27:04,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:27:04,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 17:27:08,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:27:09,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:27:09,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:27:10,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:12,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 17:27:12,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 17:27:13,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:13,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:27:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:27:13,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:27:16,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 17:27:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:27:21,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:27:21,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:23,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 17:27:24,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:27:27,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:27,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:27:28,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:28,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=960113.3333333334, ans=0.2 2023-10-02 17:27:29,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:27:29,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 17:27:35,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.73 vs. limit=15.0 2023-10-02 17:27:36,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 17:27:38,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 17:27:40,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:27:41,852 INFO [train.py:1046] (1/4) Epoch 28, batch 600, loss[loss=0.1777, simple_loss=0.2506, pruned_loss=0.05243, over 23514.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2456, pruned_loss=0.04541, over 4460414.60 frames. ], batch size: 134, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:27:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:27:41,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:48,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=960180.0, ans=0.05 2023-10-02 17:27:49,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:27:50,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:27:52,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-10-02 17:27:54,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:27:55,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:27:58,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:58,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=960246.6666666666, ans=0.125 2023-10-02 17:28:00,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 17:28:01,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:28:05,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 17:28:08,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=960246.6666666666, ans=0.125 2023-10-02 17:28:09,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:28:09,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:09,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 17:28:16,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:28:16,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:28:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:28:20,089 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:28:21,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=960313.3333333334, ans=0.125 2023-10-02 17:28:21,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=960313.3333333334, ans=0.125 2023-10-02 17:28:23,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:28:29,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:28:29,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:28:29,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:28:33,183 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.831e+02 2.042e+02 2.282e+02 3.339e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 17:28:36,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 17:28:40,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:28:40,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:28:45,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 17:28:47,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:28:48,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 17:28:49,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:28:49,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:28:55,752 INFO [train.py:1046] (1/4) Epoch 28, batch 650, loss[loss=0.1659, simple_loss=0.2528, pruned_loss=0.03949, over 24367.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2443, pruned_loss=0.04478, over 4513432.64 frames. ], batch size: 77, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:28:55,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-02 17:28:57,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:28:57,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:28:58,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:28:58,921 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:28:58,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=960513.3333333334, ans=0.0 2023-10-02 17:29:01,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:03,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 17:29:04,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.12 vs. limit=15.0 2023-10-02 17:29:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:29:10,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:29:10,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:14,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:29:18,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 17:29:19,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:29:19,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:23,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:29:23,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 17:29:24,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=960646.6666666666, ans=10.0 2023-10-02 17:29:26,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:26,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:27,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:29:27,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:28,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:29:32,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 17:29:32,970 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 17:29:32,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:32,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:29:34,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=960646.6666666666, ans=0.125 2023-10-02 17:29:35,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:29:37,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:29:37,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:29:38,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 17:29:38,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:29:40,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:29:41,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 17:29:41,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:29:43,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:29:44,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 17:29:46,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 17:29:46,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:48,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:29:48,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:29:48,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:29:51,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:55,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:55,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:29:57,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:29:58,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=960780.0, ans=0.0 2023-10-02 17:29:59,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:29:59,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:30:01,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:30:08,009 INFO [train.py:1046] (1/4) Epoch 28, batch 700, loss[loss=0.1621, simple_loss=0.2514, pruned_loss=0.03639, over 24642.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2426, pruned_loss=0.04418, over 4569656.48 frames. ], batch size: 68, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:30:08,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:30:08,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:08,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:30:09,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:11,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.56 vs. limit=15.0 2023-10-02 17:30:15,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. 
Duration: 0.93 2023-10-02 17:30:15,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 17:30:18,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 17:30:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:21,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:30:24,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.83 vs. limit=15.0 2023-10-02 17:30:24,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 17:30:27,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:30:29,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:30:30,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:32,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:30:32,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:30:34,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:30:36,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 17:30:37,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:30:37,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 17:30:39,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=960980.0, ans=0.125 2023-10-02 17:30:40,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 17:30:45,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:30:45,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:30:47,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:30:52,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:30:52,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 17:30:56,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:56,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 17:30:56,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 17:31:00,698 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.837e+02 1.990e+02 2.293e+02 3.578e+02, threshold=3.980e+02, percent-clipped=0.0 2023-10-02 17:31:00,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:31:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:02,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=961046.6666666666, ans=0.0 2023-10-02 17:31:04,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:09,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:31:09,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 17:31:13,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 17:31:14,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 17:31:15,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=961113.3333333334, ans=0.125 2023-10-02 17:31:18,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:31:19,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:31:20,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:31:21,814 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:31:22,379 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.51 vs. limit=15.0 2023-10-02 17:31:22,868 INFO [train.py:1046] (1/4) Epoch 28, batch 750, loss[loss=0.1798, simple_loss=0.2652, pruned_loss=0.0472, over 24673.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2424, pruned_loss=0.0444, over 4595682.55 frames. ], batch size: 73, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:31:22,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:22,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 17:31:27,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 17:31:27,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 17:31:27,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 17:31:29,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. 
Duration: 0.91 2023-10-02 17:31:29,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 17:31:30,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:31:31,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 17:31:31,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:31,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:31:33,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:31:36,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:36,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:31:36,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:31:37,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=961246.6666666666, ans=0.125 2023-10-02 17:31:38,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:31:40,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 17:31:40,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:31:41,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:31:43,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 17:31:46,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:31:48,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:48,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:48,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=961246.6666666666, ans=0.5 2023-10-02 17:31:49,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:31:50,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 17:31:50,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:31:53,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-10-02 17:31:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 17:31:54,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 17:31:54,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:31:54,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:31:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:32:03,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:32:04,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:04,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:32:06,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:32:07,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:09,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 17:32:09,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 17:32:10,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=961380.0, ans=0.0 2023-10-02 17:32:11,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 17:32:12,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:32:13,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=961380.0, ans=0.1 2023-10-02 17:32:15,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:32:15,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 17:32:16,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:22,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:22,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=961446.6666666666, ans=0.125 2023-10-02 17:32:23,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:32:25,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:32:26,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=961446.6666666666, ans=0.125 2023-10-02 17:32:28,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:32:29,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 17:32:29,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:32:29,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:32:33,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:32:34,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:37,111 INFO [train.py:1046] (1/4) Epoch 28, batch 800, loss[loss=0.1774, simple_loss=0.2481, pruned_loss=0.05337, over 22903.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2434, pruned_loss=0.04462, over 4622833.08 frames. ], batch size: 322, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:32:37,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:37,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:32:44,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 17:32:44,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:45,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:32:45,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:47,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:47,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:48,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:53,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:54,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:32:57,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 17:32:57,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:59,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:59,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 17:32:59,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:32:59,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 17:32:59,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:59,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=961580.0, ans=0.07 2023-10-02 17:33:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 17:33:03,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:06,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:09,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:33:09,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:33:12,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:12,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:15,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:33:15,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:33:17,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 17:33:18,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 17:33:19,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 17:33:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:33:19,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:33:21,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:21,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:33:27,252 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 17:33:27,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 17:33:28,511 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.943e+02 2.202e+02 2.642e+02 5.405e+02, threshold=4.403e+02, percent-clipped=5.0 2023-10-02 17:33:29,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 17:33:30,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:33:33,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:33:37,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:37,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=961780.0, ans=0.1 2023-10-02 17:33:39,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 17:33:39,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:33:41,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 17:33:45,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:33:46,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=961780.0, ans=15.0 2023-10-02 17:33:49,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:33:49,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 17:33:50,500 INFO [train.py:1046] (1/4) Epoch 28, batch 850, loss[loss=0.1524, simple_loss=0.2341, pruned_loss=0.03535, over 24462.00 frames. 
], tot_loss[loss=0.1675, simple_loss=0.2447, pruned_loss=0.04509, over 4637975.58 frames. ], batch size: 63, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:33:51,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:33:51,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:52,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 17:33:52,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:33:54,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:55,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:33:56,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:33:57,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=961846.6666666666, ans=6.0 2023-10-02 17:33:58,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:33:59,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 17:33:59,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. 
Duration: 0.96 2023-10-02 17:33:59,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 17:34:01,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:34:01,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:34:04,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:04,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:34:05,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:34:09,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:34:09,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:09,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 17:34:13,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 17:34:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:34:16,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-02 17:34:23,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 17:34:23,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 17:34:26,390 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 17:34:26,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:34:26,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:34:26,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 17:34:29,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:29,873 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.01 vs. limit=15.0 2023-10-02 17:34:30,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:30,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 17:34:33,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:34:35,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:34:35,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:34:36,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:34:37,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:34:39,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:34:40,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 17:34:43,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:34:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:34:44,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:34:44,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:34:46,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:50,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:52,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 17:34:52,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:34:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:34:54,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:34:56,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=962113.3333333334, ans=0.0 2023-10-02 17:34:58,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=962113.3333333334, ans=0.0 2023-10-02 17:34:59,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:35:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:35:01,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 17:35:01,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:35:01,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:35:04,357 INFO [train.py:1046] (1/4) Epoch 28, batch 900, loss[loss=0.1467, simple_loss=0.2289, pruned_loss=0.03225, over 24490.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2457, pruned_loss=0.04527, over 4654021.06 frames. 
], batch size: 63, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:35:04,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 17:35:11,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:35:12,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:35:13,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 17:35:15,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:35:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 17:35:16,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 17:35:19,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:35:19,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:35:19,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:35:19,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:35:29,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:35:29,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:35:29,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=962246.6666666666, ans=0.125 2023-10-02 17:35:30,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:35:33,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:35:38,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 17:35:40,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:35:41,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-10-02 17:35:46,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:35:46,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:35:46,362 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 17:35:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 17:35:52,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 17:35:52,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:35:53,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:35:57,654 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.851e+02 2.081e+02 2.371e+02 3.484e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-02 17:35:59,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:35:59,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:01,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 17:36:01,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:36:01,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 17:36:03,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:36:04,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:36:06,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:36:10,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 17:36:12,112 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 17:36:13,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:36:13,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 17:36:14,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:18,174 INFO [train.py:1046] (1/4) Epoch 28, batch 950, loss[loss=0.1694, simple_loss=0.2426, pruned_loss=0.04811, over 23662.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2455, pruned_loss=0.04533, over 4662241.81 frames. ], batch size: 149, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:36:18,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 17:36:24,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:26,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:27,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:27,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 17:36:29,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=962513.3333333334, ans=0.125 2023-10-02 17:36:30,353 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 17:36:33,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:34,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:36:35,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:35,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:36:35,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 17:36:37,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:36:39,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:40,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 17:36:42,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:44,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:36:44,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:45,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:46,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 17:36:49,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:36:50,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:36:51,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:36:57,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:36:57,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:59,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=962646.6666666666, ans=0.125 2023-10-02 17:37:01,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 17:37:03,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 17:37:03,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 17:37:04,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:05,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:37:10,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 17:37:10,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=962713.3333333334, ans=0.1 2023-10-02 17:37:13,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:37:14,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:14,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:14,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 17:37:14,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:37:14,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:37:15,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-02 17:37:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:37:19,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:37:25,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:37:27,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 17:37:27,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=962780.0, ans=0.0 2023-10-02 17:37:28,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 17:37:32,148 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-10-02 17:37:32,597 INFO [train.py:1046] (1/4) Epoch 28, batch 1000, loss[loss=0.1754, simple_loss=0.2556, pruned_loss=0.04763, over 24555.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2442, pruned_loss=0.04486, over 4673389.82 frames. ], batch size: 71, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:37:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:36,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 17:37:36,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:37:43,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:37:44,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 17:37:44,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 17:37:48,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:37:48,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:37:49,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:51,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 17:37:51,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=962913.3333333334, ans=0.125 2023-10-02 17:37:56,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 17:37:59,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 17:37:59,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:01,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-10-02 17:38:01,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=962980.0, ans=0.025 2023-10-02 17:38:03,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.84 vs. limit=15.0 2023-10-02 17:38:03,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 17:38:04,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 17:38:05,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:05,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=962980.0, ans=0.125 2023-10-02 17:38:05,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=962980.0, ans=0.1 2023-10-02 17:38:06,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:13,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:38:13,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:38:15,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:38:17,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:17,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 17:38:17,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:18,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:38:18,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:38:19,924 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 17:38:23,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 17:38:23,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=963046.6666666666, ans=0.0 2023-10-02 17:38:24,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 17:38:25,904 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.974e+02 2.112e+02 2.590e+02 4.842e+02, threshold=4.225e+02, percent-clipped=1.0 2023-10-02 17:38:26,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 17:38:27,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 17:38:34,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:34,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:38:34,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:35,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:38:38,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 17:38:39,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:38:39,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 17:38:39,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 17:38:42,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:38:42,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:43,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:38:46,984 INFO [train.py:1046] (1/4) Epoch 28, batch 1050, loss[loss=0.1699, simple_loss=0.2576, pruned_loss=0.04111, over 24431.00 frames. 
], tot_loss[loss=0.1662, simple_loss=0.2438, pruned_loss=0.0443, over 4688893.66 frames. ], batch size: 69, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:38:47,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:38:47,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=963180.0, ans=0.0 2023-10-02 17:38:49,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:51,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:38:53,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:38:53,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=963180.0, ans=0.0 2023-10-02 17:38:54,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:38:54,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:57,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:38:57,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=963180.0, ans=0.09899494936611666 2023-10-02 17:39:00,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 17:39:01,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:39:03,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:39:03,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:39:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:39:05,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:39:06,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 17:39:07,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:39:09,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 17:39:10,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:39:10,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 17:39:10,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:39:14,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:39:16,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:39:16,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:39:16,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=963313.3333333334, ans=0.0 2023-10-02 17:39:19,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 17:39:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 17:39:19,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:39:24,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 17:39:26,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 17:39:28,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:31,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 17:39:33,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 17:39:35,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:39:35,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:39:39,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:39:41,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.50 vs. limit=15.0 2023-10-02 17:39:41,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 17:39:43,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 17:39:44,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 17:39:44,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:39:44,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:39:46,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 17:39:49,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:39:51,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:39:51,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 17:39:52,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:39:52,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:57,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:57,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 17:39:57,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=963446.6666666666, ans=0.125 2023-10-02 17:39:58,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:39:58,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 17:39:58,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 17:40:00,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:40:01,632 INFO [train.py:1046] (1/4) Epoch 28, batch 1100, loss[loss=0.167, simple_loss=0.259, pruned_loss=0.03744, over 24634.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2421, pruned_loss=0.04381, over 4683181.36 frames. ], batch size: 68, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:40:03,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:40:09,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:40:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:40:15,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:40:15,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:40:16,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 17:40:16,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:40:18,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:40:20,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=963580.0, ans=0.0 2023-10-02 17:40:21,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:40:22,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:40:22,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 17:40:24,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 17:40:25,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:40:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:40:28,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:40:30,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:40:33,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:40:35,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=963646.6666666666, ans=0.125 2023-10-02 17:40:38,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 17:40:39,486 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 17:40:39,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:40,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:42,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:40:42,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:40:43,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 17:40:45,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:40:45,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:40:45,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:40:46,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:46,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 17:40:50,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.68 vs. limit=22.5 2023-10-02 17:40:51,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:40:51,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 17:40:53,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=963713.3333333334, ans=0.0 2023-10-02 17:40:54,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 17:40:56,021 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.777e+02 1.908e+02 2.106e+02 3.203e+02, threshold=3.817e+02, percent-clipped=0.0 2023-10-02 17:40:59,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:41:02,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 17:41:02,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:41:03,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:03,968 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.30 vs. limit=22.5 2023-10-02 17:41:06,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:06,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:41:08,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 17:41:08,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:41:08,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:41:09,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 17:41:09,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:41:10,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 17:41:12,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:41:13,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:41:14,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:41:16,120 INFO [train.py:1046] (1/4) Epoch 28, batch 1150, loss[loss=0.1702, simple_loss=0.2603, pruned_loss=0.04009, over 24321.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2432, pruned_loss=0.04431, over 4687604.40 frames. ], batch size: 74, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:41:16,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=963846.6666666666, ans=0.1 2023-10-02 17:41:17,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:20,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:41:22,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:41:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:41:22,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 17:41:22,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:41:25,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 17:41:26,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=963846.6666666666, ans=0.125 2023-10-02 17:41:27,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:27,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:41:30,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 17:41:31,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:34,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:36,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:36,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. 
Duration: 0.54 2023-10-02 17:41:36,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:41:36,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:41:40,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 17:41:41,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:43,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:43,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=963913.3333333334, ans=0.125 2023-10-02 17:41:44,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=963980.0, ans=0.1 2023-10-02 17:41:52,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:56,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=963980.0, ans=0.0 2023-10-02 17:41:58,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:58,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. 
Duration: 0.86 2023-10-02 17:41:59,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.86 vs. limit=6.0 2023-10-02 17:42:00,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:00,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:06,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=964046.6666666666, ans=0.125 2023-10-02 17:42:08,722 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 17:42:08,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:15,726 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 17:42:21,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:22,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:42:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:42:22,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 17:42:24,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=964113.3333333334, ans=0.125 2023-10-02 17:42:27,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:42:30,691 INFO [train.py:1046] (1/4) Epoch 28, batch 1200, loss[loss=0.1594, simple_loss=0.2342, pruned_loss=0.04234, over 23362.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2435, pruned_loss=0.04471, over 4699825.34 frames. ], batch size: 105, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:42:32,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:42:32,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:42:34,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:42:34,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:34,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:42:37,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:42:39,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:42:40,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:42:40,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:43,522 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 17:42:44,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 17:42:47,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:42:49,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=964246.6666666666, ans=0.2 2023-10-02 17:42:50,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:42:52,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:42:53,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=964246.6666666666, ans=12.0 2023-10-02 17:42:53,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:42:53,831 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 17:42:55,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:43:01,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 17:43:01,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:43:01,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 17:43:03,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:43:08,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 17:43:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 17:43:14,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:43:15,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:43:17,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:43:17,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:43:18,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:43:18,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:43:18,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 17:43:20,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 17:43:20,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:43:21,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:43:21,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:43:22,956 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.818e+02 2.090e+02 2.388e+02 3.203e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 17:43:23,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:43:23,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:43:27,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:43:30,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:43:33,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 17:43:36,604 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 17:43:39,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:43:41,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=964446.6666666666, ans=0.1 2023-10-02 17:43:42,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:43:43,963 INFO [train.py:1046] (1/4) Epoch 28, batch 1250, loss[loss=0.1781, simple_loss=0.2463, pruned_loss=0.05496, over 23748.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2438, pruned_loss=0.04452, over 4706259.08 frames. ], batch size: 164, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:43:44,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:43:45,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:43:46,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 17:43:51,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:43:52,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:43:53,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.45 vs. limit=22.5 2023-10-02 17:43:53,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 17:43:57,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 17:43:57,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:43:58,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn2.whiten.whitening_limit, batch_count=964580.0, ans=22.5 2023-10-02 17:44:01,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:44:01,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:44:03,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:44:03,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:44:04,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:44:10,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:44:10,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:44:10,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:11,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:44:12,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 17:44:14,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:15,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=964646.6666666666, ans=0.0 2023-10-02 17:44:16,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:44:20,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.60 vs. limit=15.0 2023-10-02 17:44:21,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 17:44:21,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:44:22,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:44:23,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 17:44:23,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:44:23,923 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 17:44:23,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 17:44:28,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:33,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:33,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:44:35,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 17:44:35,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 17:44:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 17:44:38,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:44:39,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 17:44:39,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:42,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 17:44:42,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:44:44,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. 
Duration: 0.98 2023-10-02 17:44:44,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:44:45,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=964780.0, ans=0.0 2023-10-02 17:44:46,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:44:46,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 17:44:48,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:44:48,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 17:44:52,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:44:54,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:44:56,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:44:57,489 INFO [train.py:1046] (1/4) Epoch 28, batch 1300, loss[loss=0.1408, simple_loss=0.2219, pruned_loss=0.02983, over 24566.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2446, pruned_loss=0.04499, over 4704179.08 frames. 
], batch size: 60, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:44:59,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:59,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 17:45:02,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:45:04,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:45:05,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:45:07,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:45:08,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:45:10,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 17:45:14,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=964913.3333333334, ans=0.125 2023-10-02 17:45:16,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:45:17,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 17:45:17,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=964913.3333333334, ans=0.0 2023-10-02 17:45:18,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 17:45:21,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:45:25,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:45:27,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:45:28,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:45:30,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:45:30,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:45:32,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:45:32,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 17:45:38,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:45:38,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 17:45:39,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 17:45:39,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:45:41,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:45:41,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.29 vs. limit=22.5 2023-10-02 17:45:42,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:45:43,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 17:45:45,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:45:45,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 17:45:47,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:45:50,982 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.887e+02 2.082e+02 2.300e+02 3.182e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 17:45:51,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:45:51,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 17:45:54,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 17:45:55,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 17:45:55,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 17:45:59,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:46:03,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 17:46:04,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:46:12,134 INFO [train.py:1046] (1/4) Epoch 28, batch 1350, loss[loss=0.1321, simple_loss=0.2097, pruned_loss=0.02721, over 24463.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2435, pruned_loss=0.04453, over 4714725.52 frames. ], batch size: 58, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:46:13,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 17:46:16,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:46:19,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:22,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:46:22,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:46:25,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:46:25,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:46:27,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:46:29,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 17:46:31,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:46:32,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:46:34,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 17:46:34,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:46:36,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:46:36,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 17:46:38,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. 
Duration: 0.86 2023-10-02 17:46:40,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 17:46:42,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:42,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 17:46:46,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=965313.3333333334, ans=0.0 2023-10-02 17:46:49,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=965313.3333333334, ans=0.0 2023-10-02 17:46:55,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:47:02,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:47:02,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:04,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 17:47:07,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:07,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 17:47:07,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 17:47:08,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:47:11,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:47:13,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 17:47:14,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:47:19,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 17:47:20,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 17:47:25,903 INFO [train.py:1046] (1/4) Epoch 28, batch 1400, loss[loss=0.1692, simple_loss=0.2364, pruned_loss=0.05094, over 23399.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2418, pruned_loss=0.04373, over 4708249.29 frames. ], batch size: 285, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:47:26,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 17:47:27,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:30,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:47:31,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:47:36,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 17:47:37,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 17:47:45,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=965580.0, ans=0.0 2023-10-02 17:47:46,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:47:46,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=965580.0, ans=0.1 2023-10-02 17:47:49,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:47:51,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:47:51,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:47:54,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:47:55,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 17:47:59,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=965646.6666666666, ans=0.125 2023-10-02 17:48:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:48:07,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:11,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 17:48:12,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:48:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:48:12,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:48:13,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:48:15,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:48:15,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:48:15,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:48:15,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 17:48:16,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 17:48:19,942 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.826e+02 2.070e+02 2.525e+02 5.054e+02, threshold=4.140e+02, percent-clipped=1.0 2023-10-02 17:48:20,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:24,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:48:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 17:48:33,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:48:33,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:48:36,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 17:48:36,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:48:39,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:48:40,602 INFO [train.py:1046] (1/4) Epoch 28, batch 1450, loss[loss=0.1713, simple_loss=0.2419, pruned_loss=0.05036, over 23859.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2415, pruned_loss=0.04354, over 4708776.99 frames. ], batch size: 212, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:48:43,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 17:48:45,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:48:45,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:45,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 17:48:51,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:48:51,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:48:52,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:48:54,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 17:48:54,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:48:55,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 17:48:57,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:58,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:48:58,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. 
Duration: 0.83 2023-10-02 17:48:59,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:48:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:48:59,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 17:48:59,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:00,650 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.41 vs. limit=15.0 2023-10-02 17:49:01,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:49:04,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:06,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:10,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:49:10,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 17:49:10,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=965980.0, ans=0.125 2023-10-02 17:49:12,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:49:13,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:14,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:14,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:49:14,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:16,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:18,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=965980.0, ans=0.125 2023-10-02 17:49:20,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 17:49:23,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:49:27,817 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 17:49:27,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 17:49:29,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:49:30,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:49:30,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 17:49:34,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:34,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 17:49:36,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 17:49:38,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:49:42,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:49:42,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:49:44,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 17:49:45,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.10 vs. limit=15.0 2023-10-02 17:49:46,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. 
Duration: 0.98 2023-10-02 17:49:46,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 17:49:48,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:48,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:49:51,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=966113.3333333334, ans=0.1 2023-10-02 17:49:54,980 INFO [train.py:1046] (1/4) Epoch 28, batch 1500, loss[loss=0.1458, simple_loss=0.2211, pruned_loss=0.03529, over 24433.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2424, pruned_loss=0.04376, over 4719427.55 frames. ], batch size: 58, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:50:00,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 17:50:00,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:50:00,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:50:00,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:50:01,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:50:01,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 17:50:03,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 17:50:05,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:50:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:50:05,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:50:06,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:50:09,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:50:09,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:50:10,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=966246.6666666666, ans=0.125 2023-10-02 17:50:13,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:50:13,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 17:50:16,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:50:16,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 17:50:17,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:50:20,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 17:50:24,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.56 vs. limit=10.0 2023-10-02 17:50:24,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 17:50:25,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=966313.3333333334, ans=0.125 2023-10-02 17:50:26,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:50:26,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 17:50:30,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:50:31,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:50:31,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:50:31,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:50:33,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. 
Duration: 0.72 2023-10-02 17:50:35,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:50:35,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:50:36,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 17:50:36,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:50:36,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=966313.3333333334, ans=0.1 2023-10-02 17:50:39,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=966380.0, ans=0.125 2023-10-02 17:50:42,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:50:42,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 17:50:46,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:50:47,939 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.911e+02 2.171e+02 2.439e+02 3.664e+02, threshold=4.341e+02, percent-clipped=0.0 2023-10-02 17:50:48,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:50:51,397 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. 
Duration: 0.768 2023-10-02 17:50:52,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:50:52,792 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 17:50:54,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:50:55,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:50:56,026 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 17:50:57,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:50:58,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 17:51:00,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:01,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:51:01,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:03,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:51:03,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:51:04,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:51:06,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 17:51:07,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 17:51:07,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:51:07,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 17:51:08,840 INFO [train.py:1046] (1/4) Epoch 28, batch 1550, loss[loss=0.1374, simple_loss=0.2189, pruned_loss=0.02797, over 19343.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2435, pruned_loss=0.04428, over 4716372.49 frames. ], batch size: 42, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:51:10,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 17:51:11,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:51:12,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:13,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:51:14,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 17:51:15,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:16,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=966513.3333333334, ans=0.125 2023-10-02 17:51:16,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.05 vs. limit=15.0 2023-10-02 17:51:17,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:21,132 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 17:51:21,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:21,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:51:22,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:51:23,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:51:23,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 17:51:25,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:51:26,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-02 17:51:27,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 17:51:27,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 17:51:27,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:27,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.92 vs. limit=22.5 2023-10-02 17:51:30,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:51:34,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:51:35,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 17:51:35,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 17:51:43,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:51:46,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:51:46,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:51:46,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 17:51:47,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 17:51:52,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:51:55,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:58,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:51:59,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.62 vs. limit=15.0 2023-10-02 17:52:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:52:01,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:52:01,584 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:52:02,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 17:52:02,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:52:04,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:52:04,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 17:52:05,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 17:52:05,500 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 17:52:08,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:14,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 17:52:19,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:52:20,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:52:20,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 17:52:22,670 INFO [train.py:1046] (1/4) Epoch 28, batch 1600, loss[loss=0.1446, simple_loss=0.2228, pruned_loss=0.03314, over 24303.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2444, pruned_loss=0.04408, over 4723836.06 frames. ], batch size: 56, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:52:24,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:52:25,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:52:25,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 17:52:25,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:52:26,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:52:30,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:32,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 17:52:32,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 17:52:33,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 17:52:34,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:52:36,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 17:52:36,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:52:40,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:52:44,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:52:47,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-10-02 17:52:48,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:52:50,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 17:52:50,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:51,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 17:52:56,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 17:53:05,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:53:05,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 17:53:06,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:53:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:53:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:53:09,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 17:53:12,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-02 17:53:14,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:53:15,488 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.365e+02 1.816e+02 2.018e+02 2.318e+02 3.311e+02, threshold=4.036e+02, percent-clipped=0.0 2023-10-02 17:53:15,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:16,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:16,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:53:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:53:19,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:53:20,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:53:28,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:28,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:53:31,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. 
limit=15.0 2023-10-02 17:53:32,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 17:53:32,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:53:32,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 17:53:36,113 INFO [train.py:1046] (1/4) Epoch 28, batch 1650, loss[loss=0.1813, simple_loss=0.2648, pruned_loss=0.04894, over 24368.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2452, pruned_loss=0.04459, over 4720572.45 frames. ], batch size: 77, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:53:37,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:53:38,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:53:39,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:53:39,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 17:53:40,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 17:53:40,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 17:53:40,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. 
Duration: 0.59 2023-10-02 17:53:40,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=967180.0, ans=0.2 2023-10-02 17:53:43,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:53:43,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:53:45,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:53:47,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:53:49,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 17:53:51,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:53:51,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:53:51,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:53:51,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:53:53,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. 
Duration: 0.73 2023-10-02 17:53:53,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 17:53:58,266 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=15.0 2023-10-02 17:54:00,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:54:03,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:54:09,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 17:54:11,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:13,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 17:54:16,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:19,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:54:19,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:54:19,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:20,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 17:54:20,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:23,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:54:25,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:25,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:54:26,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:54:28,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:54:28,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:54:32,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:54:33,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 17:54:34,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:54:34,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 17:54:36,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. 
Duration: 0.69 2023-10-02 17:54:37,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 17:54:37,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:54:37,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:54:38,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:38,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:38,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 17:54:42,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:44,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:54:45,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:46,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 17:54:49,268 INFO [train.py:1046] (1/4) Epoch 28, batch 1700, loss[loss=0.1492, simple_loss=0.2355, pruned_loss=0.03145, over 24429.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2438, pruned_loss=0.04406, over 4728384.65 frames. 
], batch size: 63, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:54:50,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:50,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:54:50,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 17:54:50,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:54:52,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:54:52,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:54:54,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:54:54,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:54:54,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 17:54:57,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 17:55:00,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=967513.3333333334, ans=0.125 2023-10-02 17:55:04,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:07,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:55:07,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=967580.0, ans=0.125 2023-10-02 17:55:13,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:55:13,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:55:14,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:55:14,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:55:17,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 17:55:20,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:55:20,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:55:21,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:55:25,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:55:25,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 17:55:26,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 17:55:28,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:29,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 17:55:31,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:55:31,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=967646.6666666666, ans=0.2 2023-10-02 17:55:38,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:38,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:55:39,725 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.85 vs. limit=10.0 2023-10-02 17:55:40,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 17:55:41,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:55:41,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 17:55:42,882 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.838e+02 2.056e+02 2.244e+02 3.312e+02, threshold=4.111e+02, percent-clipped=0.0 2023-10-02 17:55:42,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:55:44,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:44,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 17:55:46,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:55:46,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:55:46,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:46,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 17:55:46,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=967713.3333333334, ans=0.125 2023-10-02 17:55:47,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:55:47,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:55:49,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:55:50,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:55:50,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:55,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:55,455 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:55:56,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 17:55:58,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:59,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:56:01,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. 
Duration: 0.57 2023-10-02 17:56:01,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-10-02 17:56:04,365 INFO [train.py:1046] (1/4) Epoch 28, batch 1750, loss[loss=0.1366, simple_loss=0.189, pruned_loss=0.04211, over 19167.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2426, pruned_loss=0.04368, over 4722570.46 frames. ], batch size: 388, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:56:05,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:07,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=967846.6666666666, ans=0.125 2023-10-02 17:56:09,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:09,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:56:09,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 17:56:09,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:56:10,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=967846.6666666666, ans=0.125 2023-10-02 17:56:13,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 17:56:13,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:17,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 17:56:19,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:21,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 17:56:21,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:56:23,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:56:25,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 17:56:27,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 17:56:29,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:56:29,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-10-02 17:56:29,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=967913.3333333334, ans=0.0 2023-10-02 17:56:36,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=967980.0, ans=0.0 2023-10-02 17:56:37,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=15.0 2023-10-02 17:56:38,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:56:39,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:56:39,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:56:41,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=967980.0, ans=0.1 2023-10-02 17:56:42,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:42,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:56:45,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:56:48,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 17:56:49,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:56:49,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:56:51,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 17:56:53,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:56:56,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 17:56:57,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:56:57,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=968046.6666666666, ans=0.0 2023-10-02 17:56:59,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:59,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:57:03,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:57:03,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=968113.3333333334, ans=0.125 2023-10-02 17:57:04,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 17:57:05,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=968113.3333333334, ans=0.0 2023-10-02 17:57:06,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:57:07,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:57:11,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:57:14,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:57:16,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:57:16,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 17:57:16,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:57:17,428 INFO [train.py:1046] (1/4) Epoch 28, batch 1800, loss[loss=0.1626, simple_loss=0.2402, pruned_loss=0.04251, over 24249.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2417, pruned_loss=0.04388, over 4712350.39 frames. ], batch size: 61, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:57:17,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:57:17,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:57:17,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:57:17,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:57:19,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:57:22,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:57:24,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:57:25,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:57:28,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:57:33,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 17:57:33,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:57:36,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:57:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:39,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:57:41,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:57:42,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:57:42,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 17:57:42,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=968246.6666666666, ans=0.125 2023-10-02 17:57:43,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:57:46,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:57:48,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=968313.3333333334, ans=0.2 2023-10-02 17:57:49,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 17:57:52,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 17:57:52,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 17:57:54,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:57:54,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 17:57:54,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:57:55,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:57:55,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=968313.3333333334, ans=0.125 2023-10-02 17:58:02,341 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 17:58:03,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=12.0 2023-10-02 17:58:04,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.80 vs. limit=15.0 2023-10-02 17:58:05,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:58:06,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:09,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 17:58:09,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 17:58:09,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:58:10,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 17:58:11,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:58:12,308 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.835e+02 1.992e+02 2.220e+02 3.215e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 17:58:16,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 17:58:20,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:58:21,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 17:58:22,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:58:22,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:58:23,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:58:23,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 17:58:27,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:58:27,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:58:30,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-02 17:58:30,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:58:32,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:58:33,428 INFO [train.py:1046] (1/4) Epoch 28, batch 1850, loss[loss=0.1553, simple_loss=0.2408, pruned_loss=0.03489, over 24417.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2423, pruned_loss=0.04379, over 4718955.67 frames. ], batch size: 63, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:58:33,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:58:33,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:34,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:34,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:58:37,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:58:37,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:58:40,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:58:42,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 17:58:46,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:58:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 17:58:51,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 17:58:53,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 17:58:57,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:58:57,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 17:58:57,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:59:08,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:59:08,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 17:59:12,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:59:12,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:59:16,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-10-02 17:59:16,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:17,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=968713.3333333334, ans=0.1 2023-10-02 17:59:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 17:59:19,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:59:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:59:23,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:59:23,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=968713.3333333334, ans=0.125 2023-10-02 17:59:26,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:59:26,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:28,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 17:59:28,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 17:59:30,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:59:31,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:59:36,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 17:59:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:59:36,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=968780.0, ans=0.125 2023-10-02 17:59:39,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:59:40,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:59:40,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 17:59:40,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 17:59:42,312 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 17:59:42,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-02 17:59:43,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=968780.0, ans=0.1 2023-10-02 17:59:45,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:59:45,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:59:45,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:59:45,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:45,182 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 17:59:46,913 INFO [train.py:1046] (1/4) Epoch 28, batch 1900, loss[loss=0.2017, simple_loss=0.2646, pruned_loss=0.0694, over 19450.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2436, pruned_loss=0.04401, over 4717519.98 frames. ], batch size: 388, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:59:46,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:59:47,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:47,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=968846.6666666666, ans=0.025 2023-10-02 17:59:48,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 17:59:49,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:59:51,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:59:51,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 17:59:53,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:53,717 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 17:59:53,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:59:55,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:59:59,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:00:01,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:00:02,897 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 18:00:04,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 18:00:06,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 18:00:06,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:00:06,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 18:00:07,550 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 18:00:10,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=968913.3333333334, ans=0.125 2023-10-02 18:00:11,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 18:00:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:00:18,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 18:00:18,834 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.98 vs. limit=12.0 2023-10-02 18:00:19,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 18:00:26,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 18:00:28,610 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.32 vs. limit=6.0 2023-10-02 18:00:29,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-10-02 18:00:29,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:00:30,025 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 18:00:30,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 18:00:31,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 18:00:31,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 18:00:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:00:31,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=969046.6666666666, ans=0.2 2023-10-02 18:00:31,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=969046.6666666666, ans=0.125 2023-10-02 18:00:35,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 18:00:37,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:00:40,419 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.863e+02 2.033e+02 2.281e+02 3.695e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 18:00:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:00:41,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 18:00:44,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:00:47,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 18:00:47,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:00:52,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:00:52,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:00:52,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:00:52,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=969113.3333333334, ans=0.125 2023-10-02 18:00:53,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:00:54,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:00:54,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:00:56,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 18:00:58,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:00:58,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:01:00,750 INFO [train.py:1046] (1/4) Epoch 28, batch 1950, loss[loss=0.1529, simple_loss=0.2244, pruned_loss=0.04072, over 24415.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2444, pruned_loss=0.04464, over 4697322.13 frames. ], batch size: 58, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:01:00,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:01:00,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:01:02,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:01:02,871 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.60 vs. limit=15.0 2023-10-02 18:01:03,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:01:05,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:01:08,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:01:08,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:01:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:01:11,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 18:01:11,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 18:01:11,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:12,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:15,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:01:15,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:01:15,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:17,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:01:20,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:01:20,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:01:20,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 18:01:21,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:25,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:29,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:01:29,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:01:30,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:01:30,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 18:01:32,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:01:32,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:01:32,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:36,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:01:37,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=969313.3333333334, ans=15.0 2023-10-02 18:01:38,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:01:40,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.24 vs. limit=22.5 2023-10-02 18:01:43,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:01:43,809 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=12.0 2023-10-02 18:01:44,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=969380.0, ans=0.125 2023-10-02 18:01:47,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:01:47,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:01:49,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 18:01:49,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:01:53,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:01:53,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:01:54,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:02:00,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=969446.6666666666, ans=0.0 2023-10-02 18:02:01,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:03,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:05,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:06,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:02:07,207 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.25 vs. limit=22.5 2023-10-02 18:02:09,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:02:09,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:02:09,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. 
Duration: 0.86 2023-10-02 18:02:11,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:02:12,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:02:12,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 18:02:15,243 INFO [train.py:1046] (1/4) Epoch 28, batch 2000, loss[loss=0.155, simple_loss=0.234, pruned_loss=0.03802, over 24463.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2447, pruned_loss=0.04441, over 4714934.87 frames. ], batch size: 63, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:02:15,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:02:18,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:02:19,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:02:19,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:02:22,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:02:23,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:25,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. 
Duration: 0.94 2023-10-02 18:02:25,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:02:29,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:02:29,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=969580.0, ans=0.1 2023-10-02 18:02:31,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 18:02:32,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:02:32,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:02:34,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=969580.0, ans=0.2 2023-10-02 18:02:37,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:02:37,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 18:02:38,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:02:40,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 18:02:40,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:02:43,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 18:02:44,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:02:47,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:02:48,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:02:48,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:48,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:02:51,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:02:52,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 18:02:53,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 18:02:53,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 18:02:53,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:02:56,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=969646.6666666666, ans=0.0 2023-10-02 18:02:57,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:59,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:02:59,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:02:59,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:03:01,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:03:02,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=969713.3333333334, ans=0.125 2023-10-02 18:03:03,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:04,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:03:04,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:03:05,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:07,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:03:08,512 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.888e+02 2.044e+02 2.349e+02 3.109e+02, threshold=4.088e+02, percent-clipped=0.0 2023-10-02 18:03:08,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 18:03:13,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:03:14,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:14,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=969780.0, ans=0.04949747468305833 2023-10-02 18:03:18,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:18,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:03:23,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:26,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:03:26,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:26,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:03:26,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:03:28,970 INFO [train.py:1046] (1/4) Epoch 28, batch 2050, loss[loss=0.1643, simple_loss=0.2497, pruned_loss=0.03942, over 24337.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2439, pruned_loss=0.04437, over 4704665.66 frames. ], batch size: 77, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:03:30,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:30,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:32,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=969846.6666666666, ans=0.1 2023-10-02 18:03:33,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=969846.6666666666, ans=0.2 2023-10-02 18:03:35,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:03:35,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:39,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:03:41,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:03:41,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:03:44,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 18:03:44,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:03:47,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:47,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:03:56,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:03:56,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:57,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 18:03:57,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=969980.0, ans=0.04949747468305833 2023-10-02 18:04:00,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:04:01,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 18:04:01,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:04:04,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:04:07,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:09,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:04:09,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:04:10,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:04:11,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:04:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:04:15,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:15,843 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.75 vs. limit=22.5 2023-10-02 18:04:17,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 18:04:18,465 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.72 vs. limit=22.5 2023-10-02 18:04:20,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:04:20,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:04:22,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=970046.6666666666, ans=0.125 2023-10-02 18:04:24,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:04:30,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:04:30,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 18:04:34,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.24 vs. limit=15.0 2023-10-02 18:04:37,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:04:38,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:04:39,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 18:04:39,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=970113.3333333334, ans=0.1 2023-10-02 18:04:41,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 18:04:42,481 INFO [train.py:1046] (1/4) Epoch 28, batch 2100, loss[loss=0.162, simple_loss=0.2368, pruned_loss=0.04362, over 23492.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2429, pruned_loss=0.04397, over 4698603.82 frames. ], batch size: 134, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:04:44,390 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 18:04:44,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:04:44,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:45,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:04:45,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=970180.0, ans=0.0 2023-10-02 18:04:47,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:04:47,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 18:04:47,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. 
Duration: 0.96 2023-10-02 18:04:48,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:04:51,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:04:52,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:04:52,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=970180.0, ans=0.0 2023-10-02 18:04:54,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:04:55,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:04:55,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 18:04:56,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:04:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 18:04:56,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 18:04:58,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:00,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 18:05:00,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 18:05:00,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:05:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 18:05:04,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:05:08,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:05:09,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:05:09,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=970246.6666666666, ans=0.125 2023-10-02 18:05:12,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:05:13,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 18:05:13,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:13,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-02 18:05:15,537 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.11 vs. limit=22.5 2023-10-02 18:05:15,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 18:05:15,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:15,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 18:05:15,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 18:05:15,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 18:05:18,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:05:19,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.55 vs. limit=15.0 2023-10-02 18:05:21,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:05:22,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:05:24,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:05:25,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:05:26,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:26,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 18:05:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:26,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:28,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:28,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 18:05:29,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=970380.0, ans=0.07 2023-10-02 18:05:30,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 18:05:32,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 18:05:32,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=970380.0, ans=0.1 2023-10-02 18:05:35,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:05:38,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 18:05:39,457 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.839e+02 2.053e+02 2.400e+02 3.677e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-02 18:05:39,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 18:05:39,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=970380.0, ans=0.125 2023-10-02 18:05:44,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:47,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:05:47,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:05:47,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:05:48,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 18:05:48,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:05:49,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:49,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 18:05:49,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:05:49,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:52,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 18:05:54,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 18:05:54,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:05:56,854 INFO [train.py:1046] (1/4) Epoch 28, batch 2150, loss[loss=0.1581, simple_loss=0.2129, pruned_loss=0.05161, over 19161.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2421, pruned_loss=0.04346, over 4711516.67 frames. ], batch size: 388, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:05:56,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:56,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:05:57,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:05:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:06:04,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-02 18:06:06,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:07,689 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-10-02 18:06:08,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:10,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:06:10,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:10,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:06:15,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:16,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:06:16,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:06:19,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:20,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. 
Duration: 0.99 2023-10-02 18:06:23,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:25,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:06:25,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:25,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:26,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:26,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:06:26,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:26,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:06:27,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:06:29,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 18:06:31,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:06:33,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:06:33,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:34,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:06:37,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:06:39,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:40,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:06:40,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:40,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 18:06:42,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:06:45,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.11 vs. limit=15.0 2023-10-02 18:06:45,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:45,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:06:47,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:48,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:06:49,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:49,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:49,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 18:06:51,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 18:06:52,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:06:52,469 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 18:06:52,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:52,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:06:53,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 18:06:53,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 18:06:53,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 18:06:54,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 18:06:54,020 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 18:06:55,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 18:06:56,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:58,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:58,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:06:58,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:59,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:07:00,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:07:01,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:09,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 18:07:10,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 18:07:11,933 INFO [train.py:1046] (1/4) Epoch 28, batch 2200, loss[loss=0.165, simple_loss=0.2511, pruned_loss=0.03948, over 23673.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2432, pruned_loss=0.04368, over 4716551.13 frames. ], batch size: 85, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:07:14,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:07:17,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:18,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:07:20,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:07:20,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:07:24,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:07:24,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:07:24,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. 
Duration: 0.88 2023-10-02 18:07:25,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=970913.3333333334, ans=0.125 2023-10-02 18:07:28,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 18:07:31,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:07:33,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=970913.3333333334, ans=0.0 2023-10-02 18:07:37,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 18:07:39,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:40,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:07:40,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:07:44,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:07:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 18:07:47,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:07:49,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:07:49,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 18:07:53,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:07:53,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=970980.0, ans=0.125 2023-10-02 18:07:54,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:07:56,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:07:57,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:00,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 18:08:01,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:02,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 18:08:06,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:06,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:08:06,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:08:07,324 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.865e+02 2.059e+02 2.462e+02 3.335e+02, threshold=4.117e+02, percent-clipped=0.0 2023-10-02 18:08:08,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:08:08,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:08:10,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:10,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:10,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:08:12,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:08:14,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:08:15,560 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.32 vs. limit=22.5 2023-10-02 18:08:16,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 18:08:17,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:08:20,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:08:20,487 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 18:08:23,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:08:24,645 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 18:08:24,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=971180.0, ans=0.025 2023-10-02 18:08:25,922 INFO [train.py:1046] (1/4) Epoch 28, batch 2250, loss[loss=0.1657, simple_loss=0.2565, pruned_loss=0.03745, over 24624.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2441, pruned_loss=0.04374, over 4719876.91 frames. ], batch size: 68, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:08:26,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:08:26,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 18:08:27,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:08:28,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:08:28,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:08:30,406 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. 
Duration: 0.9045625 2023-10-02 18:08:31,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:08:33,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:08:37,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:08:37,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:08:37,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=971180.0, ans=0.2 2023-10-02 18:08:41,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:08:42,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:08:44,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:08:44,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=971246.6666666666, ans=0.0 2023-10-02 18:08:45,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 18:08:46,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:46,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 18:08:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 18:08:49,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:08:49,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:08:51,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=971246.6666666666, ans=0.125 2023-10-02 18:08:52,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:08:54,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.42 vs. limit=15.0 2023-10-02 18:08:55,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:08:56,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:08:58,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:08:59,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 18:09:00,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:09:03,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:09:07,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=971380.0, ans=0.125 2023-10-02 18:09:08,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:09:11,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:09:12,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:12,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:09:16,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:09:17,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:09:22,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:09:24,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:09:27,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:09:28,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 18:09:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:09:32,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:09:34,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:09:34,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 18:09:34,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:35,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:09:38,369 INFO [train.py:1046] (1/4) Epoch 28, batch 2300, loss[loss=0.1853, simple_loss=0.2661, pruned_loss=0.05223, over 24047.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2449, pruned_loss=0.04436, over 4725253.65 frames. ], batch size: 80, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:09:38,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 18:09:41,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:09:41,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:09:43,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=971513.3333333334, ans=0.125 2023-10-02 18:09:46,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:47,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:09:48,378 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 18:09:51,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:58,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:09:58,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:09:58,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:09:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 18:09:59,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 18:10:01,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=971580.0, ans=0.2 2023-10-02 18:10:03,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:10:03,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:10:06,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:10:09,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:10:13,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:10:13,216 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:10:17,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:10:17,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:10:17,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=971646.6666666666, ans=0.125 2023-10-02 18:10:20,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:10:23,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:10:27,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:10:28,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:10:28,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:10:28,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 18:10:31,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=971713.3333333334, ans=0.125 2023-10-02 18:10:32,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:10:32,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:10:32,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:10:32,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:10:32,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:10:34,208 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.868e+02 2.077e+02 2.377e+02 3.384e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 18:10:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 18:10:34,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:10:34,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 18:10:34,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:10:34,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:10:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 18:10:37,841 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.43 vs. limit=15.0 2023-10-02 18:10:39,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:10:43,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:10:46,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:10:47,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:10:47,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:10:50,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:10:50,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:10:51,975 INFO [train.py:1046] (1/4) Epoch 28, batch 2350, loss[loss=0.1572, simple_loss=0.2384, pruned_loss=0.03803, over 24361.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2461, pruned_loss=0.04485, over 4713387.08 frames. ], batch size: 61, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:10:52,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:10:52,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 18:10:58,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:10:58,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 18:11:03,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 18:11:05,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:11:07,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=971913.3333333334, ans=0.0 2023-10-02 18:11:08,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=971913.3333333334, ans=0.2 2023-10-02 18:11:09,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:09,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:09,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:11:09,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:11:11,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 18:11:15,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:11:21,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 18:11:23,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:11:23,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=971980.0, ans=0.125 2023-10-02 18:11:26,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 18:11:26,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:11:27,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=971980.0, ans=0.1 2023-10-02 18:11:29,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:11:30,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 18:11:30,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:11:31,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:11:31,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:11:31,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:11:37,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:11:37,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 18:11:37,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:11:41,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:11:41,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:11:43,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 18:11:45,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:11:48,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 18:11:48,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:11:52,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 18:11:54,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 18:11:55,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:11:55,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:11:55,639 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 18:11:55,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 18:11:59,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-10-02 18:12:01,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:12:03,859 INFO [train.py:1046] (1/4) Epoch 28, batch 2400, loss[loss=0.1744, simple_loss=0.2419, pruned_loss=0.05341, over 23725.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.246, pruned_loss=0.04509, over 4701890.71 frames. ], batch size: 212, lr: 3.66e-03, grad_scale: 16.0 2023-10-02 18:12:03,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:12:08,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:12:08,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:12:09,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 18:12:09,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 18:12:09,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=972180.0, ans=0.2 2023-10-02 18:12:17,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:12:17,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:12:18,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=972246.6666666666, ans=0.0 2023-10-02 18:12:19,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 18:12:19,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:12:20,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:20,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 18:12:20,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=972246.6666666666, ans=0.125 2023-10-02 18:12:22,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=972246.6666666666, ans=0.2 2023-10-02 18:12:26,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:27,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 18:12:33,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:12:34,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=972313.3333333334, ans=0.125 2023-10-02 18:12:37,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. 
Duration: 0.87 2023-10-02 18:12:39,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:12:40,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:42,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=972313.3333333334, ans=0.125 2023-10-02 18:12:45,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:12:45,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=972313.3333333334, ans=0.0 2023-10-02 18:12:47,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 18:12:47,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:12:55,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:12:56,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:12:58,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:12:59,480 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.861e+02 2.059e+02 2.310e+02 3.271e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 18:12:59,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 18:12:59,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:12:59,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:12:59,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:13:00,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:13:00,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:13:03,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:13:04,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:13:04,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 18:13:05,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 18:13:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:13:08,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:13:08,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. 
Duration: 0.54 2023-10-02 18:13:08,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 18:13:08,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 18:13:08,808 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 18:13:10,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 18:13:12,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:13:13,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:13,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:13:15,507 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 18:13:17,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:17,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:13:18,806 INFO [train.py:1046] (1/4) Epoch 28, batch 2450, loss[loss=0.1739, simple_loss=0.2547, pruned_loss=0.04655, over 24371.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2445, pruned_loss=0.04477, over 4692830.91 frames. 
], batch size: 77, lr: 3.66e-03, grad_scale: 16.0 2023-10-02 18:13:21,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:13:21,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:13:27,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:27,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:13:27,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 18:13:33,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=15.0 2023-10-02 18:13:34,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:13:34,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:37,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:13:37,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:13:37,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 18:13:37,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=972580.0, ans=0.125 2023-10-02 18:13:38,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 18:13:42,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:44,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:13:46,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:13:49,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:13:50,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:13:50,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:13:50,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:54,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 18:13:55,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:14:02,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:14:05,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:14:05,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:05,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:14:05,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:05,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=972713.3333333334, ans=0.125 2023-10-02 18:14:06,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:14:07,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 18:14:10,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:14:10,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:14:14,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:14:14,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:14:17,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:14:17,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 18:14:19,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:14:19,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:14:19,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 18:14:20,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:14:21,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:14:25,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:14:28,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:29,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:14:32,114 INFO [train.py:1046] (1/4) Epoch 28, batch 2500, loss[loss=0.1574, simple_loss=0.2413, pruned_loss=0.03674, over 24281.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2436, pruned_loss=0.04418, over 4705145.16 frames. 
], batch size: 61, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:14:32,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 18:14:32,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:14:37,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:14:45,447 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:14:45,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.02 vs. limit=22.5 2023-10-02 18:14:48,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:14:48,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:49,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:14:49,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 18:14:57,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:14:57,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:14:57,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:14:57,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:14:58,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 18:14:59,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:01,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:15:01,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 18:15:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:02,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 18:15:02,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:06,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:15:08,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:15:09,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 18:15:11,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 18:15:12,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:15:14,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:17,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:20,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:23,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:15:23,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=973046.6666666666, ans=0.125 2023-10-02 18:15:23,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=973046.6666666666, ans=0.0 2023-10-02 18:15:24,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=973046.6666666666, ans=0.0 2023-10-02 18:15:29,178 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.840e+02 2.003e+02 2.187e+02 4.032e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-02 18:15:30,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. 
Duration: 0.52725 2023-10-02 18:15:33,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 18:15:33,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:15:33,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:15:34,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:15:34,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:15:36,403 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 18:15:36,403 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 18:15:36,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 18:15:39,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:39,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=973113.3333333334, ans=0.05 2023-10-02 18:15:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 18:15:40,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. 
Duration: 0.92 2023-10-02 18:15:41,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:15:41,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 18:15:42,181 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:15:45,236 INFO [train.py:1046] (1/4) Epoch 28, batch 2550, loss[loss=0.1716, simple_loss=0.2436, pruned_loss=0.0498, over 23433.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2437, pruned_loss=0.04403, over 4706034.99 frames. ], batch size: 285, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:15:45,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 18:15:49,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:15:50,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=973180.0, ans=0.0 2023-10-02 18:15:51,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:15:51,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:15:54,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:15:54,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. 
Duration: 0.51 2023-10-02 18:15:55,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:15:57,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=973180.0, ans=0.0 2023-10-02 18:16:00,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 18:16:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:16:00,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=973246.6666666666, ans=0.125 2023-10-02 18:16:03,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:06,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:16:06,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 18:16:06,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:16:06,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:16:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:16:08,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 18:16:08,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 18:16:08,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:16:08,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:10,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 18:16:11,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=973246.6666666666, ans=0.0 2023-10-02 18:16:18,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=973313.3333333334, ans=0.0 2023-10-02 18:16:23,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:16:24,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=973313.3333333334, ans=0.125 2023-10-02 18:16:27,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-10-02 18:16:29,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:16:29,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:16:29,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:16:30,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:16:37,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:16:40,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:16:40,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:16:40,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:16:40,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:16:42,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:16:42,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.38 vs. limit=15.0 2023-10-02 18:16:44,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:16:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:16:50,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=973446.6666666666, ans=0.125 2023-10-02 18:16:51,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:16:51,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 18:16:51,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:16:51,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:53,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:16:53,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:16:54,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:16:57,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=973513.3333333334, ans=0.125 2023-10-02 18:16:57,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=973513.3333333334, ans=0.125 2023-10-02 18:16:58,691 INFO [train.py:1046] (1/4) Epoch 28, batch 2600, loss[loss=0.1768, simple_loss=0.2461, pruned_loss=0.05378, over 23685.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2447, pruned_loss=0.0444, over 4721344.90 frames. 
], batch size: 232, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:17:00,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:17:02,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:04,313 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 18:17:07,706 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 18:17:07,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:17:07,746 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 18:17:09,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 18:17:10,510 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 18:17:11,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.57 vs. limit=10.0 2023-10-02 18:17:11,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:17:11,947 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 18:17:13,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. 
Duration: 0.97 2023-10-02 18:17:13,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=973580.0, ans=0.2 2023-10-02 18:17:14,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 18:17:16,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:17:16,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 18:17:17,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 18:17:19,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:17:19,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 18:17:22,679 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 18:17:22,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 18:17:31,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:17:31,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:31,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:17:31,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 18:17:31,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=973646.6666666666, ans=0.09899494936611666 2023-10-02 18:17:33,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:17:39,881 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 18:17:42,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.16 vs. limit=15.0 2023-10-02 18:17:44,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:44,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:17:45,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 18:17:45,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:17:45,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:17:46,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 18:17:50,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 18:17:50,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:17:53,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:17:56,169 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.832e+02 1.959e+02 2.190e+02 4.032e+02, threshold=3.917e+02, percent-clipped=2.0 2023-10-02 18:17:56,285 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 18:17:56,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:17:56,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:18:01,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:18:01,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:18:01,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 18:18:02,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:18:03,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:18:03,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=973780.0, ans=0.125 2023-10-02 18:18:04,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:18:08,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=973780.0, ans=0.0 2023-10-02 18:18:08,759 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.15 vs. limit=15.0 2023-10-02 18:18:11,957 INFO [train.py:1046] (1/4) Epoch 28, batch 2650, loss[loss=0.1721, simple_loss=0.2588, pruned_loss=0.04267, over 24067.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2451, pruned_loss=0.04442, over 4726971.51 frames. ], batch size: 80, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:18:12,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 18:18:13,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:14,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:18:17,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 18:18:17,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:18:17,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=973846.6666666666, ans=0.125 2023-10-02 18:18:19,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=973846.6666666666, ans=0.2 2023-10-02 18:18:20,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:18:20,886 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 18:18:20,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:18:22,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:23,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=973846.6666666666, ans=0.0 2023-10-02 18:18:26,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:18:28,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:18:30,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:18:32,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 18:18:33,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 18:18:33,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:18:35,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 18:18:36,722 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 18:18:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:18:41,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 18:18:41,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:18:42,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 18:18:45,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:45,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:18:45,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=973980.0, ans=0.09899494936611666 2023-10-02 18:18:46,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:47,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:18:48,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=973980.0, ans=0.125 2023-10-02 18:18:49,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 18:18:49,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 18:18:53,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:18:56,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 18:18:56,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:56,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:18:57,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:18:57,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:18:59,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:19:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:19:02,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:19:02,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:19:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:19:03,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:19:04,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:04,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:19:06,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:07,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:19:07,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=974046.6666666666, ans=0.0 2023-10-02 18:19:09,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:19:13,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:13,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:19:13,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:19:14,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 18:19:16,709 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:19:18,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:19:20,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:21,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:22,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.27 vs. limit=15.0 2023-10-02 18:19:22,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:24,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:19:24,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:25,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=974180.0, ans=0.0 2023-10-02 18:19:26,042 INFO [train.py:1046] (1/4) Epoch 28, batch 2700, loss[loss=0.1644, simple_loss=0.2524, pruned_loss=0.03813, over 24383.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2453, pruned_loss=0.04426, over 4725870.80 frames. 
], batch size: 77, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:19:26,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:19:26,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 18:19:28,457 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-10-02 18:19:29,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:19:31,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 18:19:33,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:19:33,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:33,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:36,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:19:36,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:36,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 18:19:36,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:19:36,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 18:19:37,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:19:38,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:19:40,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:19:40,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:43,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:19:43,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 18:19:44,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=974246.6666666666, ans=0.025 2023-10-02 18:19:45,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:19:50,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:19:50,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:19:55,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:19:57,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:19:57,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:19:57,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:20:01,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:04,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:20:04,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:20:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:20:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:08,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:20:15,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 18:20:15,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:20:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:20:20,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:23,617 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.829e+02 1.997e+02 2.292e+02 3.269e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 18:20:23,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:25,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:26,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:20:28,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:30,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:30,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:20:31,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 18:20:32,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:20:32,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:20:36,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 18:20:36,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:39,694 INFO [train.py:1046] (1/4) Epoch 28, batch 2750, loss[loss=0.1769, simple_loss=0.2622, pruned_loss=0.04577, over 24388.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2453, pruned_loss=0.04459, over 4729868.38 frames. ], batch size: 77, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:20:39,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:20:39,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 18:20:41,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 18:20:41,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:45,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:20:45,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:20:45,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=974513.3333333334, ans=0.0 2023-10-02 18:20:47,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:49,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:20:49,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:51,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-10-02 18:20:51,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:20:53,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:20:53,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:20:53,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:53,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 18:20:53,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 18:20:53,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:21:00,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 18:21:02,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:21:02,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:02,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:21:04,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:21:05,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:06,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:21:06,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:08,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:09,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=974646.6666666666, ans=0.0 2023-10-02 18:21:11,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 18:21:12,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:21:12,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:21:12,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:12,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=974646.6666666666, ans=0.125 2023-10-02 18:21:14,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:21:20,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:21,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=974646.6666666666, ans=0.07 2023-10-02 18:21:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:21:22,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:25,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:25,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 18:21:26,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:21:31,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:21:33,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:21:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 18:21:37,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:38,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 18:21:44,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:21:46,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:21:46,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 18:21:49,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:21:50,293 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.41 vs. limit=15.0 2023-10-02 18:21:51,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 18:21:51,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 18:21:52,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:21:54,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=974846.6666666666, ans=0.125 2023-10-02 18:21:54,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=974846.6666666666, ans=0.2 2023-10-02 18:21:55,385 INFO [train.py:1046] (1/4) Epoch 28, batch 2800, loss[loss=0.1588, simple_loss=0.2457, pruned_loss=0.03589, over 24329.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2431, pruned_loss=0.0439, over 4714689.41 frames. ], batch size: 74, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:21:55,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 18:21:56,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:21:56,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:21:56,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 18:21:56,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:58,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:21:59,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:59,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 18:21:59,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 18:22:03,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:22:05,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:22:05,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:22:07,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:22:10,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 18:22:11,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 18:22:11,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 18:22:14,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:16,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 18:22:16,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:19,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:22:19,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:19,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:22:21,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:22:28,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:22:29,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:22:31,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:31,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:22:32,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:33,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=974980.0, ans=0.0 2023-10-02 18:22:37,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 18:22:37,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 18:22:37,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:22:38,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:22:38,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:22:42,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:22:43,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:46,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:22:47,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:22:48,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:48,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:22:49,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:22:49,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 18:22:50,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:50,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 18:22:50,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:22:52,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:22:52,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:22:53,779 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.840e+02 2.003e+02 2.153e+02 3.195e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-02 18:22:53,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 18:22:54,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:54,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=975113.3333333334, ans=0.125 2023-10-02 18:22:54,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.50 vs. limit=12.0 2023-10-02 18:22:55,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 18:22:55,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:22:56,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 18:23:01,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=975113.3333333334, ans=0.0 2023-10-02 18:23:02,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:23:02,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:23:02,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:23:03,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=975113.3333333334, ans=0.125 2023-10-02 18:23:05,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:09,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:23:09,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:09,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:23:10,546 INFO [train.py:1046] (1/4) Epoch 28, batch 2850, loss[loss=0.1549, simple_loss=0.2258, pruned_loss=0.04201, over 22483.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2432, pruned_loss=0.04383, over 4712873.38 frames. ], batch size: 49, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:23:12,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:12,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:23:13,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:23:14,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 18:23:19,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=975180.0, ans=0.125 2023-10-02 18:23:20,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 18:23:20,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:22,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 18:23:24,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:27,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. 
Duration: 0.86 2023-10-02 18:23:28,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 18:23:30,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:39,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=975313.3333333334, ans=0.1 2023-10-02 18:23:40,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:42,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:23:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:23:43,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:23:43,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:23:43,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:23:44,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=975313.3333333334, ans=0.0 2023-10-02 18:23:47,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:23:47,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. 
Duration: 0.69 2023-10-02 18:23:48,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:23:50,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:23:50,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:50,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:50,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=975313.3333333334, ans=0.2 2023-10-02 18:23:53,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:54,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:54,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:55,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-10-02 18:23:57,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:23:59,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:23:59,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:00,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:24:05,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:24:06,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 18:24:06,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 18:24:09,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:24:09,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:10,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 18:24:11,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:24:11,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:13,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:24:13,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:24:13,558 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 18:24:13,594 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 18:24:13,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:24:14,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:17,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=975446.6666666666, ans=0.125 2023-10-02 18:24:19,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:24:21,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:24:21,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:24:22,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 18:24:23,881 INFO [train.py:1046] (1/4) Epoch 28, batch 2900, loss[loss=0.1601, simple_loss=0.2418, pruned_loss=0.03922, over 23435.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2434, pruned_loss=0.04394, over 4705788.48 frames. ], batch size: 119, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:24:27,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:24:27,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 18:24:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 18:24:30,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:24:30,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:24:32,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:24:33,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:24:37,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:24:37,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:40,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:24:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 18:24:41,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:24:43,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:24:45,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 18:24:46,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 18:24:49,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:49,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 18:24:50,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:24:51,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:24:51,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:24:53,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:24:55,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:58,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:25:01,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:03,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. 
Duration: 0.71 2023-10-02 18:25:03,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 18:25:03,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:25:06,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:25:07,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 18:25:08,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:25:10,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=975713.3333333334, ans=0.2 2023-10-02 18:25:13,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:25:22,908 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.803e+02 1.962e+02 2.123e+02 3.494e+02, threshold=3.924e+02, percent-clipped=0.0 2023-10-02 18:25:22,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:25:23,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:25:24,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 18:25:27,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:25:27,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 18:25:27,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:25:29,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:25:30,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=975780.0, ans=0.07 2023-10-02 18:25:33,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.46 vs. limit=22.5 2023-10-02 18:25:36,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:25:37,847 INFO [train.py:1046] (1/4) Epoch 28, batch 2950, loss[loss=0.1779, simple_loss=0.2475, pruned_loss=0.05412, over 23719.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2441, pruned_loss=0.04431, over 4710164.96 frames. ], batch size: 179, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:25:37,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 18:25:39,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:25:39,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:40,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:25:40,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=975846.6666666666, ans=0.125 2023-10-02 18:25:43,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:25:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 18:25:44,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 18:25:46,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:25:46,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:25:49,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:25:50,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=975913.3333333334, ans=0.0 2023-10-02 18:25:52,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:25:54,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:25:54,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:25:57,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:25:57,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:25:59,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:59,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:26:00,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:26:02,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 18:26:06,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 18:26:06,453 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 18:26:06,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=975980.0, ans=0.0 2023-10-02 18:26:07,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:26:09,096 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 18:26:09,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 18:26:10,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:26:10,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:26:10,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 18:26:10,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:26:13,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 18:26:13,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:26:15,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:26:16,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:26:17,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:26:17,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:19,183 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 18:26:19,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:26:19,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-10-02 18:26:20,081 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.83 vs. limit=15.0 2023-10-02 18:26:24,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:25,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:26:27,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 18:26:27,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:26:29,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 18:26:29,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=976046.6666666666, ans=0.1 2023-10-02 18:26:33,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:26:34,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:26:34,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:26:37,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:26:37,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:26:38,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:26:40,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:40,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:26:40,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:26:41,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:26:41,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:26:43,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:44,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 18:26:44,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:47,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:26:47,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 18:26:51,599 INFO [train.py:1046] (1/4) Epoch 28, batch 3000, loss[loss=0.1847, simple_loss=0.2556, pruned_loss=0.05688, over 23824.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2443, pruned_loss=0.04427, over 4718312.20 frames. ], batch size: 212, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:26:51,599 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 18:27:03,155 INFO [train.py:1078] (1/4) Epoch 28, validation: loss=0.3199, simple_loss=0.2738, pruned_loss=0.183, over 1125622.00 frames. 2023-10-02 18:27:03,155 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 18:27:03,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 18:27:04,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 18:27:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:27:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:27:06,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=976180.0, ans=0.125 2023-10-02 18:27:07,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 18:27:07,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:27:13,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 18:27:23,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:27:24,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=15.0 2023-10-02 18:27:28,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 18:27:30,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:27:34,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:27:34,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:27:34,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:27:36,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=15.0 2023-10-02 18:27:37,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:27:37,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 18:27:40,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-10-02 18:27:41,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:27:43,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:27:45,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:27:45,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:27:45,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:27:45,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:27:48,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:27:49,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:27:49,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:27:51,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:27:54,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 18:27:54,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 18:27:55,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:27:56,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:28:00,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:00,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:01,806 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.829e+02 2.050e+02 2.331e+02 3.345e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 18:28:01,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 18:28:03,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 18:28:03,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:03,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 18:28:03,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 18:28:03,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=976446.6666666666, ans=0.07 2023-10-02 18:28:05,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=976446.6666666666, ans=0.0 2023-10-02 18:28:06,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 18:28:08,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:28:09,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:28:09,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 18:28:10,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 18:28:10,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:28:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:28:13,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:13,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:28:13,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:28:14,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:28:16,137 INFO [train.py:1046] (1/4) Epoch 28, batch 3050, loss[loss=0.1648, simple_loss=0.2553, pruned_loss=0.03719, over 24557.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2452, pruned_loss=0.04449, over 4716473.81 frames. ], batch size: 71, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:28:16,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 18:28:19,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:28:20,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:20,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:28:24,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:25,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.19 vs. limit=15.0 2023-10-02 18:28:27,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 18:28:27,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=976513.3333333334, ans=0.125 2023-10-02 18:28:34,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-02 18:28:34,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 18:28:34,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:28:37,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:28:40,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:40,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:40,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:28:43,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:28:43,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:28:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:28:46,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:28:47,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:28:49,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=976646.6666666666, ans=0.2 2023-10-02 18:28:52,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:52,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 18:28:53,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:53,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:28:56,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:28:58,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:28:58,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:28:59,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:04,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:29:04,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 18:29:11,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:12,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:29:12,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:29:14,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:29:14,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:29:14,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:29:17,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 18:29:18,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:29:20,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:20,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 18:29:21,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:27,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 18:29:27,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:29:28,530 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.94 vs. limit=15.0 2023-10-02 18:29:29,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:29:30,977 INFO [train.py:1046] (1/4) Epoch 28, batch 3100, loss[loss=0.1648, simple_loss=0.2293, pruned_loss=0.05011, over 23406.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2454, pruned_loss=0.0445, over 4714263.71 frames. ], batch size: 285, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:29:31,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 18:29:33,778 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.17 vs. limit=22.5 2023-10-02 18:29:34,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 18:29:34,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 18:29:35,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.28 vs. limit=22.5 2023-10-02 18:29:35,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:29:41,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:29:41,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:44,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 18:29:47,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:51,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 18:29:56,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:29:57,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:29:57,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:29:57,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:29:57,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=976913.3333333334, ans=0.125 2023-10-02 18:29:58,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 18:30:00,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:30:00,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. 
Duration: 0.62 2023-10-02 18:30:00,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:30:01,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=976980.0, ans=0.0 2023-10-02 18:30:02,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:30:04,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 18:30:05,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:30:08,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:30:10,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 18:30:10,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 18:30:10,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:11,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:30:13,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=976980.0, ans=0.1 2023-10-02 18:30:14,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:30:15,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:15,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:30:18,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:30:18,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:30:19,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:30:19,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:30:19,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:19,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 18:30:25,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:30:25,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 18:30:27,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:30:28,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-10-02 18:30:28,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:29,752 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.812e+02 1.984e+02 2.223e+02 4.054e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-02 18:30:29,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 18:30:30,904 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.34 vs. limit=15.0 2023-10-02 18:30:38,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=977113.3333333334, ans=0.0 2023-10-02 18:30:41,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 18:30:44,700 INFO [train.py:1046] (1/4) Epoch 28, batch 3150, loss[loss=0.1807, simple_loss=0.2482, pruned_loss=0.05656, over 23733.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2432, pruned_loss=0.04393, over 4718584.39 frames. ], batch size: 179, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:30:44,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:30:44,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:47,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:30:47,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:30:48,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 18:30:50,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:30:50,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:30:51,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 18:30:54,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:55,237 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-02 18:30:57,338 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 18:30:58,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 18:31:00,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:31:00,921 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 18:31:00,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 18:31:03,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 18:31:03,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 18:31:03,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 18:31:03,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:31:03,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:31:06,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:31:08,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 18:31:09,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:31:10,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:31:11,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:31:12,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:31:17,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. 
Duration: 0.96 2023-10-02 18:31:17,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:31:20,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:31:20,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:31:21,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 18:31:24,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 18:31:26,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:31:26,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 18:31:26,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:31:27,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:31:27,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:31:28,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:31:28,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 18:31:28,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 18:31:28,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:31:28,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:32,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:31:32,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:31:33,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 18:31:33,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:31:35,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 18:31:35,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:36,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 18:31:37,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 18:31:39,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 18:31:39,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:31:41,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 18:31:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 18:31:43,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:31:46,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:31:47,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:47,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:31:51,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:31:53,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:54,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-10-02 18:31:54,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=977446.6666666666, ans=0.125 2023-10-02 18:31:58,496 INFO [train.py:1046] (1/4) Epoch 28, batch 3200, loss[loss=0.1599, simple_loss=0.2359, pruned_loss=0.04197, over 23690.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2424, pruned_loss=0.04344, over 4727672.33 frames. ], batch size: 149, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:31:58,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:31:58,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:32:04,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:32:05,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:32:05,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 18:32:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:32:12,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:32:17,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=977580.0, ans=0.0 2023-10-02 18:32:18,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:32:25,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:32:27,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=977646.6666666666, ans=0.0 2023-10-02 18:32:31,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=977646.6666666666, ans=0.07 2023-10-02 18:32:34,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 18:32:35,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:32:36,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=977646.6666666666, ans=0.125 2023-10-02 18:32:38,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 18:32:40,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:32:41,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:32:41,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:32:43,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:32:46,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-10-02 18:32:48,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 18:32:49,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 18:32:50,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=977713.3333333334, ans=0.1 2023-10-02 18:32:52,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 18:32:53,114 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.06 vs. limit=15.0 2023-10-02 18:32:54,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:32:55,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=977713.3333333334, ans=0.1 2023-10-02 18:32:58,088 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.399e+02 1.829e+02 1.986e+02 2.158e+02 2.804e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 18:33:00,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:00,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:33:00,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:02,296 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. 
Duration: 0.896 2023-10-02 18:33:02,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:33:05,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:05,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 18:33:07,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 18:33:07,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 18:33:07,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=977780.0, ans=0.1 2023-10-02 18:33:09,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 18:33:11,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:33:13,363 INFO [train.py:1046] (1/4) Epoch 28, batch 3250, loss[loss=0.183, simple_loss=0.2536, pruned_loss=0.0562, over 23791.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2423, pruned_loss=0.04367, over 4713924.25 frames. ], batch size: 232, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:33:15,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:33:15,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. 
Duration: 0.896 2023-10-02 18:33:15,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:33:15,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:16,816 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 18:33:20,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:33:23,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:33:30,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:33:30,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 18:33:30,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:32,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:33:33,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:33:33,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 18:33:36,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:36,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:33:36,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:36,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:36,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:38,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:33:41,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:33:44,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:33:44,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=977980.0, ans=0.0 2023-10-02 18:33:45,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:45,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:47,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:33:49,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:33:49,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:33:53,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 18:33:53,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=977980.0, ans=0.2 2023-10-02 18:33:54,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:33:54,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:33:54,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:55,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=977980.0, ans=0.2 2023-10-02 18:33:56,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:34:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:34:09,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:34:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:34:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 18:34:09,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:34:09,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:34:09,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:12,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 18:34:12,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 18:34:12,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=978113.3333333334, ans=0.2 2023-10-02 18:34:13,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:34:15,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:15,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:34:15,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 18:34:17,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:34:18,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=978113.3333333334, ans=0.0 2023-10-02 18:34:21,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:34:21,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:34:24,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 18:34:24,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:24,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=978113.3333333334, ans=0.125 2023-10-02 18:34:27,062 INFO [train.py:1046] (1/4) Epoch 28, batch 3300, loss[loss=0.1646, simple_loss=0.2381, pruned_loss=0.04551, over 23882.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2433, pruned_loss=0.04375, over 4719588.15 frames. ], batch size: 195, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:34:27,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:34:27,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 18:34:29,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:34:29,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. 
Duration: 0.76 2023-10-02 18:34:30,705 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.01 vs. limit=22.5 2023-10-02 18:34:31,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 18:34:32,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 18:34:32,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:35,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:34:38,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:34:38,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:38,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=978180.0, ans=0.125 2023-10-02 18:34:41,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:34:41,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:34:44,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:45,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:34:48,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=978246.6666666666, ans=0.2 2023-10-02 18:34:49,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 18:34:49,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:34:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:52,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:52,377 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 18:34:53,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:34:55,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:34:55,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:34:56,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:34:57,016 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 18:34:59,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:34:59,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:35:02,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:02,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 18:35:04,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 18:35:04,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:04,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:35:05,540 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 18:35:08,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 18:35:09,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:35:11,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 18:35:14,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:35:16,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 18:35:16,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:35:19,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:20,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:35:20,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:35:20,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:35:22,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=978380.0, ans=0.0 2023-10-02 18:35:23,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:35:23,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:23,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:35:25,228 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. 
Duration: 0.896 2023-10-02 18:35:25,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=978446.6666666666, ans=0.125 2023-10-02 18:35:26,523 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.840e+02 2.125e+02 2.554e+02 4.181e+02, threshold=4.250e+02, percent-clipped=1.0 2023-10-02 18:35:26,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 18:35:28,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:35:28,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:35:28,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:31,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:35:31,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:34,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:35:35,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:35,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 18:35:35,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:36,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:35:38,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 18:35:38,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:40,907 INFO [train.py:1046] (1/4) Epoch 28, batch 3350, loss[loss=0.1511, simple_loss=0.2278, pruned_loss=0.0372, over 24634.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.244, pruned_loss=0.04404, over 4723143.22 frames. ], batch size: 60, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:35:40,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:42,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:35:43,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:35:43,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:44,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=978513.3333333334, ans=0.2 2023-10-02 18:35:45,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:35:45,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:48,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:35:50,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:51,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:35:53,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=978513.3333333334, ans=10.0 2023-10-02 18:35:54,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:55,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:35:55,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:57,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:35:59,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 18:35:59,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=978580.0, ans=0.0 2023-10-02 18:36:00,547 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. 
Duration: 0.896 2023-10-02 18:36:00,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:36:03,490 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:36:04,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 18:36:04,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 18:36:04,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:36:05,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:36:07,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:08,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 18:36:08,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:08,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:36:10,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:11,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:36:13,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:13,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:36:15,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:18,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:18,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:22,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:36:24,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:25,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:25,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:27,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:28,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 18:36:28,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 18:36:28,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 18:36:29,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:36:31,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 18:36:33,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:34,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:43,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:43,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 18:36:43,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:36:43,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=978780.0, ans=0.125 2023-10-02 18:36:44,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:36:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:36:51,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:36:54,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 18:36:54,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:36:54,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:36:55,462 INFO [train.py:1046] (1/4) Epoch 28, batch 3400, loss[loss=0.1461, simple_loss=0.2258, pruned_loss=0.03319, over 24313.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.245, pruned_loss=0.04462, over 4714728.12 frames. ], batch size: 56, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:36:55,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:55,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 18:36:56,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:56,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 18:36:57,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:36:57,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:36:58,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 18:36:58,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:36:58,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 18:37:01,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 18:37:01,399 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 18:37:03,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:07,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:37:07,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:37:07,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:08,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:37:12,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:37:13,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 18:37:18,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 18:37:18,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=978913.3333333334, ans=0.125 2023-10-02 18:37:18,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=978913.3333333334, ans=0.125 2023-10-02 18:37:19,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:19,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:37:20,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:37:25,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:37:29,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 18:37:35,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:36,182 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.31 vs. limit=6.0 2023-10-02 18:37:36,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:38,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-02 18:37:38,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:37:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:37:39,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:37:39,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:37:40,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=979046.6666666666, ans=0.0 2023-10-02 18:37:43,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:49,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:37:49,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:37:53,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:37:54,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 18:37:55,882 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.833e+02 2.007e+02 2.238e+02 3.330e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-02 18:38:00,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 18:38:04,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 18:38:07,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 18:38:07,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:38:09,001 INFO [train.py:1046] (1/4) Epoch 28, batch 3450, loss[loss=0.1594, simple_loss=0.2516, pruned_loss=0.0336, over 24464.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2456, pruned_loss=0.0447, over 4714106.28 frames. ], batch size: 69, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:38:10,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:38:10,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 18:38:11,068 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.62 vs. limit=22.5 2023-10-02 18:38:12,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:38:17,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:38:20,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:38:21,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:38:21,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:38:21,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:22,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:28,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 18:38:29,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=979246.6666666666, ans=0.0 2023-10-02 18:38:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 18:38:35,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:38:37,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:38:38,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:38:44,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 18:38:45,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 18:38:47,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=979313.3333333334, ans=0.2 2023-10-02 18:38:49,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:38:49,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:38:49,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=979313.3333333334, ans=0.0 2023-10-02 18:38:52,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:38:53,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:38:55,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 18:38:55,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:38:56,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:57,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:39:00,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-10-02 18:39:02,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=979380.0, ans=0.125 2023-10-02 18:39:04,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:39:08,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:39:09,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.99 vs. limit=15.0 2023-10-02 18:39:09,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:12,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:19,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:39:19,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=979446.6666666666, ans=0.0 2023-10-02 18:39:20,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:39:21,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:39:23,142 INFO [train.py:1046] (1/4) Epoch 28, batch 3500, loss[loss=0.1505, simple_loss=0.242, pruned_loss=0.02953, over 24460.00 frames. ], tot_loss[loss=0.166, simple_loss=0.244, pruned_loss=0.04398, over 4714076.66 frames. ], batch size: 66, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:39:26,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:28,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:39:28,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 18:39:29,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=979513.3333333334, ans=0.125 2023-10-02 18:39:31,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:39:35,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:39:36,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:36,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 18:39:41,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:39:42,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 18:39:44,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:39:44,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:39:44,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:39:46,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:48,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:39:48,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 18:39:49,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:49,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:39:51,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:39:53,637 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.99 vs. limit=22.5 2023-10-02 18:39:55,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:55,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. 
Duration: 0.97 2023-10-02 18:39:55,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:39:58,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.41 vs. limit=5.0 2023-10-02 18:39:58,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:39:58,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:39:59,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:01,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:40:01,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:40:01,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 18:40:02,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 18:40:04,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 18:40:04,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:40:06,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:40:08,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:40:08,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:40:11,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:40:12,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:40:13,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=979713.3333333334, ans=0.125 2023-10-02 18:40:18,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:40:19,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 18:40:19,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 18:40:19,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:40:22,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=979780.0, ans=0.0 2023-10-02 18:40:23,422 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.910e+02 2.082e+02 2.470e+02 3.693e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-02 18:40:23,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:40:23,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:40:24,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:27,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 18:40:28,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:40:30,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:40:30,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 18:40:31,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 18:40:33,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:35,945 INFO [train.py:1046] (1/4) Epoch 28, batch 3550, loss[loss=0.1613, simple_loss=0.2299, pruned_loss=0.0464, over 23437.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2429, pruned_loss=0.04384, over 4716384.64 frames. ], batch size: 285, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:40:35,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:40:36,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:40:36,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:40:38,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:40:46,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:40:47,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.09 vs. limit=15.0 2023-10-02 18:40:48,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=979846.6666666666, ans=0.1 2023-10-02 18:40:50,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 18:40:50,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=979913.3333333334, ans=0.125 2023-10-02 18:40:52,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:40:54,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:40:55,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:40:56,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:40:56,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:40:59,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:41:00,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:41:01,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:41:01,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:41:02,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:41:07,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:41:07,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:41:09,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:41:09,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:41:10,108 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.36 vs. limit=22.5 2023-10-02 18:41:10,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 18:41:10,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 18:41:10,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:12,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:14,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:41:16,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=979980.0, ans=0.0 2023-10-02 18:41:20,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:41:20,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:41:20,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:41:24,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 18:41:25,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:41:26,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 18:41:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 18:41:29,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:41:29,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:41:30,099 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-02 18:41:32,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 18:41:33,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:41:37,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:41:39,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 18:41:39,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:41:43,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:46,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 18:41:49,886 INFO [train.py:1046] (1/4) Epoch 28, batch 3600, loss[loss=0.1701, simple_loss=0.2472, pruned_loss=0.04649, over 18007.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2424, pruned_loss=0.04373, over 4708066.68 frames. 
], batch size: 39, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:41:54,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 18:41:54,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:41:55,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:41:58,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:42:00,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:42:00,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:42:03,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:42:04,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:05,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:42:05,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:42:06,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:42:06,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 18:42:10,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:42:10,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:13,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:42:16,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:42:17,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:42:17,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:42:18,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 18:42:19,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:42:21,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:22,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:42:22,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:42:23,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=980313.3333333334, ans=0.0 2023-10-02 18:42:25,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:42:26,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:42:28,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 18:42:33,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:42:34,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:42:34,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 18:42:39,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:42:45,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:49,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:50,299 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.786e+02 1.946e+02 2.303e+02 3.332e+02, threshold=3.893e+02, percent-clipped=0.0 2023-10-02 18:42:54,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 18:42:54,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:42:54,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 18:42:55,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 18:42:57,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 18:42:59,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:43:00,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:43:00,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 18:43:02,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:02,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:43:02,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:43:03,429 INFO [train.py:1046] (1/4) Epoch 28, batch 3650, loss[loss=0.1652, simple_loss=0.2411, pruned_loss=0.04461, over 23816.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2431, pruned_loss=0.04405, over 4692058.35 frames. 
], batch size: 212, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:43:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 18:43:04,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 18:43:09,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:43:09,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 18:43:11,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 18:43:13,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:43:13,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=980513.3333333334, ans=0.125 2023-10-02 18:43:17,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 18:43:19,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 18:43:23,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:43:23,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:43:23,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 18:43:26,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:43:26,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:43:28,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 18:43:28,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:43:29,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 18:43:29,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:43:30,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=980580.0, ans=0.2 2023-10-02 18:43:31,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:43:31,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:32,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:43:35,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. 
Duration: 0.77 2023-10-02 18:43:36,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 18:43:36,891 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:43:38,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:43:38,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 18:43:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:43:40,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:43:42,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=980646.6666666666, ans=0.2 2023-10-02 18:43:45,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:43:45,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=980713.3333333334, ans=0.125 2023-10-02 18:43:47,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:47,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:43:49,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 18:43:49,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:43:51,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:43:53,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:53,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:43:54,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:43:56,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:43:58,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:59,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:02,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=980780.0, ans=0.125 2023-10-02 18:44:06,518 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 18:44:06,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=980780.0, ans=0.125 2023-10-02 18:44:07,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:44:07,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:09,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:44:10,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:10,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:44:12,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:12,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=980780.0, ans=0.1 2023-10-02 18:44:13,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 18:44:13,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:13,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=980780.0, ans=0.2 2023-10-02 18:44:16,240 INFO [train.py:1046] (1/4) Epoch 28, batch 3700, loss[loss=0.1761, simple_loss=0.2563, pruned_loss=0.04793, over 23402.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2446, pruned_loss=0.04452, over 4689311.44 frames. ], batch size: 93, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:44:18,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 18:44:20,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:44:20,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-10-02 18:44:21,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:44:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:24,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 18:44:24,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:24,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 18:44:25,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:44:26,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=980846.6666666666, ans=0.1 2023-10-02 18:44:29,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:44:31,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:44:33,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:44:33,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:44:34,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:34,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=980913.3333333334, ans=0.0 2023-10-02 18:44:35,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:44:36,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=980913.3333333334, ans=0.0 2023-10-02 18:44:38,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:44:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 18:44:45,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:44:45,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:44:45,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:44:45,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. 
Duration: 0.96 2023-10-02 18:44:45,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:44:50,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:52,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 18:44:54,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:55,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:44:58,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:45:00,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:45:01,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:45:04,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:45:04,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 18:45:05,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:45:05,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. 
Duration: 0.98 2023-10-02 18:45:05,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=981046.6666666666, ans=0.0 2023-10-02 18:45:08,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=981046.6666666666, ans=0.125 2023-10-02 18:45:08,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=981046.6666666666, ans=0.125 2023-10-02 18:45:11,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:45:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:45:13,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:15,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 18:45:16,603 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.928e+02 2.164e+02 2.452e+02 3.426e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-02 18:45:16,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:45:16,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:45:16,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 18:45:16,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:20,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:45:20,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 18:45:21,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 18:45:21,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:45:23,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:25,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:45:26,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:45:29,014 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.63 vs. limit=15.0 2023-10-02 18:45:29,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:45:31,126 INFO [train.py:1046] (1/4) Epoch 28, batch 3750, loss[loss=0.1546, simple_loss=0.2271, pruned_loss=0.04104, over 24456.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2456, pruned_loss=0.04511, over 4700213.60 frames. 
], batch size: 58, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:45:31,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:45:32,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:45:35,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 18:45:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 18:45:38,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:45:38,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 18:45:39,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:45:40,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:40,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:42,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:45:46,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:45:49,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:45:51,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:45:52,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:54,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:45:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 18:45:57,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:45:58,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:45:59,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:46:02,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 18:46:05,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 18:46:07,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:46:07,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 18:46:09,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:46:14,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=981380.0, ans=0.0 2023-10-02 18:46:15,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:17,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:46:20,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 18:46:23,637 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.32 vs. limit=10.0 2023-10-02 18:46:24,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:26,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:46:28,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:46:29,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.35 vs. limit=22.5 2023-10-02 18:46:30,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 18:46:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:46:35,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:46:36,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=981446.6666666666, ans=0.125 2023-10-02 18:46:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:46:40,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:46:42,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:46:44,016 INFO [train.py:1046] (1/4) Epoch 28, batch 3800, loss[loss=0.1502, simple_loss=0.2273, pruned_loss=0.03657, over 24352.00 frames. ], tot_loss[loss=0.168, simple_loss=0.246, pruned_loss=0.04503, over 4704489.25 frames. ], batch size: 56, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:46:50,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:46:54,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:46:55,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:46:56,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. 
Duration: 0.99 2023-10-02 18:46:58,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:47:00,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:00,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=981580.0, ans=0.125 2023-10-02 18:47:01,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:47:03,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 18:47:03,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:03,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:47:04,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=981580.0, ans=0.125 2023-10-02 18:47:05,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:47:05,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:47:05,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-10-02 18:47:10,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 18:47:10,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:47:12,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:14,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:47:15,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:47:16,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:47:17,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:20,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:20,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:20,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=981646.6666666666, ans=0.125 2023-10-02 18:47:25,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:47:25,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. 
Duration: 0.7 2023-10-02 18:47:27,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:47:34,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:47:38,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:47:40,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 18:47:42,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 18:47:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:43,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:47:45,042 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.791e+02 1.970e+02 2.183e+02 2.893e+02, threshold=3.939e+02, percent-clipped=0.0 2023-10-02 18:47:45,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:46,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 18:47:51,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 18:47:51,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. 
Duration: 0.65 2023-10-02 18:47:52,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:52,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:47:57,283 INFO [train.py:1046] (1/4) Epoch 28, batch 3850, loss[loss=0.1556, simple_loss=0.2296, pruned_loss=0.04081, over 24320.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2446, pruned_loss=0.04474, over 4699371.56 frames. ], batch size: 56, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:47:57,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:47:58,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:48:03,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:48:03,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 18:48:05,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:48:05,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:48:09,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:48:10,375 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.69 vs. 
limit=15.0 2023-10-02 18:48:12,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:48:13,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:48:14,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 18:48:19,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:22,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:48:23,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:48:25,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:48:28,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:28,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:48:29,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:48:29,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:48:31,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:48:33,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:48:34,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:34,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:48:34,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 18:48:34,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 18:48:36,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:48:37,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:40,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:40,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 18:48:41,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 18:48:44,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.69 vs. 
limit=15.0 2023-10-02 18:48:44,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:46,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 18:48:47,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:48:52,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:52,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:56,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:58,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 18:49:00,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 18:49:02,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:02,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:05,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:49:05,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 18:49:06,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:06,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:06,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:49:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 18:49:08,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:49:09,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 18:49:09,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:09,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:10,990 INFO [train.py:1046] (1/4) Epoch 28, batch 3900, loss[loss=0.1567, simple_loss=0.2476, pruned_loss=0.03292, over 24683.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.243, pruned_loss=0.04425, over 4702750.89 frames. ], batch size: 73, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:49:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:49:13,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:49:15,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:49:15,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:15,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:49:17,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:49:17,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 18:49:18,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:22,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:49:22,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:49:22,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:49:24,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:49:25,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:49:25,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 18:49:28,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:49:30,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 18:49:30,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:49:30,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 18:49:32,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:32,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 18:49:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 18:49:34,650 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=15.0 2023-10-02 18:49:38,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:49:40,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:49:40,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:49:40,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 18:49:45,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:49:47,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:49:49,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:49:49,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:49:49,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:49:55,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:49:55,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=982380.0, ans=10.0 2023-10-02 18:49:56,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:49:57,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=982380.0, ans=0.2 2023-10-02 18:50:00,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.32 vs. limit=15.0 2023-10-02 18:50:03,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-02 18:50:04,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:50:13,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.882e+02 2.101e+02 2.412e+02 3.470e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-02 18:50:13,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:50:16,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:50:16,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 18:50:16,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 18:50:16,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:50:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 18:50:19,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:50:19,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 18:50:24,520 INFO [train.py:1046] (1/4) Epoch 28, batch 3950, loss[loss=0.1709, simple_loss=0.245, pruned_loss=0.04835, over 23761.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2427, pruned_loss=0.04406, over 4709642.26 frames. 
], batch size: 232, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:50:25,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:50:26,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 18:50:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:50:29,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:50:31,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:50:34,418 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 18:50:35,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:50:35,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 18:50:35,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=982513.3333333334, ans=10.0 2023-10-02 18:50:37,077 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 18:50:37,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:50:38,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:50:39,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:50:39,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:50:41,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 18:50:43,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:50:44,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:50:44,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:50:44,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:50:45,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:50:56,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:50:56,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:51:02,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 18:51:08,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. 
Duration: 0.99 2023-10-02 18:51:08,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 18:51:09,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:51:10,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=982713.3333333334, ans=0.1 2023-10-02 18:51:12,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:51:18,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:51:18,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:51:18,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:51:19,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:51:19,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 18:51:21,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=982713.3333333334, ans=0.1 2023-10-02 18:51:22,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:51:24,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 18:51:28,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 18:51:33,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=982780.0, ans=0.125 2023-10-02 18:51:38,607 INFO [train.py:1046] (1/4) Epoch 28, batch 4000, loss[loss=0.1742, simple_loss=0.2619, pruned_loss=0.04328, over 23953.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2436, pruned_loss=0.0446, over 4692135.87 frames. ], batch size: 86, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:51:38,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:44,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:51:50,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:51:50,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.74 vs. limit=6.0 2023-10-02 18:51:51,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:51,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 18:51:52,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 18:51:53,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=982913.3333333334, ans=0.125 2023-10-02 18:51:54,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 18:51:54,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:51:54,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 18:51:55,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:51:59,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:51:59,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:51:59,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:51:59,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:51:59,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:52:02,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 18:52:03,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=982913.3333333334, ans=15.0 2023-10-02 18:52:03,823 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 18:52:05,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:52:05,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:05,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=982913.3333333334, ans=0.125 2023-10-02 18:52:08,399 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 18:52:09,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:52:09,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:52:12,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=982980.0, ans=0.125 2023-10-02 18:52:17,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 18:52:18,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:52:18,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 18:52:19,839 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.60 vs. limit=15.0 2023-10-02 18:52:20,165 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 18:52:21,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:52:22,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 18:52:22,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:52:24,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:24,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:52:25,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:52:26,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:52:26,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:52:28,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 18:52:28,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 18:52:31,535 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 18:52:36,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:52:37,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.97 vs. limit=22.5 2023-10-02 18:52:37,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 18:52:40,741 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.514e+02 1.835e+02 2.053e+02 2.244e+02 3.735e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-02 18:52:40,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:52:40,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:52:42,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:52:44,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:52:44,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=983113.3333333334, ans=0.125 2023-10-02 18:52:49,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 18:52:49,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=983113.3333333334, ans=0.05 2023-10-02 18:52:52,210 INFO [train.py:1046] (1/4) Epoch 28, batch 4050, loss[loss=0.1616, simple_loss=0.2494, pruned_loss=0.0369, over 24659.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2445, pruned_loss=0.04457, over 4699924.49 frames. ], batch size: 65, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:52:52,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:52:53,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 18:52:55,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:52:55,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:52:56,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:52:57,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:52:57,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:53:02,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:53:05,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 18:53:06,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:53:08,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:53:08,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:53:11,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.54 vs. limit=22.5 2023-10-02 18:53:13,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:53:15,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:53:18,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=983246.6666666666, ans=0.125 2023-10-02 18:53:19,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 18:53:20,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 18:53:20,604 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 18:53:22,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:53:30,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-02 18:53:30,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:53:34,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:53:34,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=983380.0, ans=0.2 2023-10-02 18:53:34,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=983380.0, ans=0.125 2023-10-02 18:53:35,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:53:36,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.41 vs. limit=15.0 2023-10-02 18:53:37,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:53:37,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:53:40,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:53:45,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 18:53:45,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:53:47,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 18:53:48,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 18:53:51,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:53:58,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 18:53:58,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:53:58,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:54:01,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 18:54:01,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 18:54:01,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:02,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:54:02,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:02,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:54:05,492 INFO [train.py:1046] (1/4) Epoch 28, batch 4100, loss[loss=0.2252, simple_loss=0.2926, pruned_loss=0.07893, over 19631.00 frames. 
], tot_loss[loss=0.1673, simple_loss=0.245, pruned_loss=0.04481, over 4705891.02 frames. ], batch size: 389, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:54:10,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 18:54:10,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=983513.3333333334, ans=0.2 2023-10-02 18:54:12,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 18:54:12,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=983513.3333333334, ans=0.125 2023-10-02 18:54:13,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 18:54:14,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 18:54:14,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:15,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:15,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:15,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:54:17,186 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. 
Duration: 0.989 2023-10-02 18:54:19,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:54:20,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:54:20,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:21,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:54:23,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=983580.0, ans=0.5 2023-10-02 18:54:25,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:54:25,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:54:26,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:54:27,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 18:54:28,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:28,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:54:28,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 18:54:28,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:54:29,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 18:54:32,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:54:35,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 18:54:36,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:54:38,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:54:38,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 18:54:40,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:54:41,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:54:41,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:54:44,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 18:54:47,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 18:54:47,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:54:49,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=983713.3333333334, ans=0.5 2023-10-02 18:54:50,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 18:54:52,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:53,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:54:56,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:55:01,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:01,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=983713.3333333334, ans=0.1 2023-10-02 18:55:03,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=983780.0, ans=0.0 2023-10-02 18:55:04,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:55:05,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:55:07,211 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.958e+02 2.238e+02 2.586e+02 4.135e+02, threshold=4.476e+02, percent-clipped=1.0 2023-10-02 18:55:07,842 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.55 vs. limit=15.0 2023-10-02 18:55:11,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:11,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:55:13,581 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:55:15,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:55:17,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:55:19,515 INFO [train.py:1046] (1/4) Epoch 28, batch 4150, loss[loss=0.1802, simple_loss=0.262, pruned_loss=0.04918, over 23991.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2456, pruned_loss=0.04455, over 4707355.14 frames. ], batch size: 86, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:55:22,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:55:23,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:55:24,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:55:24,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:55:27,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 18:55:27,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:28,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 18:55:28,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 18:55:28,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 18:55:30,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=983846.6666666666, ans=0.125 2023-10-02 18:55:31,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:34,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:55:34,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:34,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=983913.3333333334, ans=0.125 2023-10-02 18:55:39,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 18:55:41,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:55:41,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:55:42,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-02 18:55:42,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:55:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:55:44,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:55:45,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.99 vs. limit=22.5 2023-10-02 18:55:47,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:50,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:55:53,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 18:55:55,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. 
Duration: 0.77 2023-10-02 18:55:55,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:55:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 18:55:57,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:55:57,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:56:00,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:02,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:56:03,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 18:56:07,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:56:09,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:56:09,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 18:56:10,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:56:12,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-10-02 18:56:15,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:56:15,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:56:16,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:18,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 18:56:18,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:18,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:56:19,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:56:21,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 18:56:21,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:21,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:56:21,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:56:23,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. 
Duration: 0.77 2023-10-02 18:56:23,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:56:23,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:56:25,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:56:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:26,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 18:56:27,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:56:33,343 INFO [train.py:1046] (1/4) Epoch 28, batch 4200, loss[loss=0.1641, simple_loss=0.2126, pruned_loss=0.05777, over 19424.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2442, pruned_loss=0.04393, over 4704972.62 frames. ], batch size: 388, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:56:33,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:56:35,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 18:56:36,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:56:38,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 18:56:39,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:56:39,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:56:40,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:56:43,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 18:56:46,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 18:56:46,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:48,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:56:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:56:54,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:56:56,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:56:57,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:57,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. 
Duration: 0.81 2023-10-02 18:56:58,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:57:00,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:57:00,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=984246.6666666666, ans=0.0 2023-10-02 18:57:01,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:57:01,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:57:02,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:57:03,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=984313.3333333334, ans=0.1 2023-10-02 18:57:04,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 18:57:04,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:57:05,037 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.55 vs. 
limit=22.5 2023-10-02 18:57:05,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=984313.3333333334, ans=0.1 2023-10-02 18:57:07,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:57:09,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:57:11,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:57:13,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:57:15,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:57:15,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 18:57:15,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:57:16,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:57:17,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=984380.0, ans=0.0 2023-10-02 18:57:18,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.55 vs. 
limit=12.0 2023-10-02 18:57:20,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=984380.0, ans=0.09899494936611666 2023-10-02 18:57:22,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:57:23,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:57:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:57:30,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=984380.0, ans=0.125 2023-10-02 18:57:32,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 18:57:35,472 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.827e+02 2.035e+02 2.482e+02 4.070e+02, threshold=4.070e+02, percent-clipped=0.0 2023-10-02 18:57:35,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:57:35,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=984446.6666666666, ans=0.125 2023-10-02 18:57:38,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:57:40,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:57:41,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 18:57:47,013 INFO [train.py:1046] (1/4) Epoch 28, batch 4250, loss[loss=0.1539, simple_loss=0.2388, pruned_loss=0.03455, over 24486.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2427, pruned_loss=0.04377, over 4698946.98 frames. ], batch size: 63, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:57:47,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:57:51,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:57:51,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:57:55,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:57:56,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.12 vs. limit=22.5 2023-10-02 18:57:59,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:57:59,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 18:58:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:58:02,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 18:58:03,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.94 vs. limit=22.5 2023-10-02 18:58:06,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:58:09,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:09,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:12,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:58:12,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:58:12,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:16,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:58:16,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 18:58:17,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=984646.6666666666, ans=0.125 2023-10-02 18:58:20,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 18:58:23,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 18:58:23,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:25,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:58:25,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:26,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:58:26,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:27,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:30,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=984713.3333333334, ans=0.125 2023-10-02 18:58:31,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:58:33,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 18:58:37,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:58:40,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:40,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 18:58:40,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:58:41,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 18:58:43,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:58:44,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:58:46,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:46,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:58:47,091 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.70 vs. limit=6.0 2023-10-02 18:58:48,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 18:58:50,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 18:58:50,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:58:55,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:57,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:57,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:58:57,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=984780.0, ans=0.125 2023-10-02 18:58:57,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=984780.0, ans=0.125 2023-10-02 18:58:58,028 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.16 vs. limit=22.5 2023-10-02 18:58:58,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:59:00,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:59:01,437 INFO [train.py:1046] (1/4) Epoch 28, batch 4300, loss[loss=0.1652, simple_loss=0.2384, pruned_loss=0.04603, over 23620.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2423, pruned_loss=0.04348, over 4707348.47 frames. ], batch size: 256, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:59:01,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 18:59:02,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:59:02,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 18:59:04,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:59:08,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:59:10,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:59:13,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:59:20,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:59:20,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 18:59:22,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:59:23,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:59:23,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:59:23,908 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-10-02 18:59:26,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=984913.3333333334, ans=0.1 2023-10-02 18:59:28,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:59:29,141 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.38 vs. limit=15.0 2023-10-02 18:59:29,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:59:32,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 18:59:33,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:59:33,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 18:59:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 18:59:36,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:59:39,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:59:39,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:59:41,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 18:59:41,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=984980.0, ans=0.0 2023-10-02 18:59:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:59:45,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:59:45,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 18:59:46,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 18:59:48,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:59:51,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:59:51,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:59:51,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:59:51,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:59:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 18:59:51,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. 
Duration: 0.95 2023-10-02 18:59:52,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 18:59:52,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:59:52,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 18:59:54,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 18:59:58,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:59:59,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=985113.3333333334, ans=0.125 2023-10-02 19:00:00,812 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 19:00:00,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:00:02,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:02,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:00:03,578 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.971e+02 2.225e+02 2.582e+02 4.307e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-02 19:00:05,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-10-02 19:00:05,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:00:05,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:06,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:00:06,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:00:07,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:00:09,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:00:10,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:13,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:13,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:00:15,289 INFO [train.py:1046] (1/4) Epoch 28, batch 4350, loss[loss=0.1732, simple_loss=0.2448, pruned_loss=0.05085, over 23526.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2432, pruned_loss=0.04361, over 4711764.38 frames. ], batch size: 285, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:00:18,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. 
Duration: 0.97 2023-10-02 19:00:18,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:00:24,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:00:26,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:29,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:00:29,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:00:32,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:00:36,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:39,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:00:39,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:00:43,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:00:45,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:00:46,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 19:00:52,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 19:00:53,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:00:53,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:58,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:02,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 19:01:02,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=985380.0, ans=0.2 2023-10-02 19:01:05,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:06,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:01:10,773 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 19:01:13,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:13,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 19:01:13,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=985446.6666666666, ans=0.2 2023-10-02 19:01:15,316 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 19:01:15,386 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 19:01:15,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:01:15,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:16,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:01:16,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:18,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:01:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:01:20,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 19:01:20,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:20,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:01:20,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 19:01:24,820 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 19:01:24,825 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 19:01:24,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 19:01:28,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:01:28,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:01:28,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:01:29,575 INFO [train.py:1046] (1/4) Epoch 28, batch 4400, loss[loss=0.1777, simple_loss=0.2683, pruned_loss=0.04359, over 24377.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2438, pruned_loss=0.04374, over 4716635.29 frames. ], batch size: 74, lr: 3.63e-03, grad_scale: 32.0 2023-10-02 19:01:29,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:01:31,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 19:01:33,691 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. 
Duration: 0.768 2023-10-02 19:01:33,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:37,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:01:37,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:38,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:39,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=985513.3333333334, ans=0.125 2023-10-02 19:01:40,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 19:01:41,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 19:01:41,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 19:01:41,897 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 19:01:43,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.74 vs. limit=15.0 2023-10-02 19:01:44,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:01:44,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 19:01:46,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 19:01:48,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:48,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=985580.0, ans=0.125 2023-10-02 19:01:49,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:49,401 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 19:01:52,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:01:52,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 19:01:52,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 19:01:53,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.82 vs. limit=22.5 2023-10-02 19:01:54,330 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.32 vs. limit=10.0 2023-10-02 19:01:56,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 19:01:56,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. 
Duration: 0.99 2023-10-02 19:01:56,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 19:01:56,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:58,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:58,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:59,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:01:59,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=985646.6666666666, ans=0.1 2023-10-02 19:02:01,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 19:02:02,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 19:02:03,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:02:04,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:02:04,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:02:06,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:02:06,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:02:06,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 19:02:07,632 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 19:02:11,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:15,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:02:19,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 19:02:22,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:02:24,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:02:28,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:02:29,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 19:02:29,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:02:29,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 19:02:29,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:02:29,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:02:32,186 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.814e+02 2.040e+02 2.293e+02 3.611e+02, threshold=4.080e+02, percent-clipped=0.0 2023-10-02 19:02:33,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 19:02:35,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 19:02:37,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 19:02:37,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:02:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 19:02:38,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:02:41,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:02:42,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 19:02:43,983 INFO [train.py:1046] (1/4) Epoch 28, batch 4450, loss[loss=0.1661, simple_loss=0.2449, pruned_loss=0.04367, over 23243.00 frames. 
], tot_loss[loss=0.1671, simple_loss=0.2452, pruned_loss=0.04445, over 4702106.44 frames. ], batch size: 93, lr: 3.63e-03, grad_scale: 32.0 2023-10-02 19:02:46,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:02:48,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:48,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:02:54,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:02:56,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:03:00,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:01,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:03:04,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:03:04,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:03:06,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 19:03:06,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 19:03:06,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:06,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:03:06,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:03:09,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:03:10,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.84 vs. limit=15.0 2023-10-02 19:03:15,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:15,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:17,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:03:17,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:03:18,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:03:24,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-02 19:03:24,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 19:03:25,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 19:03:25,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:03:26,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:03:28,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 19:03:31,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:03:34,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:36,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 19:03:36,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:36,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:03:36,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:03:36,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:03:38,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:42,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:03:43,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 19:03:45,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:03:48,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:03:49,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:03:51,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:51,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 19:03:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:03:56,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 19:03:57,701 INFO [train.py:1046] (1/4) Epoch 28, batch 4500, loss[loss=0.157, simple_loss=0.2237, pruned_loss=0.04519, over 23533.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2453, pruned_loss=0.04467, over 4702273.10 frames. 
], batch size: 256, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:03:57,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:03:58,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=986180.0, ans=0.125 2023-10-02 19:03:58,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=986180.0, ans=15.0 2023-10-02 19:04:00,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:04:00,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=986180.0, ans=0.125 2023-10-02 19:04:01,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 19:04:01,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 19:04:01,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=986180.0, ans=0.125 2023-10-02 19:04:04,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:04:09,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:04:09,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 19:04:09,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:04:11,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:04:11,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:13,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:16,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=986246.6666666666, ans=0.125 2023-10-02 19:04:23,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:04:24,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:04:24,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=986246.6666666666, ans=0.1 2023-10-02 19:04:27,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:04:27,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:04:28,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 19:04:34,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:04:38,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:04:43,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:04:46,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:04:46,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 19:04:47,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:04:49,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:04:50,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:04:50,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:04:52,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:52,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 19:04:52,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 19:04:52,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:04:56,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:04:56,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:04:57,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:00,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:05:00,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:05:02,251 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.810e+02 2.088e+02 2.457e+02 3.731e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-02 19:05:03,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 19:05:05,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 19:05:05,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 19:05:06,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 19:05:10,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. 
Duration: 0.87 2023-10-02 19:05:10,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:05:13,563 INFO [train.py:1046] (1/4) Epoch 28, batch 4550, loss[loss=0.1786, simple_loss=0.2414, pruned_loss=0.05787, over 23712.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2445, pruned_loss=0.04477, over 4701585.96 frames. ], batch size: 164, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:05:15,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:05:15,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:05:19,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:05:22,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:05:22,813 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.61 vs. limit=15.0 2023-10-02 19:05:25,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:05:25,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:05:25,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:05:25,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:05:29,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:05:29,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:05:32,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:05:35,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 19:05:37,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 19:05:38,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:05:38,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 19:05:43,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 19:05:45,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:05:47,223 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.68 vs. limit=15.0 2023-10-02 19:05:50,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 19:05:51,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 19:05:52,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=986646.6666666666, ans=0.125 2023-10-02 19:05:54,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:54,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:56,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:05:57,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 19:06:00,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:06:03,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:03,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:06:04,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:06:06,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 19:06:07,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 19:06:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 19:06:09,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 19:06:11,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 19:06:11,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:06:12,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:12,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:06:13,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:13,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:06:17,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:06:18,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 19:06:18,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:06:18,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-02 19:06:18,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=986780.0, ans=0.0 2023-10-02 19:06:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 19:06:20,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:06:20,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 19:06:23,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:06:23,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:06:25,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:06:26,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:26,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:06:26,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=986780.0, ans=0.0 2023-10-02 19:06:28,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:06:30,047 INFO [train.py:1046] (1/4) Epoch 28, batch 4600, loss[loss=0.1674, simple_loss=0.2496, pruned_loss=0.0426, over 23552.00 frames. 
], tot_loss[loss=0.1658, simple_loss=0.243, pruned_loss=0.04435, over 4696870.58 frames. ], batch size: 120, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:06:30,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:06:31,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:31,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:06:31,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=986846.6666666666, ans=0.2 2023-10-02 19:06:35,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:06:35,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:06:38,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:06:38,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=986846.6666666666, ans=0.125 2023-10-02 19:06:39,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 19:06:40,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:06:43,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 19:06:45,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:06:48,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:54,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 19:06:55,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:57,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:00,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:07:00,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:07:05,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 19:07:05,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:07:06,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.12 vs. limit=22.5 2023-10-02 19:07:08,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:07:10,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 19:07:12,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:07:12,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:07:14,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 19:07:16,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:07:21,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:22,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:07:25,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:25,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 19:07:25,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:27,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 19:07:27,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:27,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:07:29,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:29,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:07:31,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:32,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 19:07:32,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 19:07:33,820 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.870e+02 2.077e+02 2.555e+02 3.884e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 19:07:33,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 19:07:33,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:34,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:07:35,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:35,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:07:38,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=987113.3333333334, ans=0.125 2023-10-02 19:07:44,200 INFO [train.py:1046] (1/4) Epoch 28, batch 4650, loss[loss=0.1545, simple_loss=0.2436, pruned_loss=0.03271, over 24481.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2434, pruned_loss=0.04448, over 4698161.76 frames. ], batch size: 66, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:07:46,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:07:48,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:07:48,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=987180.0, ans=0.125 2023-10-02 19:07:49,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:49,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:07:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:49,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:07:53,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:56,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-10-02 19:07:59,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:07:59,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 19:07:59,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:08:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 19:08:00,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:08:02,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 19:08:02,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 19:08:02,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:03,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:08:06,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:08:08,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:08,144 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-10-02 19:08:10,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:10,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 19:08:13,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:13,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:08:16,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 19:08:16,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:08:19,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:08:24,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:08:29,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:31,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:31,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:31,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 19:08:35,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 19:08:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 19:08:35,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 19:08:35,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 19:08:38,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:08:44,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:08:44,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:08:44,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 19:08:45,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:08:47,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:08:47,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:08:47,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 19:08:47,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=987446.6666666666, ans=0.0 2023-10-02 19:08:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:08:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:08:50,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:50,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=987446.6666666666, ans=0.0 2023-10-02 19:08:53,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:08:54,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:08:54,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:08:54,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 19:08:57,390 INFO [train.py:1046] (1/4) Epoch 28, batch 4700, loss[loss=0.1684, simple_loss=0.2617, pruned_loss=0.03751, over 24326.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2443, pruned_loss=0.04452, over 4711137.08 frames. ], batch size: 74, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:08:57,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 19:08:58,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 19:09:00,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=987513.3333333334, ans=0.0 2023-10-02 19:09:06,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:09:08,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:09:08,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:09:10,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:09:14,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 19:09:14,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 19:09:17,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:19,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:09:20,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:09:23,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:30,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:09:32,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 19:09:33,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:09:38,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 19:09:40,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:09:42,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:09:47,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 19:09:47,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:09:53,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:09:53,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 19:09:55,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:09:55,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:09:56,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=987780.0, ans=0.125 2023-10-02 19:09:57,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:57,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:09:57,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 19:09:59,311 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 19:09:59,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:10:00,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:00,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:02,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.850e+02 1.974e+02 2.169e+02 3.460e+02, threshold=3.948e+02, percent-clipped=0.0 2023-10-02 19:10:02,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 19:10:02,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:10:04,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 19:10:09,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:10:09,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:10,753 INFO [train.py:1046] (1/4) Epoch 28, batch 4750, loss[loss=0.1949, simple_loss=0.2746, pruned_loss=0.05758, over 23722.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.245, pruned_loss=0.04467, over 4720199.77 frames. ], batch size: 85, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:10:13,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:13,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:10:15,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 19:10:15,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:10:17,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 19:10:21,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:10:21,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 19:10:21,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:10:26,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 19:10:32,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:10:34,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 19:10:34,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:10:39,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:10:39,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:10:39,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:39,364 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 19:10:40,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 19:10:46,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 19:10:49,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:10:50,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:10:52,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=987980.0, ans=0.2 2023-10-02 19:10:53,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:10:53,687 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 19:10:53,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:10:56,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:10:59,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:11:00,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 19:11:00,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=988046.6666666666, ans=0.1 2023-10-02 19:11:01,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 19:11:01,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:11:01,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 19:11:01,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:03,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 19:11:03,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 19:11:04,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 19:11:09,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:14,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:11:14,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 19:11:14,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:11:15,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:17,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:11:18,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:18,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 19:11:21,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:11:22,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 19:11:23,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 19:11:24,794 INFO [train.py:1046] (1/4) Epoch 28, batch 4800, loss[loss=0.1572, simple_loss=0.2448, pruned_loss=0.0348, over 24629.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2456, pruned_loss=0.04479, over 4721757.09 frames. ], batch size: 68, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:11:24,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 19:11:26,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:11:27,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:11:27,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 19:11:30,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:32,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:39,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 19:11:40,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:41,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:41,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 19:11:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:11:42,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:11:43,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:11:47,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:11:48,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:49,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:11:51,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:51,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 19:11:51,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:11:51,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:56,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:58,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:12:01,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:12:01,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:12:02,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 19:12:02,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:04,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 19:12:04,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 19:12:04,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:05,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:12:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 19:12:05,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:12:05,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:12:08,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:12:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:12:13,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:12:14,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:15,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:19,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 19:12:21,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:12:21,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:21,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:12:22,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:12:27,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:12:27,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:12:27,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:28,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:12:28,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:12:28,923 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:12:30,076 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.923e+02 2.078e+02 2.346e+02 3.782e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 19:12:30,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:12:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:34,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:12:35,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. 
Duration: 0.94 2023-10-02 19:12:38,350 INFO [train.py:1046] (1/4) Epoch 28, batch 4850, loss[loss=0.1598, simple_loss=0.2381, pruned_loss=0.04079, over 23379.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2455, pruned_loss=0.04481, over 4729477.16 frames. ], batch size: 119, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:12:38,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 19:12:38,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:12:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:12:40,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:12:40,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:41,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.06 vs. limit=15.0 2023-10-02 19:12:43,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:49,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 19:12:52,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:56,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 19:12:57,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:12:57,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:13:00,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:13:01,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:13:02,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:13:03,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=988580.0, ans=0.125 2023-10-02 19:13:04,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 19:13:06,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=6.0 2023-10-02 19:13:06,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:13:09,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:13:09,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:13:11,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 19:13:11,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 19:13:11,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=988646.6666666666, ans=0.0 2023-10-02 19:13:13,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=988646.6666666666, ans=0.1 2023-10-02 19:13:14,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:13:14,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:19,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:19,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 19:13:19,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 19:13:21,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:13:30,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:13:30,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 19:13:31,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:13:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:13:32,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:13:34,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 19:13:34,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:37,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 19:13:37,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:13:37,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:13:38,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 19:13:38,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=988780.0, ans=0.0 2023-10-02 19:13:46,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=988780.0, ans=0.0 2023-10-02 19:13:47,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:52,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 19:13:52,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:13:54,089 INFO [train.py:1046] (1/4) Epoch 28, batch 4900, loss[loss=0.178, simple_loss=0.2451, pruned_loss=0.05547, over 23741.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2448, pruned_loss=0.04455, over 4733237.70 frames. ], batch size: 164, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:13:56,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 19:13:56,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:14:02,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:03,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:14:04,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:14:06,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=988846.6666666666, ans=0.0 2023-10-02 19:14:07,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 19:14:07,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=988913.3333333334, ans=0.05 2023-10-02 19:14:12,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-02 19:14:15,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 19:14:17,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 19:14:17,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:14:17,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:14:17,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:14:17,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:14:19,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:14:19,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 19:14:22,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=988980.0, ans=0.125 2023-10-02 19:14:23,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 19:14:25,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:14:25,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 19:14:26,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:14:26,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:14:28,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:28,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:14:28,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 19:14:29,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:14:31,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:14:31,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 19:14:31,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 19:14:32,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=988980.0, ans=0.0 2023-10-02 19:14:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 19:14:37,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 19:14:38,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:14:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:14:39,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:39,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 19:14:39,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:14:39,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 19:14:44,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:14:45,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:14:47,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:14:51,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 19:14:52,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:14:52,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. 
Duration: 0.488875 2023-10-02 19:14:53,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 19:14:58,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=989113.3333333334, ans=0.1 2023-10-02 19:14:59,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:15:01,254 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.868e+02 2.016e+02 2.240e+02 3.516e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 19:15:01,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:15:01,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=989113.3333333334, ans=0.0 2023-10-02 19:15:04,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 19:15:04,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:15:04,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:15:05,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:08,285 INFO [train.py:1046] (1/4) Epoch 28, batch 4950, loss[loss=0.15, simple_loss=0.2261, pruned_loss=0.03698, over 24471.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2429, pruned_loss=0.04425, over 4718687.49 frames. 
], batch size: 58, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:15:09,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:15:09,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:15:09,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:15:09,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 19:15:10,421 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.61 vs. limit=12.0 2023-10-02 19:15:12,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:15:15,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:15:15,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:15:17,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 19:15:19,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 19:15:19,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 19:15:19,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=989180.0, ans=0.1 2023-10-02 19:15:20,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 19:15:20,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:20,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:15:22,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:15:22,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:23,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:25,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:15:26,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:15:26,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:15:27,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=989246.6666666666, ans=0.125 2023-10-02 19:15:29,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:29,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:15:31,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:15:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:36,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:15:38,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:38,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:39,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:15:39,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=989313.3333333334, ans=0.125 2023-10-02 19:15:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. 
Duration: 0.88 2023-10-02 19:15:42,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 19:15:45,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:47,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:15:47,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:15:49,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:15:50,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:15:50,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:15:50,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=989313.3333333334, ans=0.125 2023-10-02 19:15:53,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:55,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:15:57,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:15:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:15:58,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:59,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.08 vs. limit=8.0 2023-10-02 19:15:59,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 19:15:59,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:16:01,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:16:07,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:16:08,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:16:08,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:16:08,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:16:08,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:16:09,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:16:11,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:16:12,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:16:12,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:16:12,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 19:16:17,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:21,701 INFO [train.py:1046] (1/4) Epoch 28, batch 5000, loss[loss=0.1571, simple_loss=0.2332, pruned_loss=0.04052, over 23734.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2422, pruned_loss=0.04384, over 4711749.75 frames. ], batch size: 232, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:16:21,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 19:16:21,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 19:16:22,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=989513.3333333334, ans=0.0 2023-10-02 19:16:26,849 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.07 vs. limit=22.5 2023-10-02 19:16:27,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:16:28,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 19:16:30,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 19:16:30,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 19:16:33,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:16:34,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 19:16:34,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:16:34,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:16:36,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 19:16:36,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:16:37,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:16:38,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 19:16:38,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:38,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:16:41,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 19:16:41,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 19:16:41,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:16:43,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 19:16:43,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:16:43,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:16:43,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 19:16:44,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 19:16:45,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 19:16:45,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:16:47,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:16:47,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 19:16:49,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:16:51,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:51,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:51,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=989646.6666666666, ans=0.2 2023-10-02 19:16:53,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 19:16:54,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 19:16:55,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:16:57,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:17:01,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 19:17:05,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:17:07,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:17:07,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:10,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 19:17:10,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:17:10,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:17:10,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=989713.3333333334, ans=0.125 2023-10-02 19:17:11,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:17:11,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=989713.3333333334, ans=0.125 2023-10-02 19:17:12,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 19:17:12,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:17:15,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:17:16,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:17:19,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=989780.0, ans=15.0 2023-10-02 19:17:23,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 19:17:28,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.761e+02 1.932e+02 2.126e+02 3.192e+02, threshold=3.865e+02, percent-clipped=0.0 2023-10-02 19:17:28,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:35,324 INFO [train.py:1046] (1/4) Epoch 28, batch 5050, loss[loss=0.1687, simple_loss=0.2416, pruned_loss=0.04797, over 23363.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2422, pruned_loss=0.04349, over 4726575.76 frames. ], batch size: 119, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:17:36,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:17:36,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:36,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:17:36,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:17:38,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:17:38,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 19:17:38,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:41,431 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:17:43,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:43,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 19:17:43,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:17:46,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:17:47,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:17:47,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 19:17:49,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:17:49,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:17:52,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:17:54,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 19:17:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:17:56,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=989913.3333333334, ans=0.125 2023-10-02 19:18:02,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=989913.3333333334, ans=0.05 2023-10-02 19:18:03,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 19:18:03,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:18:05,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:18:05,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 19:18:05,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:18:06,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:08,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:18:08,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:18:08,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. 
Duration: 0.94 2023-10-02 19:18:09,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 19:18:10,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:11,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=989980.0, ans=0.1 2023-10-02 19:18:12,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:15,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:16,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 19:18:17,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:18:21,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 19:18:21,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=990046.6666666666, ans=0.0 2023-10-02 19:18:22,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:18:22,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:18:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:18:24,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:18:26,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:18:28,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:18:30,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:30,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:18:30,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:18:30,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 19:18:32,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:18:33,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:18:38,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:18:38,392 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 19:18:38,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 19:18:39,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:18:39,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:41,025 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 19:18:43,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:43,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 19:18:43,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:46,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:18:46,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:48,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 19:18:49,391 INFO [train.py:1046] (1/4) Epoch 28, batch 5100, loss[loss=0.1672, simple_loss=0.2609, pruned_loss=0.03676, over 24567.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2434, pruned_loss=0.04412, over 4718293.57 frames. ], batch size: 71, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:18:49,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. 
Duration: 0.54 2023-10-02 19:18:50,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:18:52,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:18:53,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:18:54,415 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 19:18:55,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:57,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 19:18:58,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 19:18:59,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:19:01,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:19:03,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:19:04,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 19:19:04,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. 
Duration: 0.76 2023-10-02 19:19:04,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=990246.6666666666, ans=0.125 2023-10-02 19:19:09,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:19:09,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:19:13,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:19:15,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 19:19:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:19:17,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:19:17,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 19:19:21,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:22,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 19:19:24,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. 
Duration: 0.896 2023-10-02 19:19:25,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:25,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 19:19:26,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 19:19:29,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:19:35,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:19:38,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 19:19:38,697 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 19:19:38,705 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 19:19:40,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 19:19:40,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:44,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 19:19:47,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-10-02 19:19:50,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 19:19:52,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:19:54,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 19:19:55,503 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.787e+02 2.059e+02 2.354e+02 3.734e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 19:19:57,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:19:58,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 19:19:59,359 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.75 vs. limit=15.0 2023-10-02 19:19:59,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.71 vs. limit=12.0 2023-10-02 19:20:02,928 INFO [train.py:1046] (1/4) Epoch 28, batch 5150, loss[loss=0.1565, simple_loss=0.2338, pruned_loss=0.0396, over 19347.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2441, pruned_loss=0.04395, over 4727941.04 frames. ], batch size: 41, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:20:04,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:20:04,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 19:20:04,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:20:04,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:20:06,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:20:06,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:20:06,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 19:20:06,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 19:20:07,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 19:20:07,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:20:07,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 19:20:09,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:09,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 19:20:10,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:20:11,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:14,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=990513.3333333334, ans=0.0 2023-10-02 19:20:16,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:20:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 19:20:18,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:18,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:20:18,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:20:18,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:20:18,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:20:20,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:20:20,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:20:20,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. 
Duration: 0.84 2023-10-02 19:20:22,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:20:23,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:20:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:20:28,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 19:20:29,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:20:29,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=990580.0, ans=0.1 2023-10-02 19:20:34,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:20:37,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 19:20:41,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:20:42,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=990646.6666666666, ans=0.1 2023-10-02 19:20:45,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 19:20:45,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:49,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:20:50,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:20:52,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 19:20:57,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:59,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:20:59,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:21:03,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:04,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:21:05,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 19:21:06,678 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:21:09,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:21:11,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:21:14,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:21:14,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:21:15,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:21:15,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:21:15,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:21:15,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:21:16,764 INFO [train.py:1046] (1/4) Epoch 28, batch 5200, loss[loss=0.1408, simple_loss=0.2182, pruned_loss=0.03174, over 24336.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.245, pruned_loss=0.04439, over 4725311.39 frames. ], batch size: 56, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:21:19,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:21:19,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:21:22,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:21:28,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 19:21:30,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:21:31,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:34,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:34,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:21:34,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:34,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=990913.3333333334, ans=0.125 2023-10-02 19:21:37,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 19:21:40,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:21:40,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:42,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 19:21:43,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-10-02 19:21:45,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:21:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 19:21:46,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 19:21:49,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 19:21:50,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:50,671 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 19:21:50,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:50,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=990980.0, ans=0.0 2023-10-02 19:21:52,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:21:52,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:21:53,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 19:21:53,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 19:21:55,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:57,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 19:22:00,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 19:22:00,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 19:22:04,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 19:22:05,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:22:10,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:22:10,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:11,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.22 vs. limit=15.0 2023-10-02 19:22:12,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 19:22:12,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:22:12,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-02 19:22:12,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:13,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:22:16,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:22:17,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:22:20,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:22:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:22,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=991113.3333333334, ans=0.2 2023-10-02 19:22:23,452 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.958e+02 2.159e+02 2.397e+02 4.088e+02, threshold=4.317e+02, percent-clipped=0.0 2023-10-02 19:22:26,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:27,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-10-02 19:22:28,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:22:28,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:22:28,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=991113.3333333334, ans=0.0 2023-10-02 19:22:29,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:30,962 INFO [train.py:1046] (1/4) Epoch 28, batch 5250, loss[loss=0.1624, simple_loss=0.2352, pruned_loss=0.04478, over 23590.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2442, pruned_loss=0.04417, over 4726252.76 frames. ], batch size: 149, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:22:31,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:22:31,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:22:33,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:22:37,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:37,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:22:38,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 19:22:38,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=991180.0, ans=0.0 2023-10-02 19:22:44,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.26 vs. limit=15.0 2023-10-02 19:22:45,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:47,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:22:50,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:22:51,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:22:54,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 19:22:54,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:54,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:23:22,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=991380.0, ans=0.125 2023-10-02 19:23:26,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=991446.6666666666, ans=0.1 2023-10-02 19:23:26,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=991446.6666666666, ans=0.125 2023-10-02 19:23:39,953 INFO [train.py:1046] (1/4) Epoch 28, batch 5300, loss[loss=0.1578, simple_loss=0.2281, pruned_loss=0.04376, over 23781.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2431, pruned_loss=0.044, over 4714946.23 frames. ], batch size: 212, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:23:47,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.47 vs. limit=12.0 2023-10-02 19:23:55,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:23:55,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 19:23:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 19:23:55,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:55,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:55,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 19:23:55,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:55,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:55,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:23:56,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:56,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:23:56,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:23:56,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 19:23:56,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 19:23:56,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 19:23:56,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:23:56,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 19:23:56,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. 
Duration: 0.51 2023-10-02 19:23:56,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:57,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:57,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:57,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:23:57,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:23:57,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:23:57,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:58,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:58,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:23:58,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:23:58,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:23:58,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 19:23:58,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:23:59,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:59,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 19:23:59,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 19:23:59,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:23:59,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:23:59,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 19:23:59,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 19:23:59,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:23:59,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 19:23:59,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:23:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 19:24:00,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 19:24:00,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:24:00,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:24:00,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 19:24:00,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 19:24:00,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 19:24:01,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:24:07,176 INFO [train.py:1046] (1/4) Epoch 29, batch 0, loss[loss=0.1598, simple_loss=0.2445, pruned_loss=0.03754, over 24680.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2445, pruned_loss=0.03754, over 24680.00 frames. 
], batch size: 65, lr: 3.56e-03, grad_scale: 32.0 2023-10-02 19:24:07,176 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 19:24:17,937 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.9538, 1.9157, 4.7621, 4.3935], device='cuda:1') 2023-10-02 19:24:19,057 INFO [train.py:1078] (1/4) Epoch 29, validation: loss=0.3081, simple_loss=0.2785, pruned_loss=0.1688, over 1125622.00 frames. 2023-10-02 19:24:19,058 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 19:24:20,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 19:24:20,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:24:22,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:24:27,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:27,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:24:27,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:29,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 19:24:30,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-10-02 19:24:33,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:33,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:38,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:38,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:39,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:24:39,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:24:40,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 19:24:43,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:24:50,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:24:50,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:54,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 19:24:57,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 19:24:57,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:24:59,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:24:59,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=991733.3333333334, ans=0.2 2023-10-02 19:25:03,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:25:05,700 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.855e+02 2.126e+02 2.436e+02 5.590e+02, threshold=4.252e+02, percent-clipped=2.0 2023-10-02 19:25:07,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:13,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 19:25:17,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 19:25:18,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:25:18,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:18,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 19:25:19,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:25:21,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 19:25:22,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:24,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:26,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:25:29,436 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 19:25:30,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:25:32,151 INFO [train.py:1046] (1/4) Epoch 29, batch 50, loss[loss=0.1811, simple_loss=0.2678, pruned_loss=0.0472, over 24436.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2458, pruned_loss=0.04379, over 1075648.81 frames. ], batch size: 77, lr: 3.56e-03, grad_scale: 32.0 2023-10-02 19:25:33,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:25:34,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:25:35,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-10-02 19:25:36,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:25:36,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:25:37,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:25:38,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=991933.3333333334, ans=0.125 2023-10-02 19:25:41,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:25:43,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:25:44,520 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.96 vs. limit=15.0 2023-10-02 19:25:46,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 19:25:46,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:53,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:25:54,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. 
Duration: 0.83 2023-10-02 19:25:56,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 19:25:58,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:25:59,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=992000.0, ans=0.2 2023-10-02 19:26:00,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:26:01,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:26:03,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:26:03,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:26:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:26:09,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=992066.6666666666, ans=0.0 2023-10-02 19:26:10,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:26:13,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 19:26:13,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:26:14,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 19:26:15,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:26:17,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:26:17,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 19:26:17,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:26:18,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 19:26:26,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:26:26,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:26:27,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=992133.3333333334, ans=0.125 2023-10-02 19:26:28,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:28,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 19:26:28,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:26:32,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 19:26:32,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 19:26:33,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:34,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:26:34,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=992200.0, ans=0.125 2023-10-02 19:26:35,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:26:35,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:26:36,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 19:26:37,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 19:26:38,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 19:26:38,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:26:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:26:39,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 19:26:39,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 19:26:39,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=992200.0, ans=0.125 2023-10-02 19:26:41,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:26:42,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:45,434 INFO [train.py:1046] (1/4) Epoch 29, batch 100, loss[loss=0.1753, simple_loss=0.2585, pruned_loss=0.04609, over 24304.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2458, pruned_loss=0.04378, over 1894276.38 frames. ], batch size: 77, lr: 3.56e-03, grad_scale: 16.0 2023-10-02 19:26:45,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:26:45,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:26:46,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:26:47,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=992266.6666666666, ans=0.125 2023-10-02 19:26:47,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.95 vs. limit=22.5 2023-10-02 19:26:49,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:26:51,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:26:51,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.89 vs. limit=22.5 2023-10-02 19:26:52,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 19:26:52,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:56,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.93 vs. limit=15.0 2023-10-02 19:26:56,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:26:56,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:26:58,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 19:26:58,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:58,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:27:00,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 19:27:02,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:27:02,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:02,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:02,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:27:06,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 19:27:06,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:09,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:11,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 19:27:11,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=992333.3333333334, ans=0.1 2023-10-02 19:27:13,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:27:15,832 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 19:27:15,846 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 19:27:17,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:17,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:27:21,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:27:24,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:24,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=992400.0, ans=0.125 2023-10-02 19:27:25,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:30,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:30,482 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-10-02 19:27:31,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 19:27:34,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.846e+02 1.979e+02 2.261e+02 3.658e+02, threshold=3.958e+02, percent-clipped=0.0 2023-10-02 19:27:35,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:27:37,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:27:38,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:40,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:43,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:27:45,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:27:48,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:48,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:49,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:27:49,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:27:49,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:49,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 19:27:49,598 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 19:27:50,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:50,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:27:52,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:27:52,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:52,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 19:27:52,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:27:53,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:27:53,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:27:53,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:53,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:55,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:27:56,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:27:57,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:59,696 INFO [train.py:1046] (1/4) Epoch 29, batch 150, loss[loss=0.1779, simple_loss=0.2421, pruned_loss=0.05688, over 23666.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2476, pruned_loss=0.04518, over 2511525.02 frames. ], batch size: 232, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:27:59,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:27:59,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:04,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:28:04,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:28:07,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:28:08,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:11,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 19:28:11,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 19:28:11,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 19:28:14,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:28:14,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:28:14,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:28:17,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:28:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:28:17,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:18,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:28:20,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 19:28:21,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:28:23,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=992666.6666666666, ans=0.125 2023-10-02 19:28:26,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:30,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:28:30,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 19:28:33,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:28:33,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:33,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:28:36,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:28:37,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:28:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 19:28:41,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:41,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 19:28:48,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:48,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=992800.0, ans=0.0 2023-10-02 19:28:50,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:28:50,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:28:50,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:28:52,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:55,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 19:28:58,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:28:59,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:29:01,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:29:02,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:29:03,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 19:29:04,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:29:04,402 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 19:29:07,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:29:11,748 INFO [train.py:1046] (1/4) Epoch 29, batch 200, loss[loss=0.1826, simple_loss=0.2466, pruned_loss=0.05937, over 23825.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2483, pruned_loss=0.04565, over 2999632.31 frames. ], batch size: 179, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:29:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:29:11,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:29:16,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 19:29:17,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:17,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:29:19,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=992933.3333333334, ans=0.125 2023-10-02 19:29:20,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=992933.3333333334, ans=0.2 2023-10-02 19:29:21,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 19:29:22,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:29:24,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:24,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:29:29,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:29:29,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:29:29,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:30,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.18 vs. limit=8.0 2023-10-02 19:29:45,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:29:45,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:29:46,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:29:47,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=993066.6666666666, ans=0.0 2023-10-02 19:29:48,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:29:49,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 19:29:49,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:29:52,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:29:52,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:29:53,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:53,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:29:55,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 19:29:55,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 19:29:55,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:59,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:29:59,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=993133.3333333334, ans=0.04949747468305833 2023-10-02 19:30:00,533 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.804e+02 1.986e+02 2.199e+02 3.066e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 19:30:03,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:30:10,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:11,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:30:20,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:23,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 19:30:23,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:30:23,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 19:30:23,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:30:24,483 INFO [train.py:1046] (1/4) Epoch 29, batch 250, loss[loss=0.153, simple_loss=0.2355, pruned_loss=0.03526, over 24330.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2461, pruned_loss=0.04452, over 3384131.02 frames. ], batch size: 61, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:30:24,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:30:24,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 19:30:24,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=993266.6666666666, ans=0.0 2023-10-02 19:30:25,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:30:26,001 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 19:30:27,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:28,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:30:30,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:30,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:30:31,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=993266.6666666666, ans=0.125 2023-10-02 19:30:33,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:30:34,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:36,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:30:40,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:30:48,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=993333.3333333334, ans=0.0 2023-10-02 19:30:49,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.34 vs. limit=15.0 2023-10-02 19:30:53,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:30:54,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:30:55,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 19:30:56,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=993400.0, ans=0.0 2023-10-02 19:31:00,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:31:01,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:31:01,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:31:01,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:31:02,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:31:04,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:31:04,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:31:05,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:31:08,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 19:31:08,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:31:11,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 19:31:11,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:31:11,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:31:14,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:31:14,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:31:14,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:31:16,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:19,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:31:19,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:24,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:31:26,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:29,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:31:34,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:31:36,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:31:37,985 INFO [train.py:1046] (1/4) Epoch 29, batch 300, loss[loss=0.1544, simple_loss=0.215, pruned_loss=0.04694, over 23476.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2442, pruned_loss=0.04413, over 3665813.15 frames. ], batch size: 285, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:31:38,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 19:31:39,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:31:39,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:31:40,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 19:31:40,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:31:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:31:44,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 19:31:48,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:48,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:31:51,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:31:53,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 19:31:53,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:56,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:31:56,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 19:31:56,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:31:56,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=993666.6666666666, ans=0.05 2023-10-02 19:31:57,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=993666.6666666666, ans=0.125 2023-10-02 19:31:59,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:32:02,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.91 vs. limit=15.0 2023-10-02 19:32:03,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 19:32:04,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 19:32:07,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 19:32:07,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:10,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:32:10,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:10,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 19:32:10,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:32:13,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:32:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:32:15,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:32:20,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 19:32:20,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-02 19:32:21,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:32:24,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:26,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 19:32:27,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:32:28,949 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.830e+02 2.036e+02 2.251e+02 3.092e+02, threshold=4.071e+02, percent-clipped=0.0 2023-10-02 19:32:30,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:32:33,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:32:33,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 19:32:37,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:37,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:32:41,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:43,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 19:32:43,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 19:32:45,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:32:45,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:32:45,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 19:32:46,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:46,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:32:48,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:32:49,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:32:49,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:32:52,472 INFO [train.py:1046] (1/4) Epoch 29, batch 350, loss[loss=0.1726, simple_loss=0.2374, pruned_loss=0.05389, over 22847.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2425, pruned_loss=0.04379, over 3885868.89 frames. ], batch size: 322, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:32:54,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 19:32:54,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 19:32:57,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:33:05,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:06,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:09,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 19:33:11,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:33:11,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 19:33:14,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:16,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 19:33:16,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 19:33:17,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=994000.0, ans=0.2 2023-10-02 19:33:17,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=994000.0, ans=0.1 2023-10-02 19:33:19,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 19:33:21,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:33:22,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:33:22,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:33:23,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:23,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:25,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:33:25,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:25,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 19:33:26,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=994066.6666666666, ans=0.1 2023-10-02 19:33:27,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:33:27,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:32,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:33:34,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:33:35,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:33:35,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:38,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=994133.3333333334, ans=0.125 2023-10-02 19:33:41,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 19:33:41,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:45,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:45,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:33:45,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:33:46,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 19:33:46,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=994133.3333333334, ans=0.2 2023-10-02 19:33:49,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:33:50,997 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 19:33:52,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 19:33:52,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:33:54,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 19:33:55,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:33:58,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:33:58,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:34:00,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:00,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:34:04,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:34:05,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=994266.6666666666, ans=0.0 2023-10-02 19:34:05,696 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:34:06,741 INFO [train.py:1046] (1/4) Epoch 29, batch 400, loss[loss=0.1672, simple_loss=0.2366, pruned_loss=0.04897, over 23816.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2414, pruned_loss=0.04313, over 4060110.13 frames. ], batch size: 179, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:34:08,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:34:09,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:34:10,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 19:34:10,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:12,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:34:13,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:34:16,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:18,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:20,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:20,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=994333.3333333334, ans=0.1 2023-10-02 19:34:22,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=994333.3333333334, ans=0.125 2023-10-02 19:34:23,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 19:34:24,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 19:34:24,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:26,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 19:34:27,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:32,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 19:34:32,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 19:34:32,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:34:32,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:32,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:33,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:33,865 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 19:34:35,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 19:34:39,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:40,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 19:34:41,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-10-02 19:34:44,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:34:47,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:34:51,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=994466.6666666666, ans=0.0 2023-10-02 19:34:52,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 19:34:55,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:34:57,566 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.820e+02 1.968e+02 2.221e+02 3.877e+02, threshold=3.936e+02, percent-clipped=0.0 2023-10-02 19:34:57,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 19:34:59,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:59,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:34:59,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 19:35:01,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:35:03,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 19:35:04,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:35:07,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:07,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 19:35:10,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:35:11,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 19:35:14,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:35:14,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:35:17,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 19:35:19,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:35:20,554 INFO [train.py:1046] (1/4) Epoch 29, batch 450, loss[loss=0.183, simple_loss=0.2653, pruned_loss=0.05035, over 24433.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2422, pruned_loss=0.04336, over 4204082.78 frames. ], batch size: 69, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:35:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:35:20,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:35:22,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 19:35:22,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:35:22,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=994600.0, ans=0.1 2023-10-02 19:35:23,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:35:23,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:35:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 19:35:24,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:35:24,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:35:26,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:35:33,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:33,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 19:35:35,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 19:35:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 19:35:39,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:35:42,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:43,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:35:48,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:35:48,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:35:50,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.69 vs. limit=6.0 2023-10-02 19:35:51,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 19:35:51,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 19:35:54,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 19:35:55,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:35:57,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:35:57,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:35:58,888 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 19:35:58,896 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 19:36:00,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:01,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:36:02,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 19:36:05,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:36:07,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:36:07,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 19:36:07,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 19:36:10,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 19:36:11,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:36:13,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:36:13,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 19:36:14,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=994800.0, ans=0.0 2023-10-02 19:36:16,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:36:17,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 19:36:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 19:36:19,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:36:25,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:36:28,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:36:29,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:36:29,631 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-10-02 19:36:32,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:36:33,961 INFO [train.py:1046] (1/4) Epoch 29, batch 500, loss[loss=0.1763, simple_loss=0.2621, pruned_loss=0.04528, over 24445.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.243, pruned_loss=0.04391, over 4308001.24 frames. ], batch size: 77, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:36:34,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:36:34,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:36:34,111 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 19:36:36,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 19:36:36,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:36:39,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:36:42,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:36:43,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=994933.3333333334, ans=10.0 2023-10-02 19:36:44,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 19:36:47,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:36:47,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:36:47,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:36:50,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=995000.0, ans=0.04949747468305833 2023-10-02 19:36:59,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:59,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:36:59,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:36:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 19:37:01,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:37:03,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:37:03,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 19:37:03,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:37:04,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 19:37:06,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=995066.6666666666, ans=0.1 2023-10-02 19:37:08,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 19:37:10,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:12,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:14,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:14,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:15,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:37:16,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 19:37:20,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 19:37:22,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:24,158 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.819e+02 1.997e+02 2.314e+02 3.134e+02, threshold=3.994e+02, percent-clipped=0.0 2023-10-02 19:37:25,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:28,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:33,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:37,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 19:37:37,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:37,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:40,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 19:37:41,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:37:42,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 19:37:47,415 INFO [train.py:1046] (1/4) Epoch 29, batch 550, loss[loss=0.1518, simple_loss=0.2286, pruned_loss=0.03749, over 24434.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2445, pruned_loss=0.04416, over 4406758.12 frames. ], batch size: 58, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:37:47,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 19:37:48,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 19:37:48,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:48,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 19:37:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:37:50,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:50,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:52,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:52,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:37:52,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 19:37:55,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:56,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 19:37:56,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:38:01,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:01,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:04,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=995333.3333333334, ans=0.125 2023-10-02 19:38:05,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:38:05,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:09,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 19:38:10,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 19:38:11,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=995333.3333333334, ans=0.125 2023-10-02 19:38:12,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 19:38:12,864 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=8.00 vs. limit=12.0 2023-10-02 19:38:18,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:38:18,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:38:19,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:38:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:22,760 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 19:38:24,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:25,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 19:38:28,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:38:29,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.78 vs. limit=10.0 2023-10-02 19:38:30,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:38:30,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 19:38:31,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:33,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 19:38:33,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 19:38:34,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:38:34,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:38:34,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:38:34,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:38:37,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:38:37,960 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:38:39,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:38:39,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=995466.6666666666, ans=0.125 2023-10-02 19:38:40,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:38:41,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:41,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 19:38:43,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:38:45,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:38:46,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:38:46,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:48,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:38:48,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 19:38:54,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 19:38:57,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 19:38:57,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:38:57,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 19:38:58,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:02,065 INFO [train.py:1046] (1/4) Epoch 29, batch 600, loss[loss=0.1618, simple_loss=0.2365, pruned_loss=0.04353, over 23696.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2454, pruned_loss=0.04477, over 4467621.37 frames. ], batch size: 149, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:39:07,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:39:07,981 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:39:09,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:39:10,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 19:39:13,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:39:13,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:39:15,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:16,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 19:39:18,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:39:24,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 19:39:27,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:39:27,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:27,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:39:31,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=995733.3333333334, ans=0.1 2023-10-02 19:39:31,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=995733.3333333334, ans=0.125 2023-10-02 19:39:36,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:39:36,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:39:36,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:44,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:39:47,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:39:47,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:39:49,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:53,224 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.812e+02 1.989e+02 2.203e+02 3.587e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-02 19:39:55,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 19:40:02,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:40:02,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:40:05,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 19:40:06,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:40:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 19:40:09,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:40:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 19:40:11,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=995866.6666666666, ans=0.125 2023-10-02 19:40:15,416 INFO [train.py:1046] (1/4) Epoch 29, batch 650, loss[loss=0.1614, simple_loss=0.2265, pruned_loss=0.04815, over 23641.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2435, pruned_loss=0.04399, over 4519189.48 frames. ], batch size: 232, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:40:15,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 19:40:16,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:40:20,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:40:20,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:40:20,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=995933.3333333334, ans=10.0 2023-10-02 19:40:20,469 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:40:22,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:26,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 19:40:27,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:40:33,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:40:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:40:35,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=996000.0, ans=0.125 2023-10-02 19:40:36,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:39,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 19:40:42,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:40:44,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:40:45,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:40:45,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 19:40:46,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=996066.6666666666, ans=0.0 2023-10-02 19:40:48,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:48,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:40:50,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:40:51,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:51,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:40:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:40:54,628 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 19:40:54,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:55,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:40:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:41:00,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:00,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=996133.3333333334, ans=10.0 2023-10-02 19:41:01,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 19:41:01,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 19:41:02,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:41:02,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:41:04,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:41:04,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:41:05,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:41:07,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 19:41:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 19:41:09,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:09,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:41:09,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:41:09,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:41:13,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:41:18,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:18,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:41:20,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:41:23,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:23,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 19:41:23,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:24,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=996200.0, ans=0.1 2023-10-02 19:41:27,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=996200.0, ans=0.0 2023-10-02 19:41:31,062 INFO [train.py:1046] (1/4) Epoch 29, batch 700, loss[loss=0.173, simple_loss=0.2601, pruned_loss=0.04293, over 23999.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2428, pruned_loss=0.04407, over 4562415.91 frames. ], batch size: 80, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:41:31,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 19:41:31,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:41:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:41:31,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:41:35,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 19:41:36,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 19:41:38,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 19:41:39,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:41,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:41:44,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 19:41:48,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:41:51,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:41:51,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:41:52,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:41:54,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:41:55,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:42:00,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 19:42:00,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:42:00,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=996400.0, ans=0.04949747468305833 2023-10-02 19:42:02,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 19:42:03,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 19:42:03,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=996400.0, ans=0.125 2023-10-02 19:42:07,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:42:07,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:42:10,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 19:42:15,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:42:15,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 19:42:19,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:19,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:42:21,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 19:42:22,597 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.803e+02 2.016e+02 2.224e+02 3.281e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 19:42:25,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:42:25,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:28,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:42:33,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:42:34,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 19:42:37,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. 
Duration: 0.89 2023-10-02 19:42:39,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 19:42:40,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:42:43,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:42:45,105 INFO [train.py:1046] (1/4) Epoch 29, batch 750, loss[loss=0.1759, simple_loss=0.248, pruned_loss=0.05187, over 23799.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2424, pruned_loss=0.0438, over 4594648.59 frames. ], batch size: 179, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:42:45,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=996600.0, ans=0.1 2023-10-02 19:42:46,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:46,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 19:42:46,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=996600.0, ans=0.125 2023-10-02 19:42:50,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 19:42:50,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. 
Duration: 0.74 2023-10-02 19:42:50,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 19:42:51,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 19:42:51,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 19:42:52,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:42:52,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 19:42:54,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:55,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:42:55,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:42:57,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:58,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:43:00,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:43:02,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 19:43:04,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:43:05,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:43:07,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:43:09,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:43:09,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 19:43:10,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:43:10,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:43:13,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:43:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:43:16,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 19:43:16,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:43:19,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-10-02 19:43:19,847 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 19:43:19,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 19:43:19,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:43:19,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:43:22,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:43:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:43:28,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:43:28,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:43:31,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:43:32,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:43:32,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. 
Duration: 0.88 2023-10-02 19:43:32,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=996800.0, ans=0.125 2023-10-02 19:43:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:43:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 19:43:35,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:43:38,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:43:38,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 19:43:40,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:43:44,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:43:45,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:43:45,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:43:47,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:43:52,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. 
Duration: 0.75 2023-10-02 19:43:53,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:43:53,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:43:57,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:43:57,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:43:59,212 INFO [train.py:1046] (1/4) Epoch 29, batch 800, loss[loss=0.1702, simple_loss=0.2383, pruned_loss=0.05109, over 23752.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.243, pruned_loss=0.04416, over 4610699.72 frames. ], batch size: 195, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:44:01,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:01,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:44:06,107 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.30 vs. limit=22.5 2023-10-02 19:44:08,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:08,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:44:08,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=996933.3333333334, ans=0.125 2023-10-02 19:44:11,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:44:11,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:44:11,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:14,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:16,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:18,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:44:21,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 19:44:22,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:24,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:44:24,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 19:44:26,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:44:26,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 19:44:26,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:27,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 19:44:31,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:33,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:34,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:44:34,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:44:37,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:37,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:44:42,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 19:44:42,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 19:44:42,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=997133.3333333334, ans=0.2 2023-10-02 19:44:43,889 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 19:44:43,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 19:44:45,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:44:45,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:46,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:46,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:44:50,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 19:44:51,488 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.909e+02 2.086e+02 2.400e+02 3.373e+02, threshold=4.173e+02, percent-clipped=0.0 2023-10-02 19:44:51,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 19:44:51,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 19:44:53,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:44:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:45:00,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:45:02,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 19:45:02,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:45:06,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 19:45:13,271 INFO [train.py:1046] (1/4) Epoch 29, batch 850, loss[loss=0.1592, simple_loss=0.2338, pruned_loss=0.04233, over 23632.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2443, pruned_loss=0.0448, over 4617752.09 frames. ], batch size: 149, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:45:13,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:45:14,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:45:16,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 19:45:17,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 19:45:18,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:45:20,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 19:45:20,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:20,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:45:23,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:23,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:45:24,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:45:26,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 19:45:26,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 19:45:26,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 19:45:29,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:45:29,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:45:31,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:31,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:45:31,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:45:36,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:36,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:45:36,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 19:45:41,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 19:45:44,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:45,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 19:45:50,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 19:45:51,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 19:45:53,327 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-10-02 19:45:53,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:45:53,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:45:53,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 19:45:56,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:57,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:57,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 19:45:58,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:45:58,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:00,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:46:02,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:46:02,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:46:03,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-02 19:46:03,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 19:46:08,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:46:08,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:46:08,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:46:10,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:46:10,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:14,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:46:16,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:46:17,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:46:17,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.72 vs. limit=12.0 2023-10-02 19:46:18,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:20,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 19:46:27,046 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.69 vs. limit=22.5 2023-10-02 19:46:27,487 INFO [train.py:1046] (1/4) Epoch 29, batch 900, loss[loss=0.1703, simple_loss=0.2383, pruned_loss=0.05112, over 23744.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.245, pruned_loss=0.04503, over 4641201.87 frames. ], batch size: 164, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:46:27,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:46:28,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:46:29,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 19:46:29,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:46:30,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:46:31,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 19:46:37,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:46:41,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:41,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-10-02 19:46:44,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:46:44,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 19:46:44,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 19:46:44,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=997666.6666666666, ans=0.125 2023-10-02 19:46:47,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:46:47,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:46:47,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:46:48,180 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.57 vs. limit=15.0 2023-10-02 19:46:48,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:46:56,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:46:57,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:47:00,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:47:00,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=997733.3333333334, ans=0.125 2023-10-02 19:47:03,982 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:47:05,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 19:47:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:47:12,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:47:12,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:47:13,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=12.0 2023-10-02 19:47:13,907 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 19:47:13,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-10-02 19:47:19,916 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.861e+02 2.059e+02 2.455e+02 3.512e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 19:47:20,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:47:20,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:47:21,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:47:27,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:27,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:47:27,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=997866.6666666666, ans=0.1 2023-10-02 19:47:30,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 19:47:30,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:47:32,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 19:47:33,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:47:33,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:47:36,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:47:36,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:47:40,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 19:47:40,506 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 19:47:40,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 19:47:42,439 INFO [train.py:1046] (1/4) Epoch 29, batch 950, loss[loss=0.1642, simple_loss=0.2456, pruned_loss=0.04139, over 24495.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2452, pruned_loss=0.04475, over 4665552.47 frames. ], batch size: 66, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:47:42,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 19:47:43,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:48,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 19:47:52,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:47:54,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:47:56,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:47:56,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:47:57,701 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 19:48:03,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:03,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:48:04,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:48:04,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:48:06,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 19:48:06,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:48:07,134 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.03 vs. limit=6.0 2023-10-02 19:48:07,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 19:48:07,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=998000.0, ans=0.125 2023-10-02 19:48:09,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 19:48:09,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:48:13,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:13,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:48:14,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:48:16,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 19:48:17,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:48:19,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:48:21,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:48:25,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:48:25,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 19:48:28,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 19:48:31,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 19:48:31,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:48:31,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:48:31,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:31,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:48:33,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=998133.3333333334, ans=0.2 2023-10-02 19:48:35,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 19:48:36,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:48:38,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:48:40,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:40,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. 
Duration: 0.8 2023-10-02 19:48:40,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:40,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:48:40,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 19:48:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:48:47,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:50,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:48:52,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 19:48:52,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 19:48:55,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:56,953 INFO [train.py:1046] (1/4) Epoch 29, batch 1000, loss[loss=0.1639, simple_loss=0.2249, pruned_loss=0.05141, over 22768.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2437, pruned_loss=0.04466, over 4668514.04 frames. ], batch size: 322, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:48:59,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-02 19:48:59,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:05,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:49:07,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 19:49:07,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 19:49:11,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:11,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:49:13,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:16,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 19:49:20,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 19:49:22,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 19:49:22,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:49:24,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-10-02 19:49:25,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 19:49:25,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 19:49:27,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=998400.0, ans=0.2 2023-10-02 19:49:28,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:28,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:35,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:36,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=998400.0, ans=0.125 2023-10-02 19:49:37,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:49:37,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:38,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:38,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 19:49:38,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 19:49:40,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:49:40,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:41,925 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 19:49:44,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 19:49:46,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 19:49:47,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 19:49:48,771 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.876e+02 2.032e+02 2.220e+02 3.868e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 19:49:48,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:49:56,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:56,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:49:56,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:58,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 19:49:59,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 19:49:59,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:49:59,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 19:49:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 19:50:01,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:50:01,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:50:03,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:50:07,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:50:07,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=998533.3333333334, ans=0.125 2023-10-02 19:50:08,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 19:50:08,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=998533.3333333334, ans=0.125 2023-10-02 19:50:10,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=998600.0, ans=0.125 2023-10-02 19:50:11,790 INFO [train.py:1046] (1/4) Epoch 29, batch 1050, loss[loss=0.1764, simple_loss=0.2601, pruned_loss=0.04637, over 24469.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2423, pruned_loss=0.04416, over 4664016.78 frames. ], batch size: 66, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:50:11,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:50:11,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:50:13,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:50:13,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:50:16,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:50:17,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:50:18,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=998600.0, ans=0.2 2023-10-02 19:50:21,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 19:50:22,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:50:23,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:50:23,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:50:24,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=998600.0, ans=0.125 2023-10-02 19:50:25,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:50:26,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 19:50:26,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:50:28,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 19:50:32,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:50:32,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 19:50:32,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 19:50:36,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 19:50:38,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:50:38,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:50:41,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 19:50:41,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 19:50:42,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:50:44,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 19:50:48,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 19:50:50,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:50:52,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=998733.3333333334, ans=22.5 2023-10-02 19:50:52,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 19:50:54,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 19:50:55,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 19:50:55,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:50:58,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:51:02,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 19:51:03,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 19:51:03,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 19:51:05,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:51:05,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:51:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 19:51:11,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:51:11,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=998866.6666666666, ans=0.125 2023-10-02 19:51:13,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-10-02 19:51:13,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 19:51:13,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:51:13,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:51:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:51:19,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:51:19,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 19:51:21,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:51:21,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 19:51:21,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 19:51:22,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:51:26,506 INFO [train.py:1046] (1/4) Epoch 29, batch 1100, loss[loss=0.1635, simple_loss=0.252, pruned_loss=0.03754, over 24439.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2419, pruned_loss=0.044, over 4681961.34 frames. ], batch size: 69, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:51:26,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:51:30,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:51:32,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=998933.3333333334, ans=0.125 2023-10-02 19:51:34,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:51:36,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:51:37,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:51:37,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 19:51:37,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:51:40,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:51:40,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=999000.0, ans=0.125 2023-10-02 19:51:41,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:51:41,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=999000.0, ans=0.125 2023-10-02 19:51:45,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:51:45,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 19:51:46,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 19:51:46,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:51:46,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:51:50,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:51:51,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:51:55,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:51:59,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 19:52:01,126 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 19:52:02,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 19:52:03,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:05,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:52:05,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:52:06,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 19:52:07,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:52:07,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:52:07,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:52:08,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:09,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 19:52:14,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=12.0 2023-10-02 19:52:17,275 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.440e+02 1.856e+02 2.110e+02 2.423e+02 3.915e+02, threshold=4.220e+02, percent-clipped=0.0 2023-10-02 19:52:17,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 19:52:17,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 19:52:19,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:52:24,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:52:25,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=999200.0, ans=0.1 2023-10-02 19:52:26,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 19:52:26,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:52:27,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=999200.0, ans=0.2 2023-10-02 19:52:28,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:30,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:52:30,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:52:32,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 19:52:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 19:52:33,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:52:33,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 19:52:34,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:52:34,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 19:52:36,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:52:36,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:52:37,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:52:39,039 INFO [train.py:1046] (1/4) Epoch 29, batch 1150, loss[loss=0.2114, simple_loss=0.2777, pruned_loss=0.07259, over 19572.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2428, pruned_loss=0.0444, over 4685165.69 frames. ], batch size: 388, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:52:40,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:52:43,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:52:46,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 19:52:46,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:52:46,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 19:52:46,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:52:50,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 19:52:52,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:52:52,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:52:55,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.60 vs. limit=15.0 2023-10-02 19:52:57,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 19:52:59,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:03,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:53:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:03,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. 
Duration: 0.54 2023-10-02 19:53:03,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:53:04,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:53:07,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 19:53:07,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:08,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:53:19,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-10-02 19:53:20,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:21,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=999400.0, ans=0.2 2023-10-02 19:53:25,878 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:53:27,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 19:53:28,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:53:28,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:33,966 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 19:53:36,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:42,355 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 19:53:45,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:53:47,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:53:47,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:53:47,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:53:50,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:53:53,723 INFO [train.py:1046] (1/4) Epoch 29, batch 1200, loss[loss=0.1633, simple_loss=0.2409, pruned_loss=0.0429, over 23448.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2432, pruned_loss=0.04416, over 4703651.62 frames. ], batch size: 119, lr: 3.54e-03, grad_scale: 32.0 2023-10-02 19:53:55,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 19:53:55,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:53:55,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:55,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:53:56,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:53:57,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:54:00,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:54:01,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:54:01,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:54:03,163 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 19:54:04,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 19:54:06,842 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-02 19:54:08,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 19:54:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:54:12,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:54:14,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:54:14,905 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 19:54:16,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:54:16,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=999666.6666666666, ans=0.1 2023-10-02 19:54:17,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.45 vs. limit=22.5 2023-10-02 19:54:26,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:54:26,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:54:26,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 19:54:27,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:54:30,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. 
Duration: 0.72 2023-10-02 19:54:34,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 19:54:34,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:54:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:54:36,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=999800.0, ans=0.0 2023-10-02 19:54:37,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:54:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:54:38,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:54:38,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:54:39,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:54:41,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 19:54:41,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:54:41,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 19:54:41,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 19:54:44,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:54:44,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:54:45,321 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.823e+02 1.995e+02 2.246e+02 2.877e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 19:54:49,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:54:50,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:54:53,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 19:54:58,466 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 19:54:59,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:55:02,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:55:05,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:55:06,512 INFO [train.py:1046] (1/4) Epoch 29, batch 1250, loss[loss=0.1474, simple_loss=0.2313, pruned_loss=0.03175, over 24591.00 frames. 
], tot_loss[loss=0.1668, simple_loss=0.2445, pruned_loss=0.04458, over 4706081.35 frames. ], batch size: 60, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:55:06,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:55:08,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 19:55:12,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:55:13,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:13,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 19:55:16,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:55:16,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:55:23,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:55:23,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:24,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:55:24,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 19:55:26,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:55:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:55:32,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:55:32,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:55:34,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:55:34,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:36,416 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.66 vs. limit=15.0 2023-10-02 19:55:36,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:36,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 19:55:42,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 19:55:43,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:55:45,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 19:55:46,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 19:55:46,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:46,472 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 19:55:46,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:46,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:52,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:55,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:56,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:55:57,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 19:55:59,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 19:55:59,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 19:56:01,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:56:03,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 19:56:04,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:56:06,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 19:56:06,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:56:06,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1000200.0, ans=0.1 2023-10-02 19:56:09,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 19:56:09,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:56:10,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:56:10,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 19:56:10,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:56:13,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 19:56:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 19:56:17,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:56:18,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:56:20,360 INFO [train.py:1046] (1/4) Epoch 29, batch 1300, loss[loss=0.1679, simple_loss=0.2399, pruned_loss=0.04792, over 23676.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2459, pruned_loss=0.04497, over 4713000.97 frames. ], batch size: 164, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:56:21,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:56:23,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:56:23,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 19:56:23,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1000266.6666666666, ans=0.125 2023-10-02 19:56:28,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:31,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:56:32,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 19:56:33,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1000266.6666666666, ans=0.0 2023-10-02 19:56:34,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:56:36,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:56:36,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 19:56:40,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:56:40,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1000333.3333333334, ans=0.0 2023-10-02 19:56:42,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:56:43,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 19:56:45,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:56:48,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:56:51,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:56:51,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 19:56:51,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:56:53,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:56:54,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:56:55,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 19:57:01,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:57:01,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:57:02,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 19:57:04,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:57:05,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:57:06,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1000466.6666666666, ans=0.125 2023-10-02 19:57:09,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:57:09,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. 
Duration: 0.96 2023-10-02 19:57:10,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:10,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 19:57:12,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:14,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:57:16,070 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.865e+02 2.006e+02 2.214e+02 3.009e+02, threshold=4.012e+02, percent-clipped=0.0 2023-10-02 19:57:16,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:57:17,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1000466.6666666666, ans=0.0 2023-10-02 19:57:18,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 19:57:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 19:57:20,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 19:57:20,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1000533.3333333334, ans=0.0 2023-10-02 19:57:24,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 19:57:25,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1000533.3333333334, ans=0.0 2023-10-02 19:57:27,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 19:57:29,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:57:35,526 INFO [train.py:1046] (1/4) Epoch 29, batch 1350, loss[loss=0.1484, simple_loss=0.2238, pruned_loss=0.03653, over 24474.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2451, pruned_loss=0.04457, over 4721223.96 frames. ], batch size: 58, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:57:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 19:57:40,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:57:42,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:57:45,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:57:45,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:57:47,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:57:48,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 19:57:51,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:57:52,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 19:57:53,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1000666.6666666666, ans=0.04949747468305833 2023-10-02 19:57:54,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:57:54,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:57:57,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 19:57:57,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:57,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1000666.6666666666, ans=0.0 2023-10-02 19:57:58,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:57:58,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 19:58:01,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 19:58:03,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-10-02 19:58:05,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:05,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 19:58:07,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1000733.3333333334, ans=0.125 2023-10-02 19:58:15,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:24,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:24,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:25,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 19:58:27,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:29,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 19:58:29,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:58:29,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 19:58:29,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1000800.0, ans=0.5 2023-10-02 19:58:32,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.49 vs. limit=12.0 2023-10-02 19:58:32,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:58:33,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 19:58:35,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:58:40,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.50 vs. limit=15.0 2023-10-02 19:58:41,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 19:58:42,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.38 vs. limit=15.0 2023-10-02 19:58:44,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 19:58:48,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 19:58:49,919 INFO [train.py:1046] (1/4) Epoch 29, batch 1400, loss[loss=0.1843, simple_loss=0.2649, pruned_loss=0.05187, over 23660.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2431, pruned_loss=0.04425, over 4701367.34 frames. 
], batch size: 85, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:58:49,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:53,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:58:53,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:58:54,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.26 vs. limit=12.0 2023-10-02 19:58:59,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 19:58:59,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 19:59:04,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1001000.0, ans=0.125 2023-10-02 19:59:09,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:59:11,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:59:14,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:59:14,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 19:59:16,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:59:17,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1001000.0, ans=0.125 2023-10-02 19:59:17,975 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.26 vs. limit=15.0 2023-10-02 19:59:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 19:59:28,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:28,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:30,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1001066.6666666666, ans=0.125 2023-10-02 19:59:33,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 19:59:33,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:59:33,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:59:35,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:59:35,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 19:59:36,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:59:36,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:59:36,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:59:38,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 19:59:38,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:59:40,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1001133.3333333334, ans=0.1 2023-10-02 19:59:45,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.875e+02 2.105e+02 2.508e+02 3.725e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-02 19:59:45,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:49,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:59:56,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 19:59:58,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:59:59,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:00:01,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 20:00:03,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:04,353 INFO [train.py:1046] (1/4) Epoch 29, batch 1450, loss[loss=0.1454, simple_loss=0.2226, pruned_loss=0.03415, over 24645.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2421, pruned_loss=0.0437, over 4718957.42 frames. ], batch size: 60, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:00:04,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:00:08,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:00:10,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:00:10,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:10,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 20:00:15,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:15,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 20:00:16,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1001266.6666666666, ans=0.125 2023-10-02 20:00:18,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:00:18,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 20:00:18,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:00:18,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1001333.3333333334, ans=0.125 2023-10-02 20:00:19,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 20:00:21,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:22,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:22,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 20:00:23,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:00:23,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:00:24,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-02 20:00:25,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:25,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:00:26,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:29,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:33,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:00:33,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:00:35,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1001400.0, ans=0.125 2023-10-02 20:00:36,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.20 vs. limit=15.0 2023-10-02 20:00:37,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:37,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:39,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 20:00:39,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:00:39,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:39,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:00:43,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 20:00:46,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:00:49,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 20:00:51,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:00:52,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:00:52,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:00:53,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 20:00:56,047 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.53 vs. limit=10.0 2023-10-02 20:00:56,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:00:57,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 20:00:58,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.18 vs. limit=15.0 2023-10-02 20:00:58,969 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.32 vs. limit=5.0 2023-10-02 20:01:01,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 20:01:02,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:05,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:01:07,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:01:07,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 20:01:10,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 20:01:10,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 20:01:13,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:13,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 20:01:19,194 INFO [train.py:1046] (1/4) Epoch 29, batch 1500, loss[loss=0.1716, simple_loss=0.246, pruned_loss=0.04865, over 23653.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2426, pruned_loss=0.04383, over 4729381.26 frames. ], batch size: 149, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:01:21,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.39 vs. limit=15.0 2023-10-02 20:01:23,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 20:01:23,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:01:23,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:01:24,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:26,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:01:26,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:01:27,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 20:01:29,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:01:29,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 20:01:31,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:01:31,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:01:31,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:01:32,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:01:40,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:01:40,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 20:01:40,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:01:40,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:01:42,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:45,065 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:01:46,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. 
Duration: 0.87 2023-10-02 20:01:49,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1001733.3333333334, ans=0.2 2023-10-02 20:01:50,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 20:01:52,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:52,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 20:01:54,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:01:57,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:01:57,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:58,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:01:59,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1001733.3333333334, ans=0.125 2023-10-02 20:02:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 20:02:00,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:02:00,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:02:01,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1001733.3333333334, ans=0.125 2023-10-02 20:02:02,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 20:02:02,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:02:09,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:02:09,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 20:02:13,648 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.835e+02 2.062e+02 2.409e+02 3.555e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 20:02:15,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:02:15,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1001800.0, ans=0.0 2023-10-02 20:02:16,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:02:21,305 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 20:02:21,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:21,367 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. 
Duration: 0.896 2023-10-02 20:02:23,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:02:24,610 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 20:02:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:02:28,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 20:02:30,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:33,223 INFO [train.py:1046] (1/4) Epoch 29, batch 1550, loss[loss=0.1675, simple_loss=0.239, pruned_loss=0.04798, over 23867.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2435, pruned_loss=0.04422, over 4723900.26 frames. ], batch size: 212, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:02:33,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:02:33,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:34,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:02:34,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:02:34,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1001933.3333333334, ans=0.125 2023-10-02 20:02:34,921 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:02:36,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:02:36,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 20:02:38,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 20:02:39,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:02:39,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 20:02:39,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 20:02:42,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:43,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:02:43,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 20:02:46,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:46,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:49,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 20:02:49,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:50,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:02:50,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:02:52,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:02:52,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 20:02:55,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:55,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 20:02:55,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 20:02:56,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. 
Duration: 0.98 2023-10-02 20:02:56,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:56,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:01,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:03:02,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 20:03:02,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 20:03:07,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1002066.6666666666, ans=0.125 2023-10-02 20:03:07,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1002066.6666666666, ans=0.125 2023-10-02 20:03:11,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:11,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1002066.6666666666, ans=0.0 2023-10-02 20:03:15,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:03:15,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 20:03:15,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:03:17,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 20:03:17,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1002133.3333333334, ans=0.125 2023-10-02 20:03:18,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1002133.3333333334, ans=0.125 2023-10-02 20:03:20,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1002133.3333333334, ans=0.2 2023-10-02 20:03:23,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:03:25,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:27,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:03:30,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:03:30,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:30,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 20:03:31,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 20:03:33,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:03:33,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:34,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 20:03:34,712 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 20:03:38,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:03:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 20:03:46,752 INFO [train.py:1046] (1/4) Epoch 29, batch 1600, loss[loss=0.1652, simple_loss=0.2476, pruned_loss=0.04143, over 24677.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2444, pruned_loss=0.04468, over 4716940.21 frames. ], batch size: 65, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 20:03:46,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:03:48,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 20:03:49,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 20:03:49,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:03:49,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:03:51,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:03:52,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:03:54,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:03:55,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 20:03:57,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 20:03:59,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 20:04:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:04:03,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 20:04:03,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:04:04,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:04:11,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:04:13,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 20:04:16,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:04:16,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1002400.0, ans=0.0 2023-10-02 20:04:18,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 20:04:18,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:18,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 20:04:19,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1002400.0, ans=0.1 2023-10-02 20:04:24,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 20:04:31,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:04:31,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 20:04:33,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:04:33,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:04:33,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:04:37,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 20:04:40,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:04:42,331 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.872e+02 2.190e+02 2.383e+02 3.841e+02, threshold=4.379e+02, percent-clipped=0.0 2023-10-02 20:04:42,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:04:42,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:42,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:43,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:04:45,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:04:46,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 20:04:49,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:04:55,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:55,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:04:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 20:04:58,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:04:58,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 20:05:00,932 INFO [train.py:1046] (1/4) Epoch 29, batch 1650, loss[loss=0.1638, simple_loss=0.2449, pruned_loss=0.04137, over 24465.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2453, pruned_loss=0.0451, over 4703967.75 frames. ], batch size: 63, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 20:05:02,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:04,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:05:05,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:05:05,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-10-02 20:05:05,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 20:05:05,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 20:05:07,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 20:05:11,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:05:12,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:05:13,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:05:13,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:05:16,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:17,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 20:05:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:05:19,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:05:19,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 20:05:19,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:05:20,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 20:05:20,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 20:05:21,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1002666.6666666666, ans=0.0 2023-10-02 20:05:25,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:05:27,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1002666.6666666666, ans=0.1 2023-10-02 20:05:28,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:05:34,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 20:05:36,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:38,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 20:05:42,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.77 vs. limit=15.0 2023-10-02 20:05:42,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:05:44,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:05:46,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:05:46,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:05:47,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:05:47,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:50,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:50,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:52,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:05:52,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:05:53,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:05:53,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 20:05:56,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:05:56,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 20:05:57,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:05:57,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 20:05:58,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.92 vs. limit=22.5 2023-10-02 20:06:00,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 20:06:00,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 20:06:00,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:01,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1002866.6666666666, ans=0.0 2023-10-02 20:06:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:06:02,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:06:03,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:06:03,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 20:06:07,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:06:10,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:06:10,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:06:11,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 20:06:16,116 INFO [train.py:1046] (1/4) Epoch 29, batch 1700, loss[loss=0.1691, simple_loss=0.2537, pruned_loss=0.04224, over 24457.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.245, pruned_loss=0.04503, over 4705777.63 frames. ], batch size: 69, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:06:16,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:06:16,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:06:16,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 20:06:17,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:06:17,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 20:06:17,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:06:19,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:06:19,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:06:20,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 20:06:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:06:31,501 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.29 vs. limit=15.0 2023-10-02 20:06:32,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:06:34,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:06:38,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:06:40,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:06:40,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 20:06:40,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:06:43,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 20:06:44,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:06:44,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:46,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:06:47,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:06:49,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 20:06:50,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 20:06:51,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:52,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 20:06:54,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:07:02,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:07:04,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:04,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:07:05,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:07:07,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 20:07:07,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:07:09,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:09,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 20:07:10,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:07:10,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:10,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:10,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:07:13,659 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.858e+02 2.032e+02 2.351e+02 3.196e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-02 20:07:13,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:13,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:07:15,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:16,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:07:17,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:20,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:07:22,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 20:07:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:24,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-10-02 20:07:25,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:07:27,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 20:07:30,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1003266.6666666666, ans=0.1 2023-10-02 20:07:31,194 INFO [train.py:1046] (1/4) Epoch 29, batch 1750, loss[loss=0.1671, simple_loss=0.2428, pruned_loss=0.04572, over 23381.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.244, pruned_loss=0.04446, over 4701048.43 frames. ], batch size: 106, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:07:32,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:34,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:35,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:07:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 20:07:35,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:40,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:07:40,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:07:40,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1003266.6666666666, ans=0.0 2023-10-02 20:07:43,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 20:07:45,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:46,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1003333.3333333334, ans=0.0 2023-10-02 20:07:48,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 20:07:48,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:49,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:07:52,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:07:52,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 20:07:53,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1003333.3333333334, ans=0.125 2023-10-02 20:07:55,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:07:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-10-02 20:08:01,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:08:04,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:04,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:08:05,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.12 vs. limit=15.0 2023-10-02 20:08:06,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1003400.0, ans=0.125 2023-10-02 20:08:07,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:07,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:08:09,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:08:09,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1003400.0, ans=0.1 2023-10-02 20:08:11,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:13,244 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.63 vs. 
limit=15.0 2023-10-02 20:08:13,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:08:14,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:08:15,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 20:08:15,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1003466.6666666666, ans=0.1 2023-10-02 20:08:15,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1003466.6666666666, ans=0.0 2023-10-02 20:08:18,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:08:18,829 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.03 vs. limit=15.0 2023-10-02 20:08:21,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 20:08:21,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:08:24,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:08:24,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 20:08:28,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:08:29,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 20:08:29,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:08:33,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:08:37,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:08:39,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:08:39,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 20:08:39,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:41,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:08:41,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:08:41,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 20:08:41,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:08:42,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:08:45,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:08:46,642 INFO [train.py:1046] (1/4) Epoch 29, batch 1800, loss[loss=0.1594, simple_loss=0.212, pruned_loss=0.05343, over 19236.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2432, pruned_loss=0.04421, over 4685994.86 frames. ], batch size: 389, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:08:46,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:48,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:08:50,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:54,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:08:55,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:08:58,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:01,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:09:01,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:02,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:09:03,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:09:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 20:09:04,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:06,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:07,388 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.62 vs. limit=15.0 2023-10-02 20:09:12,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 20:09:13,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 20:09:13,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 20:09:15,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:15,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:09:15,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:09:16,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:09:18,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1003733.3333333334, ans=0.125 2023-10-02 20:09:21,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1003733.3333333334, ans=0.0 2023-10-02 20:09:22,536 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 20:09:23,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:09:25,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:26,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 20:09:28,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 20:09:28,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:09:31,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:09:32,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 20:09:36,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 20:09:42,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:09:43,456 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.936e+02 2.168e+02 2.501e+02 3.680e+02, threshold=4.336e+02, percent-clipped=0.0 2023-10-02 20:09:43,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 20:09:43,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:09:43,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:45,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:09:46,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 20:09:47,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1003866.6666666666, ans=0.125 2023-10-02 20:09:48,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:09:49,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:09:52,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-02 20:09:52,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:54,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:09:54,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:09:54,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:55,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:55,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:09:55,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1003866.6666666666, ans=0.125 2023-10-02 20:09:58,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:09:58,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:10:01,021 INFO [train.py:1046] (1/4) Epoch 29, batch 1850, loss[loss=0.2071, simple_loss=0.2696, pruned_loss=0.07225, over 19197.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2432, pruned_loss=0.04412, over 4687496.84 frames. ], batch size: 388, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:10:01,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 20:10:02,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:10:09,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:10:09,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 20:10:10,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1003933.3333333334, ans=0.2 2023-10-02 20:10:13,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 20:10:16,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 20:10:20,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:10:20,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 20:10:20,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 20:10:22,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1004000.0, ans=0.0 2023-10-02 20:10:24,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1004000.0, ans=0.0 2023-10-02 20:10:30,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 20:10:32,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 20:10:35,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:10:35,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:10:39,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 20:10:39,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:10:40,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:10:40,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:10:43,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:10:45,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:10:46,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.86 vs. 
limit=15.0 2023-10-02 20:10:47,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1004133.3333333334, ans=0.125 2023-10-02 20:10:49,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:10:51,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:10:51,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:10:51,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:10:55,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:10:55,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:10:58,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 20:10:58,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:11:02,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:11:03,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:11:03,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. 
Duration: 0.8 2023-10-02 20:11:03,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 20:11:05,226 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 20:11:06,592 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 20:11:06,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:11:06,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:11:06,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:11:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:09,258 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 20:11:09,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:11:09,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:10,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:11:10,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 20:11:12,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:11:13,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 20:11:14,861 INFO [train.py:1046] (1/4) Epoch 29, batch 1900, loss[loss=0.1848, simple_loss=0.2669, pruned_loss=0.05132, over 24052.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2442, pruned_loss=0.04455, over 4699190.20 frames. ], batch size: 80, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:11:15,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:15,612 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 20:11:15,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:11:17,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:11:21,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:11:23,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:11:23,175 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 20:11:25,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-10-02 20:11:25,955 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.92 vs. limit=15.0 2023-10-02 20:11:26,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:11:27,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:11:27,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 20:11:27,854 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 20:11:30,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 20:11:32,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:11:32,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1004333.3333333334, ans=0.0 2023-10-02 20:11:36,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 20:11:38,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 20:11:46,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1004400.0, ans=0.0 2023-10-02 20:11:48,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. 
Duration: 0.78 2023-10-02 20:11:50,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 20:11:52,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:52,811 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 20:11:54,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 20:11:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 20:11:54,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 20:11:54,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:11:58,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 20:12:00,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:12:03,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:12:03,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 20:12:05,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 20:12:08,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 20:12:08,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:12:11,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.825e+02 1.966e+02 2.194e+02 3.470e+02, threshold=3.932e+02, percent-clipped=0.0 2023-10-02 20:12:16,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:12:16,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:12:16,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:12:16,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:12:17,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:12:19,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:12:19,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:12:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:12:23,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 20:12:26,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:12:26,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:12:28,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:12:30,109 INFO [train.py:1046] (1/4) Epoch 29, batch 1950, loss[loss=0.165, simple_loss=0.2398, pruned_loss=0.04512, over 23686.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2445, pruned_loss=0.04458, over 4709317.13 frames. ], batch size: 232, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:12:30,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:12:32,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:12:34,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:12:34,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:34,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:12:35,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 20:12:37,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-02 20:12:38,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:39,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:42,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:12:42,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:12:42,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:42,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1004666.6666666666, ans=0.125 2023-10-02 20:12:45,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:12:50,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:12:50,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:12:50,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:12:50,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:12:51,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1004666.6666666666, ans=0.125 2023-10-02 20:12:54,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:57,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:12:57,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:12:57,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:12:57,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 20:12:59,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:12:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:12:59,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:04,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:13:07,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:13:09,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 20:13:12,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:13:14,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:13:14,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 20:13:14,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:13:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:13:20,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:13:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:13:28,577 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.86 vs. limit=15.0 2023-10-02 20:13:30,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:30,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:31,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.35 vs. 
limit=15.0 2023-10-02 20:13:33,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:36,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:38,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:13:38,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:38,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 20:13:38,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:13:38,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:13:41,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 20:13:41,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:13:43,992 INFO [train.py:1046] (1/4) Epoch 29, batch 2000, loss[loss=0.1695, simple_loss=0.2459, pruned_loss=0.04654, over 23320.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2451, pruned_loss=0.04432, over 4721829.82 frames. ], batch size: 105, lr: 3.53e-03, grad_scale: 16.0 2023-10-02 20:13:45,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 20:13:45,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1004933.3333333334, ans=0.125 2023-10-02 20:13:46,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:13:46,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:13:49,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:13:52,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:54,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1004933.3333333334, ans=0.0 2023-10-02 20:13:55,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 20:13:55,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:13:59,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:14:01,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 20:14:02,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:14:02,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 20:14:03,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1005000.0, ans=0.125 2023-10-02 20:14:04,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:14:05,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 20:14:07,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:10,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 20:14:11,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:14:12,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1005066.6666666666, ans=0.0 2023-10-02 20:14:13,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 20:14:13,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:14:14,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 20:14:16,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 20:14:16,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:17,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:14:18,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:14:20,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 20:14:23,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 20:14:23,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:14:23,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:27,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:29,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:14:29,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:14:29,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 20:14:32,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:14:32,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:32,872 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:14:33,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:14:33,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:34,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1005133.3333333334, ans=0.0 2023-10-02 20:14:35,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:38,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:14:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 20:14:39,876 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.873e+02 1.992e+02 2.268e+02 3.359e+02, threshold=3.985e+02, percent-clipped=0.0 2023-10-02 20:14:41,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:14:42,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:14:44,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:44,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:14:48,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:49,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:49,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:51,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:14:51,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:14:54,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:54,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1005200.0, ans=0.125 2023-10-02 20:14:56,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:57,709 INFO [train.py:1046] (1/4) Epoch 29, batch 2050, loss[loss=0.1679, simple_loss=0.2483, pruned_loss=0.04372, over 24628.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2445, pruned_loss=0.0446, over 4719685.17 frames. 
], batch size: 68, lr: 3.53e-03, grad_scale: 16.0 2023-10-02 20:14:57,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:57,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:15:03,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:15:05,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:15:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:15:07,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:15:09,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1005266.6666666666, ans=0.1 2023-10-02 20:15:10,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 20:15:10,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:15:12,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:15:13,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 20:15:17,752 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:15:17,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1005333.3333333334, ans=0.2 2023-10-02 20:15:23,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:15:23,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:15:25,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 20:15:25,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1005400.0, ans=0.09899494936611666 2023-10-02 20:15:28,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:15:28,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 20:15:28,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:15:33,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:15:35,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:15:37,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 20:15:37,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:15:38,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:15:40,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:15:40,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:15:40,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1005466.6666666666, ans=0.125 2023-10-02 20:15:42,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:15:44,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:15:47,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:15:47,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1005466.6666666666, ans=0.1 2023-10-02 20:15:48,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:15:48,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1005466.6666666666, ans=0.125 2023-10-02 20:15:52,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:15:55,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1005533.3333333334, ans=0.125 2023-10-02 20:15:56,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1005533.3333333334, ans=0.2 2023-10-02 20:15:58,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:15:58,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 20:16:04,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:16:04,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:16:06,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:16:10,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 20:16:11,384 INFO [train.py:1046] (1/4) Epoch 29, batch 2100, loss[loss=0.1501, simple_loss=0.2046, pruned_loss=0.04787, over 19675.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2432, pruned_loss=0.04415, over 4712611.30 frames. 
], batch size: 388, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:16:12,817 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 20:16:12,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:14,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:16:14,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:16:15,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:16:15,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 20:16:15,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 20:16:17,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:16:19,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:16:19,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:16:24,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:16:24,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1005666.6666666666, ans=0.125 2023-10-02 20:16:25,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:16:25,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 20:16:27,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:16:27,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 20:16:27,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 20:16:28,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:28,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:16:28,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 20:16:30,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 20:16:35,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 20:16:35,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 20:16:38,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:16:39,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:16:41,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:16:43,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 20:16:43,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:43,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 20:16:45,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.76 vs. limit=12.0 2023-10-02 20:16:45,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 20:16:45,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:45,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 20:16:45,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 20:16:47,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. 
Duration: 0.94 2023-10-02 20:16:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:16:51,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:16:52,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:16:54,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:16:56,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:57,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:57,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 20:16:57,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:57,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:58,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:58,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 20:17:00,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. 
Duration: 0.53 2023-10-02 20:17:02,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 20:17:06,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:17:09,076 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.849e+02 2.136e+02 2.701e+02 4.119e+02, threshold=4.273e+02, percent-clipped=1.0 2023-10-02 20:17:09,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:17:09,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 20:17:16,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:17,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:17:19,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:17:19,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:17:19,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 20:17:19,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:17:20,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:17:20,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:17:22,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:17:22,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:23,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 20:17:24,498 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:17:24,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1005933.3333333334, ans=0.125 2023-10-02 20:17:25,463 INFO [train.py:1046] (1/4) Epoch 29, batch 2150, loss[loss=0.1518, simple_loss=0.2236, pruned_loss=0.03995, over 23435.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2416, pruned_loss=0.04348, over 4721778.83 frames. ], batch size: 285, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:17:26,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 20:17:26,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:27,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1005933.3333333334, ans=0.125 2023-10-02 20:17:28,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:17:28,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:17:28,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:17:28,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1005933.3333333334, ans=0.04949747468305833 2023-10-02 20:17:29,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:17:34,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 20:17:37,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:38,429 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.26 vs. limit=22.5 2023-10-02 20:17:38,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:39,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.02 vs. limit=12.0 2023-10-02 20:17:40,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:17:40,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:17:40,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:17:41,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1006000.0, ans=0.125 2023-10-02 20:17:43,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:44,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:17:44,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:17:47,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:48,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 20:17:50,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1006000.0, ans=0.07 2023-10-02 20:17:51,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:17:53,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:17:53,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:54,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:17:54,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:55,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:17:55,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:55,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:17:56,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:59,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 20:17:59,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:18:00,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:00,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:01,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:18:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:18:07,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:18:07,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:18:07,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 20:18:08,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:18:13,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:18:13,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:13,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:18:14,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:18:15,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:16,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:16,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 20:18:17,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. 
Duration: 0.91 2023-10-02 20:18:18,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:18:18,909 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 20:18:20,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:20,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:18:21,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 20:18:21,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:18:21,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 20:18:21,475 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 20:18:21,475 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 20:18:22,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 20:18:24,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:24,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:18:24,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:18:25,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:26,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:18:29,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:29,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:30,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1006200.0, ans=0.09899494936611666 2023-10-02 20:18:37,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:18:39,726 INFO [train.py:1046] (1/4) Epoch 29, batch 2200, loss[loss=0.1432, simple_loss=0.2251, pruned_loss=0.03063, over 24647.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2422, pruned_loss=0.04357, over 4723477.51 frames. ], batch size: 60, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:18:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 20:18:42,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:18:48,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:18:48,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:18:49,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:49,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:18:52,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:52,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:52,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 20:18:55,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1006333.3333333334, ans=0.125 2023-10-02 20:18:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 20:18:59,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:19:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 20:19:07,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:09,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 20:19:09,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:19:13,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:19:13,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 20:19:16,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:19:17,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:19,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 20:19:20,001 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.78 vs. limit=22.5 2023-10-02 20:19:20,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:19:22,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:19:23,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:19:24,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:26,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-10-02 20:19:28,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:29,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 20:19:31,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:32,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:19:32,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:34,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:19:34,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:19:34,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:36,202 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.909e+02 2.141e+02 2.599e+02 8.500e+02, threshold=4.282e+02, percent-clipped=2.0 2023-10-02 20:19:36,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:36,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:19:38,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:19:39,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:19:43,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:19:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:19:46,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:19:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 20:19:47,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1006533.3333333334, ans=0.035 2023-10-02 20:19:48,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:19:48,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 20:19:50,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:19:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 20:19:52,847 INFO [train.py:1046] (1/4) Epoch 29, batch 2250, loss[loss=0.166, simple_loss=0.2367, pruned_loss=0.04769, over 23381.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2433, pruned_loss=0.04399, over 4716407.66 frames. ], batch size: 285, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:19:52,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:19:52,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:19:54,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:54,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 20:19:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:20:00,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:20:05,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:20:05,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:20:09,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:11,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:20:11,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:20:14,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 20:20:14,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:20:14,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1006666.6666666666, ans=0.1 2023-10-02 20:20:16,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:20:17,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 20:20:18,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:20:18,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:21,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:20:25,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1006733.3333333334, ans=0.0 2023-10-02 20:20:26,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:20:28,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:20:28,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:20:29,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 20:20:31,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 20:20:34,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:20:37,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:20:38,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:20:40,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:20:40,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:20:41,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:20:43,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:20:48,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:20:49,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:20:54,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:20:54,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:20:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 20:20:55,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1006866.6666666666, ans=0.0 2023-10-02 20:20:56,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1006866.6666666666, ans=0.0 2023-10-02 20:20:58,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:21:00,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:21:00,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 20:21:00,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:00,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:21:03,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.44 vs. limit=10.0 2023-10-02 20:21:04,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 20:21:06,753 INFO [train.py:1046] (1/4) Epoch 29, batch 2300, loss[loss=0.1593, simple_loss=0.2348, pruned_loss=0.04191, over 23692.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.244, pruned_loss=0.04422, over 4712634.32 frames. ], batch size: 135, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:21:06,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 20:21:06,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:07,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1006933.3333333334, ans=0.04949747468305833 2023-10-02 20:21:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:13,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:21:13,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1006933.3333333334, ans=0.0 2023-10-02 20:21:16,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 20:21:17,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:24,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:21:24,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:21:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:21:25,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:25,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-10-02 20:21:27,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:21:27,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1007000.0, ans=0.2 2023-10-02 20:21:30,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:21:31,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:21:34,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:21:37,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:21:39,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:21:44,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:21:45,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:47,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:21:49,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:53,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 20:21:53,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1007133.3333333334, ans=0.0 2023-10-02 20:21:54,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:21:54,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:21:54,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 20:22:00,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:22:00,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:01,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:01,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:22:01,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:22:02,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 20:22:02,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:22:02,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. 
Duration: 0.97 2023-10-02 20:22:02,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:22:02,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:04,288 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.800e+02 1.972e+02 2.166e+02 3.182e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-02 20:22:04,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 20:22:10,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:22:12,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:22:17,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:22:17,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:22:17,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:22:19,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:22:20,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 20:22:21,777 INFO [train.py:1046] (1/4) Epoch 29, batch 2350, loss[loss=0.1884, simple_loss=0.266, pruned_loss=0.05545, over 23273.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2446, pruned_loss=0.04422, over 4706740.54 frames. ], batch size: 93, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:22:21,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:22:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 20:22:27,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:22:27,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 20:22:32,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 20:22:37,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:40,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:40,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:40,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:22:40,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:22:41,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 20:22:47,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:22:50,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.89 vs. limit=6.0 2023-10-02 20:22:51,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 20:22:52,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:22:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:22:57,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:22:58,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:22:58,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 20:22:59,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1007400.0, ans=0.125 2023-10-02 20:23:00,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 20:23:01,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:23:01,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:23:02,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:23:06,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:23:08,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 20:23:08,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:23:11,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:23:11,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:23:13,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 20:23:14,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:23:16,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 20:23:16,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 20:23:21,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 20:23:24,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 20:23:25,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-10-02 20:23:25,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:23:26,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 20:23:26,030 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 20:23:26,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 20:23:29,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 20:23:30,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:23:35,439 INFO [train.py:1046] (1/4) Epoch 29, batch 2400, loss[loss=0.1595, simple_loss=0.2222, pruned_loss=0.04837, over 23800.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2445, pruned_loss=0.04438, over 4698101.73 frames. ], batch size: 164, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:23:35,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 20:23:37,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1007600.0, ans=0.125 2023-10-02 20:23:38,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:23:39,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:23:39,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 20:23:41,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 20:23:47,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:23:47,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:23:50,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 20:23:50,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:23:51,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:23:51,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 20:23:56,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 20:23:59,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 20:24:05,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:24:07,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 20:24:09,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:24:10,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:15,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:24:15,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 20:24:15,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1007733.3333333334, ans=0.125 2023-10-02 20:24:17,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:24:18,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.30 vs. limit=15.0 2023-10-02 20:24:22,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:24:25,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.49 vs. limit=15.0 2023-10-02 20:24:26,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:24:28,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:24:30,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:24:30,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:24:30,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:24:30,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:30,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:24:30,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 20:24:34,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1007866.6666666666, ans=0.125 2023-10-02 20:24:35,051 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.909e+02 2.127e+02 2.478e+02 3.965e+02, threshold=4.255e+02, percent-clipped=1.0 2023-10-02 20:24:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:24:35,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:24:35,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 20:24:36,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 20:24:38,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:24:38,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:38,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 20:24:39,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 20:24:39,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 20:24:39,436 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. 
Duration: 0.966 2023-10-02 20:24:40,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 20:24:41,595 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.41 vs. limit=6.0 2023-10-02 20:24:42,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:24:45,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:45,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:24:46,692 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 20:24:48,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:24:50,013 INFO [train.py:1046] (1/4) Epoch 29, batch 2450, loss[loss=0.1664, simple_loss=0.2478, pruned_loss=0.04253, over 24662.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2421, pruned_loss=0.04417, over 4689111.71 frames. ], batch size: 68, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:24:51,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:24:51,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:24:53,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1007933.3333333334, ans=0.125 2023-10-02 20:24:55,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:24:55,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:24:58,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 20:25:03,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:25:03,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:07,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:25:07,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:25:08,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:25:08,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 20:25:12,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:14,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 20:25:15,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:25:18,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:25:20,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:21,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:25:24,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 20:25:24,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:25:30,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:31,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:31,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:25:31,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:25:33,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:25:35,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:25:36,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 20:25:40,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:40,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:25:45,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:25:45,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:25:45,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1008133.3333333334, ans=0.0 2023-10-02 20:25:48,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:25:48,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 20:25:48,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:25:49,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:25:49,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. 
Duration: 0.87 2023-10-02 20:25:51,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:25:51,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:25:55,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:25:59,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:59,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:26:02,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 20:26:04,387 INFO [train.py:1046] (1/4) Epoch 29, batch 2500, loss[loss=0.1668, simple_loss=0.2496, pruned_loss=0.04194, over 24057.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2414, pruned_loss=0.04375, over 4687521.51 frames. ], batch size: 80, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:26:04,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:26:09,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:26:17,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:26:18,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:26:18,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:26:18,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 20:26:24,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:26:26,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:26:26,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:26:26,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:26:27,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 20:26:27,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:29,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:26:29,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 20:26:29,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:30,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-10-02 20:26:30,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:34,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:26:36,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:26:37,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:26:39,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 20:26:40,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:26:40,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:45,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:46,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:49,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:26:56,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:26:58,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. 
Duration: 0.91 2023-10-02 20:26:59,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:26:59,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:27:01,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:27:01,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:27:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 20:27:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 20:27:01,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 20:27:04,409 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.824e+02 2.011e+02 2.167e+02 3.747e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-02 20:27:04,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:27:06,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 20:27:06,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 20:27:07,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 20:27:08,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 20:27:10,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1008533.3333333334, ans=0.125 2023-10-02 20:27:13,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 20:27:17,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:27:18,563 INFO [train.py:1046] (1/4) Epoch 29, batch 2550, loss[loss=0.1707, simple_loss=0.2641, pruned_loss=0.03868, over 24281.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2423, pruned_loss=0.04421, over 4685375.19 frames. ], batch size: 74, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:27:18,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:27:18,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:27:20,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:27:21,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 20:27:23,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:27:26,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. 
Duration: 0.8 2023-10-02 20:27:26,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:27:29,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:32,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:27:32,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 20:27:32,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:27:32,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:27:32,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:27:36,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:27:36,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 20:27:36,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:27:36,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:36,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-02 20:27:36,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1008666.6666666666, ans=0.125 2023-10-02 20:27:42,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1008666.6666666666, ans=0.125 2023-10-02 20:27:44,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1008666.6666666666, ans=0.125 2023-10-02 20:27:50,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:27:54,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:27:54,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:54,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:27:56,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:27:56,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1008733.3333333334, ans=0.125 2023-10-02 20:28:03,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:28:06,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 20:28:06,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:28:06,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:28:06,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:28:07,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:28:08,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.32 vs. limit=6.0 2023-10-02 20:28:11,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:28:12,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:28:15,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:28:15,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 20:28:15,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:28:17,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:28:18,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:28:19,367 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.65 vs. limit=15.0 2023-10-02 20:28:19,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:28:21,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:28:27,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:28:28,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:28:32,974 INFO [train.py:1046] (1/4) Epoch 29, batch 2600, loss[loss=0.1787, simple_loss=0.2442, pruned_loss=0.05657, over 23755.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2436, pruned_loss=0.04424, over 4707012.18 frames. ], batch size: 179, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:28:33,013 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 20:28:36,077 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 20:28:36,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:28:36,128 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. 
Duration: 0.896 2023-10-02 20:28:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 20:28:37,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 20:28:37,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1008933.3333333334, ans=0.125 2023-10-02 20:28:39,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:28:40,912 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 20:28:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 20:28:43,691 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 20:28:45,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:28:47,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 20:28:48,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 20:28:49,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:28:51,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 20:28:53,779 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-10-02 20:28:53,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 20:29:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:01,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:01,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:29:01,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 20:29:02,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:29:08,913 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 20:29:15,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:15,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:15,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 20:29:16,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:29:16,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:29:16,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1009133.3333333334, ans=0.125 2023-10-02 20:29:17,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 20:29:22,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:29:22,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:29:25,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:29:28,060 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 20:29:28,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:29:29,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:29:30,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1009133.3333333334, ans=0.025 2023-10-02 20:29:32,337 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.423e+02 1.950e+02 2.073e+02 2.316e+02 4.084e+02, threshold=4.145e+02, percent-clipped=1.0 2023-10-02 20:29:33,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:29:35,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 20:29:35,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 20:29:35,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.38 vs. limit=15.0 2023-10-02 20:29:36,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:38,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:29:38,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:29:42,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 20:29:43,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:46,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:29:47,472 INFO [train.py:1046] (1/4) Epoch 29, batch 2650, loss[loss=0.1612, simple_loss=0.2508, pruned_loss=0.03585, over 24401.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2439, pruned_loss=0.04392, over 4723203.38 frames. ], batch size: 69, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:29:49,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 20:29:49,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:29:50,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:29:50,931 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 20:29:52,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:29:53,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:54,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:29:56,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:29:59,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:30:00,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 20:30:00,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:30:01,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:30:05,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 20:30:05,533 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-10-02 20:30:08,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:11,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 20:30:12,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:14,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 20:30:18,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:18,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:30:18,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:18,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:22,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 20:30:24,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 20:30:26,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:30:28,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-10-02 20:30:28,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:30,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:32,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:30:32,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:30:32,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:33,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:30:34,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:30:36,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:30:37,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:30:39,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:30:39,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:39,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 20:30:42,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:43,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:30:43,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:30:46,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:49,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:30:49,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:49,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 20:30:49,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1009533.3333333334, ans=0.125 2023-10-02 20:30:54,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:55,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:56,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:30:57,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:00,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:31:00,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:00,553 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.40 vs. limit=15.0 2023-10-02 20:31:01,351 INFO [train.py:1046] (1/4) Epoch 29, batch 2700, loss[loss=0.1405, simple_loss=0.216, pruned_loss=0.03255, over 24320.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2447, pruned_loss=0.04423, over 4713000.60 frames. ], batch size: 56, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:31:02,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:31:02,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 20:31:03,059 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:31:04,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:31:04,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1009600.0, ans=0.95 2023-10-02 20:31:07,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-02 20:31:08,217 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.77 vs. limit=10.0 2023-10-02 20:31:08,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:31:08,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:09,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:10,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:31:10,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:31:11,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:31:11,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:31:11,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 20:31:11,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:31:12,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. 
limit=6.0 2023-10-02 20:31:15,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:31:16,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:31:16,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:31:19,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:31:20,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 20:31:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:31:26,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:31:26,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:31:29,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1009733.3333333334, ans=0.125 2023-10-02 20:31:34,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:31:34,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:31:34,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 20:31:34,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:31:35,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:31:38,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:31:38,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:31:38,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:31:42,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:43,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:31:45,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=1009800.0, ans=22.5 2023-10-02 20:31:50,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:31:51,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:31:54,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 20:31:54,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:31:58,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:58,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:31:58,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:32:00,991 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.786e+02 2.084e+02 2.386e+02 3.655e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 20:32:01,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:01,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:32:03,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:32:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:32:07,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:32:08,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 20:32:11,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 20:32:11,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:14,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:32:14,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 20:32:15,900 INFO [train.py:1046] (1/4) Epoch 29, batch 2750, loss[loss=0.1711, simple_loss=0.2468, pruned_loss=0.04766, over 23335.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2446, pruned_loss=0.04423, over 4711054.90 frames. ], batch size: 105, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:32:17,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 20:32:17,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:18,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:32:21,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:21,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 20:32:21,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:24,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:32:26,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:32:26,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:32:26,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:26,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 20:32:26,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:32:28,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:33,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 20:32:35,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:32:36,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:36,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 20:32:36,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:32:38,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:32:38,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:32:39,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:39,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:43,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:32:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:32:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:32:45,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:47,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 20:32:50,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1010066.6666666666, ans=0.125 2023-10-02 20:32:54,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:55,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:32:55,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:00,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:33:00,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:33:00,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:33:06,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:33:06,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:33:06,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 20:33:11,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:33:11,744 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.35 vs. limit=22.5 2023-10-02 20:33:13,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 20:33:19,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:33:21,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:33:21,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 20:33:21,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:33:24,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:33:24,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 20:33:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:33:24,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1010200.0, ans=0.0 2023-10-02 20:33:27,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 20:33:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:33:29,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:33:29,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 20:33:30,441 INFO [train.py:1046] (1/4) Epoch 29, batch 2800, loss[loss=0.1769, simple_loss=0.2577, pruned_loss=0.04803, over 24027.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2435, pruned_loss=0.04386, over 4708529.65 frames. ], batch size: 80, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:33:30,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:33:30,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:31,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:33:33,821 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 20:33:33,822 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 20:33:36,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:38,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:33:39,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:33:42,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:33:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 20:33:44,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 20:33:48,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 20:33:49,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:49,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:33:49,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:33:49,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.37 vs. limit=10.0 2023-10-02 20:33:52,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:33:52,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:53,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:33:54,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 20:33:58,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1010400.0, ans=0.05 2023-10-02 20:34:02,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:34:04,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:34:07,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:09,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:34:09,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:13,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:34:13,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 20:34:13,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:15,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:34:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 20:34:19,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:23,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:34:23,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1010466.6666666666, ans=0.125 2023-10-02 20:34:25,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:34:26,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:26,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:34:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:34:27,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:34:28,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:34:28,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. 
Duration: 0.92 2023-10-02 20:34:28,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:34:29,612 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.917e+02 2.083e+02 2.340e+02 4.683e+02, threshold=4.167e+02, percent-clipped=1.0 2023-10-02 20:34:31,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:34:31,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:34:32,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 20:34:34,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:34,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:34:35,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:34:35,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 20:34:36,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.16 vs. limit=15.0 2023-10-02 20:34:40,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.54 vs. 
limit=6.0 2023-10-02 20:34:43,738 INFO [train.py:1046] (1/4) Epoch 29, batch 2850, loss[loss=0.1572, simple_loss=0.232, pruned_loss=0.04121, over 23125.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2424, pruned_loss=0.04363, over 4694661.01 frames. ], batch size: 105, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:34:43,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:34:43,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:34:43,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:34:44,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1010600.0, ans=0.125 2023-10-02 20:34:45,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:34:50,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:34:50,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:34:50,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:52,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:52,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:34:54,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:34:55,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 20:35:02,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 20:35:02,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:03,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 20:35:04,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-10-02 20:35:05,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 20:35:08,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 20:35:10,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:35:11,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1010666.6666666666, ans=0.0 2023-10-02 20:35:11,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1010666.6666666666, ans=0.0 2023-10-02 20:35:11,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1010666.6666666666, ans=0.2 2023-10-02 20:35:19,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:35:21,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:35:21,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:35:23,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:35:23,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:35:23,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:35:25,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:35:25,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. 
Duration: 0.69 2023-10-02 20:35:25,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1010733.3333333334, ans=0.125 2023-10-02 20:35:28,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:35:28,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:35:28,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1010800.0, ans=0.04949747468305833 2023-10-02 20:35:29,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:35:31,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:32,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:35:33,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:35:33,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:35,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:35:36,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:35:38,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:39,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1010800.0, ans=0.0 2023-10-02 20:35:40,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:41,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:35:46,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:35:48,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 20:35:49,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 20:35:51,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:35:51,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:35:51,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 20:35:53,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:35:53,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:35:53,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:35:53,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:35:53,132 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 20:35:53,164 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 20:35:53,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:35:53,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1010866.6666666666, ans=0.125 2023-10-02 20:35:54,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:57,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1010933.3333333334, ans=0.2 2023-10-02 20:35:58,936 INFO [train.py:1046] (1/4) Epoch 29, batch 2900, loss[loss=0.1805, simple_loss=0.2547, pruned_loss=0.05312, over 23814.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2418, pruned_loss=0.04335, over 4690476.95 frames. ], batch size: 212, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:35:59,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1010933.3333333334, ans=0.1 2023-10-02 20:36:00,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:36:00,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 20:36:00,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1010933.3333333334, ans=0.125 2023-10-02 20:36:01,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:36:02,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 20:36:07,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:36:07,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 20:36:07,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1010933.3333333334, ans=0.0 2023-10-02 20:36:08,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 20:36:10,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:36:10,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:36:13,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:36:14,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:36:16,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 20:36:18,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:36:19,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:36:20,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 20:36:22,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:36:22,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:25,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 20:36:25,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 20:36:30,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:36:30,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 20:36:30,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:36:33,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:36:33,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 20:36:36,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:36:37,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:40,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:36:43,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:36:45,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 20:36:45,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 20:36:45,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:36:48,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:36:50,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 20:36:52,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:36:54,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:37:00,193 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.887e+02 2.065e+02 2.292e+02 3.818e+02, threshold=4.129e+02, percent-clipped=0.0 2023-10-02 20:37:03,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:37:03,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:37:04,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 20:37:04,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.65 vs. limit=15.0 2023-10-02 20:37:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:06,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 20:37:07,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:37:08,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:37:09,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1011200.0, ans=0.1 2023-10-02 20:37:13,643 INFO [train.py:1046] (1/4) Epoch 29, batch 2950, loss[loss=0.176, simple_loss=0.2518, pruned_loss=0.05016, over 23825.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2429, pruned_loss=0.04371, over 4694303.11 frames. 
], batch size: 195, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:37:15,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:37:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 20:37:18,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:37:18,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:20,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:37:21,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:37:22,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 20:37:23,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 20:37:25,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:37:25,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:37:30,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:37:32,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 20:37:35,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:37:35,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:37:38,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:37:38,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:37:39,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:40,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:37:44,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 20:37:48,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 20:37:48,695 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 20:37:50,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:37:51,987 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-10-02 20:37:53,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 20:37:54,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:37:54,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:37:54,750 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 20:37:54,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:37:56,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 20:37:57,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:37:58,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:38:00,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1011466.6666666666, ans=0.125 2023-10-02 20:38:01,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:38:02,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:38:02,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:38:02,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 20:38:03,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1011466.6666666666, ans=0.125 2023-10-02 20:38:04,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:38:04,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 20:38:09,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:10,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:38:10,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 20:38:10,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:38:12,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 20:38:15,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:38:16,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:38:16,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 20:38:19,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:19,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:38:21,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:38:21,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:21,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:38:22,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:38:24,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:38:24,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:38:24,884 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.40 vs. limit=22.5 2023-10-02 20:38:25,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:25,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. 
Duration: 0.9 2023-10-02 20:38:26,934 INFO [train.py:1046] (1/4) Epoch 29, batch 3000, loss[loss=0.2034, simple_loss=0.2711, pruned_loss=0.06788, over 19130.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2437, pruned_loss=0.04437, over 4686988.28 frames. ], batch size: 388, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:38:26,935 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 20:38:33,045 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.2957, 3.2184, 4.2690, 3.8313], device='cuda:1') 2023-10-02 20:38:39,083 INFO [train.py:1078] (1/4) Epoch 29, validation: loss=0.3203, simple_loss=0.2757, pruned_loss=0.1825, over 1125622.00 frames. 2023-10-02 20:38:39,083 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 20:38:39,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:42,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:38:42,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:38:46,009 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 20:38:46,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 20:38:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:38:48,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 20:38:48,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 20:38:48,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:38:48,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1011600.0, ans=0.0 2023-10-02 20:38:51,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1011600.0, ans=0.2 2023-10-02 20:38:55,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:39:05,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:39:11,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 20:39:12,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:39:15,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:39:16,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:39:16,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:39:18,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:39:18,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 20:39:21,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 20:39:21,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:39:22,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:39:25,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:39:25,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:39:25,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:25,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:39:29,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:39:29,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:39:29,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:39:30,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 20:39:33,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 20:39:33,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:39:35,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:39:35,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:39:38,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:39,806 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.868e+02 2.049e+02 2.188e+02 3.716e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 20:39:39,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:39,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 20:39:41,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 20:39:41,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:39:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 20:39:42,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 20:39:43,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1011866.6666666666, ans=0.05 2023-10-02 20:39:44,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 20:39:46,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:39:46,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1011866.6666666666, ans=0.125 2023-10-02 20:39:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:39:49,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 20:39:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 20:39:50,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:39:50,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1011866.6666666666, ans=0.125 2023-10-02 20:39:51,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:39:51,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:39:53,101 INFO [train.py:1046] (1/4) Epoch 29, batch 3050, loss[loss=0.1419, simple_loss=0.2191, pruned_loss=0.0324, over 24290.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2447, pruned_loss=0.04513, over 4681412.92 frames. ], batch size: 56, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:39:53,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:39:53,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:39:54,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:39:54,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 20:39:56,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:39:57,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:39:59,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:40:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:06,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 20:40:09,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-02 20:40:09,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 20:40:09,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:14,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:40:17,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-02 20:40:18,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:18,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:40:18,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:21,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:40:21,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:40:21,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:23,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:40:23,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:23,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:23,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1012066.6666666666, ans=0.0 2023-10-02 20:40:25,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:27,712 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.16 vs. limit=15.0 2023-10-02 20:40:28,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:28,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 20:40:29,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:29,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:40:31,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1012066.6666666666, ans=0.0 2023-10-02 20:40:32,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 20:40:32,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:40:34,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:40:35,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:39,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:41,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:46,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:47,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:40:47,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:48,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:40:48,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1012133.3333333334, ans=0.125 2023-10-02 20:40:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 20:40:49,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:40:51,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 20:40:51,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:40:52,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:53,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 20:40:55,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:58,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1012200.0, ans=0.0 2023-10-02 20:40:58,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1012200.0, ans=0.125 2023-10-02 20:41:01,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:02,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:41:05,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:41:07,163 INFO [train.py:1046] (1/4) Epoch 29, batch 3100, loss[loss=0.1652, simple_loss=0.2545, pruned_loss=0.03795, over 24571.00 frames. 
], tot_loss[loss=0.1661, simple_loss=0.2437, pruned_loss=0.04424, over 4690180.99 frames. ], batch size: 71, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:41:08,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 20:41:09,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.99 vs. limit=15.0 2023-10-02 20:41:11,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 20:41:12,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 20:41:12,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1012266.6666666666, ans=0.125 2023-10-02 20:41:15,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:41:17,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1012266.6666666666, ans=0.125 2023-10-02 20:41:19,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:41:19,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:21,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-10-02 20:41:22,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1012333.3333333334, ans=0.125 2023-10-02 20:41:25,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:30,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 20:41:34,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:41:34,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:36,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:41:36,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:41:39,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 20:41:40,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:41:40,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 20:41:40,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:41:42,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:41:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 20:41:44,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:41:47,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:41:47,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 20:41:48,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1012400.0, ans=0.0 2023-10-02 20:41:49,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 20:41:49,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:51,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:51,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1012466.6666666666, ans=0.09899494936611666 2023-10-02 20:41:52,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:41:52,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:41:54,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:41:56,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:41:56,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:57,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:41:57,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:41:57,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:57,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 20:42:01,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:42:03,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 20:42:05,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:42:05,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-10-02 20:42:07,525 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.894e+02 2.077e+02 2.394e+02 5.109e+02, threshold=4.155e+02, percent-clipped=1.0 2023-10-02 20:42:07,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:07,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:07,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 20:42:09,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1012533.3333333334, ans=0.125 2023-10-02 20:42:10,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1012533.3333333334, ans=0.125 2023-10-02 20:42:18,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 20:42:18,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1012533.3333333334, ans=0.125 2023-10-02 20:42:20,658 INFO [train.py:1046] (1/4) Epoch 29, batch 3150, loss[loss=0.1563, simple_loss=0.2437, pruned_loss=0.03448, over 24491.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2426, pruned_loss=0.0438, over 4697377.95 frames. ], batch size: 63, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:42:20,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:20,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:42:24,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:42:24,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:42:24,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 20:42:26,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1012600.0, ans=0.0 2023-10-02 20:42:27,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:27,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 20:42:29,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 20:42:31,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:34,129 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 20:42:34,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 20:42:35,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:42:35,703 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. 
Duration: 0.992 2023-10-02 20:42:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 20:42:37,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 20:42:38,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 20:42:38,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 20:42:38,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:38,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:42:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:40,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 20:42:41,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:41,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:42,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:42:46,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. 
Duration: 0.688875 2023-10-02 20:42:48,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 20:42:49,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:42:52,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:42:54,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:42:54,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 20:42:55,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 20:42:57,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:42:57,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 20:42:57,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:42:58,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:58,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:42:59,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 20:42:59,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:43:01,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 20:43:01,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:43:02,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:03,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:43:03,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:43:03,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 20:43:04,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1012800.0, ans=0.125 2023-10-02 20:43:05,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:06,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1012800.0, ans=0.05 2023-10-02 20:43:08,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 20:43:08,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:43:11,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 20:43:11,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 20:43:12,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:43:12,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:12,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 20:43:13,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 20:43:13,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:43:17,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:43:20,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:20,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:43:20,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1012866.6666666666, ans=0.0 2023-10-02 20:43:25,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 20:43:26,014 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.73 vs. limit=22.5 2023-10-02 20:43:27,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:30,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 20:43:34,263 INFO [train.py:1046] (1/4) Epoch 29, batch 3200, loss[loss=0.1726, simple_loss=0.2557, pruned_loss=0.04477, over 23363.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2423, pruned_loss=0.04334, over 4700486.00 frames. ], batch size: 93, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:43:34,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:43:34,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:43:37,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:38,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:43:38,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 20:43:41,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 20:43:49,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:44:00,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=12.0 2023-10-02 20:44:09,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 20:44:09,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:44:12,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 20:44:12,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:44:12,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1013066.6666666666, ans=0.0 2023-10-02 20:44:16,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:44:16,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:44:17,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:44:20,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-10-02 20:44:21,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 20:44:23,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 20:44:25,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 20:44:26,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:44:28,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-10-02 20:44:35,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.842e+02 2.034e+02 2.276e+02 3.426e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 20:44:36,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:44:36,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:44:36,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:44:37,987 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 20:44:37,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:44:42,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:44:42,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 20:44:43,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 20:44:44,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 20:44:44,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.65 vs. limit=15.0 2023-10-02 20:44:46,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 20:44:48,063 INFO [train.py:1046] (1/4) Epoch 29, batch 3250, loss[loss=0.1514, simple_loss=0.2303, pruned_loss=0.03629, over 24467.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2419, pruned_loss=0.04322, over 4704998.49 frames. ], batch size: 58, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:44:48,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:44:51,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:44:51,359 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 20:44:51,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:44:51,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:44:54,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 20:44:57,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1013266.6666666666, ans=0.5 2023-10-02 20:44:59,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:45:02,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:45:02,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1013333.3333333334, ans=0.125 2023-10-02 20:45:10,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:10,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 20:45:12,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:12,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:45:12,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:45:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:45:15,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 20:45:18,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:45:18,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:18,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:45:21,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:22,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:45:24,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:24,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:26,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:45:26,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1013400.0, ans=0.1 2023-10-02 20:45:27,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:45:27,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:45:27,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.65 vs. limit=15.0 2023-10-02 20:45:31,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.08 vs. limit=12.0 2023-10-02 20:45:31,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 20:45:32,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:45:33,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:45:34,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:34,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:45:39,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 20:45:41,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1013466.6666666666, ans=0.2 2023-10-02 20:45:45,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:45:45,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:45,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 20:45:45,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:45:45,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:45:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:48,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1013466.6666666666, ans=0.125 2023-10-02 20:45:49,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 20:45:49,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 20:45:51,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:45:53,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:45:54,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:55,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 20:45:55,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:58,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:46:00,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:46:00,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1013533.3333333334, ans=0.125 2023-10-02 20:46:02,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 20:46:02,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:03,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:46:03,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 20:46:06,110 INFO [train.py:1046] (1/4) Epoch 29, batch 3300, loss[loss=0.14, simple_loss=0.2159, pruned_loss=0.03201, over 24350.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2432, pruned_loss=0.04359, over 4706306.97 frames. 
], batch size: 56, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:46:06,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:46:06,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 20:46:06,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1013600.0, ans=0.1 2023-10-02 20:46:08,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 20:46:09,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 20:46:09,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:11,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1013600.0, ans=0.125 2023-10-02 20:46:13,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:46:13,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:46:15,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:16,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:46:16,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 20:46:16,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1013600.0, ans=0.015 2023-10-02 20:46:19,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:21,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:46:25,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 20:46:27,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:46:27,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:28,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:30,052 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 20:46:31,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:46:31,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:46:33,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:46:33,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:46:33,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 20:46:36,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:36,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:46:38,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:38,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 20:46:40,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 20:46:40,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:42,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:46:43,556 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 20:46:43,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1013733.3333333334, ans=0.125 2023-10-02 20:46:46,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 20:46:46,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 20:46:49,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 20:46:50,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:46:52,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:46:52,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:46:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:46:56,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:56,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:56,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:46:58,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1013800.0, ans=0.125 2023-10-02 20:46:59,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:46:59,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:47:00,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 20:47:00,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 20:47:00,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 20:47:04,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:47:04,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:47:04,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:06,764 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.852e+02 2.083e+02 2.464e+02 2.851e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 20:47:06,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:47:06,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:47:08,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:08,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1013866.6666666666, ans=0.0 2023-10-02 20:47:09,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. 
Duration: 0.77775 2023-10-02 20:47:09,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:47:12,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:47:14,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 20:47:14,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:14,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:15,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:47:17,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:47:17,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:18,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:20,628 INFO [train.py:1046] (1/4) Epoch 29, batch 3350, loss[loss=0.1732, simple_loss=0.2463, pruned_loss=0.05001, over 23546.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2443, pruned_loss=0.04389, over 4709163.39 frames. 
], batch size: 256, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:47:23,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:47:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:47:28,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1013933.3333333334, ans=0.0 2023-10-02 20:47:29,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:30,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:47:32,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:33,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:47:35,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 20:47:36,794 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 20:47:36,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-02 20:47:41,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 20:47:42,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:47:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:47:43,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:47:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 20:47:43,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:43,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:47:44,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1014000.0, ans=0.2 2023-10-02 20:47:45,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:45,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1014000.0, ans=0.125 2023-10-02 20:47:48,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:48,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 20:47:49,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:47:54,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:47:56,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:57,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:47:57,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1014066.6666666666, ans=0.125 2023-10-02 20:48:00,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:48:01,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:48:03,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:48:03,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:06,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:08,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-10-02 20:48:08,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:48:08,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 20:48:10,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:48:10,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 20:48:11,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:48:14,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1014133.3333333334, ans=0.2 2023-10-02 20:48:19,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:20,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 20:48:20,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:48:20,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1014200.0, ans=0.125 2023-10-02 20:48:21,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 20:48:22,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:48:26,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:48:28,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 20:48:28,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:48:30,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:48:30,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:31,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 20:48:31,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:31,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 20:48:33,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:48:34,762 INFO [train.py:1046] (1/4) Epoch 29, batch 3400, loss[loss=0.1507, simple_loss=0.2386, pruned_loss=0.03141, over 24659.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.245, pruned_loss=0.04421, over 4709495.44 frames. 
], batch size: 68, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:48:34,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:48:34,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:48:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:48:36,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 20:48:37,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1014266.6666666666, ans=0.0 2023-10-02 20:48:39,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1014266.6666666666, ans=0.125 2023-10-02 20:48:40,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 20:48:41,781 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 20:48:41,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:48:44,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:48:44,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:48:44,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 20:48:46,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:48:49,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1014333.3333333334, ans=0.125 2023-10-02 20:48:50,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1014333.3333333334, ans=0.125 2023-10-02 20:48:54,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:48:55,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 20:48:59,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:49:01,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:49:01,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:49:02,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:49:08,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:49:11,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-10-02 20:49:14,514 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:49:18,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:49:20,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:49:20,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 20:49:21,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:49:21,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:49:21,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:49:21,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:49:26,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:49:29,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:49:29,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 20:49:31,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1014466.6666666666, ans=0.125 2023-10-02 20:49:33,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:49:35,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.869e+02 2.116e+02 2.414e+02 3.746e+02, threshold=4.233e+02, percent-clipped=0.0 2023-10-02 20:49:35,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 20:49:37,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.24 vs. limit=15.0 2023-10-02 20:49:39,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:49:42,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 20:49:45,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 20:49:46,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:49:47,901 INFO [train.py:1046] (1/4) Epoch 29, batch 3450, loss[loss=0.1626, simple_loss=0.2478, pruned_loss=0.03868, over 24547.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.244, pruned_loss=0.04413, over 4708038.26 frames. ], batch size: 71, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:49:48,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 20:49:49,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 20:49:51,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:49:54,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:50:01,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:50:01,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:03,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:50:03,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:04,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:09,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 20:50:09,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1014666.6666666666, ans=0.0 2023-10-02 20:50:13,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1014666.6666666666, ans=0.035 2023-10-02 20:50:14,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. 
Duration: 0.78 2023-10-02 20:50:14,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:50:16,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:50:16,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1014733.3333333334, ans=0.0 2023-10-02 20:50:17,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:23,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 20:50:25,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:50:30,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:50:30,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:50:31,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:50:33,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:50:34,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 20:50:34,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 20:50:34,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:35,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1014800.0, ans=0.1 2023-10-02 20:50:37,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:50:41,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 20:50:44,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:50:48,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:50:50,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:53,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:50:53,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1014866.6666666666, ans=0.04949747468305833 2023-10-02 20:50:57,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:57,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:50:58,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:50:59,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:51:02,655 INFO [train.py:1046] (1/4) Epoch 29, batch 3500, loss[loss=0.1547, simple_loss=0.2336, pruned_loss=0.03789, over 23310.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2429, pruned_loss=0.04373, over 4715720.77 frames. ], batch size: 105, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:51:03,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1014933.3333333334, ans=0.125 2023-10-02 20:51:04,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:51:06,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:51:08,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 20:51:10,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:51:12,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 20:51:15,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:51:15,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-10-02 20:51:16,152 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.38 vs. limit=22.5 2023-10-02 20:51:21,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:51:21,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:51:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:51:22,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:51:22,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:51:22,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:22,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1015000.0, ans=0.0 2023-10-02 20:51:24,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:51:24,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 20:51:27,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:27,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 20:51:28,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:51:32,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:33,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 20:51:33,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:51:36,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:51:39,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:51:40,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:42,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:51:42,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:51:45,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 20:51:45,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 20:51:45,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-10-02 20:51:46,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:51:48,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:51:48,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:51:52,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:51:52,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:51:55,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:51:57,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 20:51:57,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 20:51:57,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:51:58,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:51:59,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 20:52:01,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:03,219 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.823e+02 1.998e+02 2.225e+02 3.593e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 20:52:05,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 20:52:06,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:52:07,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:52:09,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 20:52:11,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 20:52:13,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:13,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1015200.0, ans=0.125 2023-10-02 20:52:14,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:52:14,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:14,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 20:52:16,036 INFO [train.py:1046] (1/4) Epoch 29, batch 3550, loss[loss=0.1606, simple_loss=0.2404, pruned_loss=0.04039, over 24658.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2427, pruned_loss=0.04316, over 4724764.77 frames. ], batch size: 65, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:52:18,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:52:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:29,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 20:52:32,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:52:32,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:52:34,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:35,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:52:35,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:52:39,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:52:39,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 20:52:40,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:40,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:52:40,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:52:46,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:52:46,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:52:47,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:52:47,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:48,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:52:48,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 20:52:48,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:50,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:51,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-02 20:52:53,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1015400.0, ans=0.0 2023-10-02 20:52:56,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:57,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:52:59,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:00,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 20:53:01,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:53:03,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 20:53:03,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:53:05,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:53:05,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:53:08,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 20:53:11,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:53:15,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:53:16,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 20:53:17,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:20,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:53:22,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 20:53:28,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1015600.0, ans=0.125 2023-10-02 20:53:29,816 INFO [train.py:1046] (1/4) Epoch 29, batch 3600, loss[loss=0.1782, simple_loss=0.2484, pruned_loss=0.05395, over 23887.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2422, pruned_loss=0.04317, over 4718051.60 frames. ], batch size: 195, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 20:53:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 20:53:29,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:53:31,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:53:32,011 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.05 vs. 
limit=15.0 2023-10-02 20:53:34,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:34,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:36,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:53:38,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:53:39,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:53:42,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:53:42,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:42,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 20:53:42,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1015600.0, ans=0.0 2023-10-02 20:53:46,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:53:47,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:53:50,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:53:52,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:53:53,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:53:54,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:53:54,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 20:53:56,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:53:59,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:54:01,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:54:03,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:04,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:54:04,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:54:06,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-10-02 20:54:10,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1015733.3333333334, ans=0.125 2023-10-02 20:54:12,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:54:12,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:54:13,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 20:54:15,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1015800.0, ans=0.015 2023-10-02 20:54:18,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:54:23,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:26,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:54:31,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:54:31,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. 
Duration: 0.78 2023-10-02 20:54:31,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1015866.6666666666, ans=0.0 2023-10-02 20:54:32,622 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.883e+02 2.051e+02 2.269e+02 3.379e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 20:54:34,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 20:54:36,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 20:54:38,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.06 vs. limit=15.0 2023-10-02 20:54:38,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:54:38,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:54:38,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1015866.6666666666, ans=0.125 2023-10-02 20:54:40,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 20:54:41,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:54:41,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 20:54:41,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:54:41,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 20:54:43,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 20:54:44,815 INFO [train.py:1046] (1/4) Epoch 29, batch 3650, loss[loss=0.1747, simple_loss=0.2452, pruned_loss=0.05208, over 23794.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2429, pruned_loss=0.04327, over 4723853.12 frames. ], batch size: 179, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 20:54:46,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:46,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 20:54:46,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1015933.3333333334, ans=0.0 2023-10-02 20:54:50,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 20:54:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:54:53,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1015933.3333333334, ans=0.125 2023-10-02 20:54:56,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. 
Duration: 0.61 2023-10-02 20:54:57,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 20:55:02,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:02,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:55:03,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:55:05,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1016000.0, ans=0.1 2023-10-02 20:55:06,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:55:08,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:55:08,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 20:55:10,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:55:10,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:55:10,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-10-02 20:55:10,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1016000.0, ans=0.0 2023-10-02 20:55:11,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:55:13,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:55:13,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:15,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:55:16,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 20:55:17,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 20:55:17,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:55:19,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 20:55:20,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:55:20,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:55:26,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 20:55:27,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:27,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:55:30,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:55:30,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:55:33,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:55:37,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:55:37,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1016133.3333333334, ans=0.0 2023-10-02 20:55:38,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:38,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:55:40,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:55:41,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:42,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:55:48,898 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 20:55:49,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1016200.0, ans=0.125 2023-10-02 20:55:52,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:55:52,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:55:54,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:55:54,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:55:55,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:55:58,311 INFO [train.py:1046] (1/4) Epoch 29, batch 3700, loss[loss=0.144, simple_loss=0.2216, pruned_loss=0.03321, over 24595.00 frames. ], tot_loss[loss=0.166, simple_loss=0.244, pruned_loss=0.04396, over 4725585.44 frames. ], batch size: 60, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:55:58,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:59,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 20:55:59,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 20:56:02,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:56:03,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:56:04,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:56:05,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:56:05,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 20:56:05,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:56:07,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 20:56:07,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:56:11,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:56:12,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:56:13,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:15,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 20:56:15,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:56:16,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:56:16,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1016333.3333333334, ans=0.2 2023-10-02 20:56:19,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:20,993 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 20:56:26,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1016400.0, ans=0.125 2023-10-02 20:56:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:56:29,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:56:30,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:56:30,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 20:56:30,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:56:33,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:56:35,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 20:56:36,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.10 vs. limit=22.5 2023-10-02 20:56:36,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:37,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:56:37,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1016400.0, ans=10.0 2023-10-02 20:56:39,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1016400.0, ans=0.2 2023-10-02 20:56:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:40,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:56:42,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 20:56:42,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1016466.6666666666, ans=0.125 2023-10-02 20:56:46,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 20:56:46,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 20:56:46,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:46,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 20:56:50,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:56:50,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:56:53,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:56:55,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 20:56:57,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:56:59,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:56:59,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:56:59,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 20:57:01,747 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.813e+02 1.991e+02 2.176e+02 3.248e+02, threshold=3.981e+02, percent-clipped=0.0 2023-10-02 20:57:03,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:57:03,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 20:57:03,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.20 vs. limit=22.5 2023-10-02 20:57:05,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 20:57:05,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:57:05,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:08,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:57:09,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:57:12,504 INFO [train.py:1046] (1/4) Epoch 29, batch 3750, loss[loss=0.1753, simple_loss=0.2643, pruned_loss=0.04314, over 24614.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2444, pruned_loss=0.04348, over 4730870.47 frames. ], batch size: 68, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:57:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 20:57:12,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:57:14,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:57:17,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 20:57:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 20:57:22,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:57:22,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 20:57:22,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:57:23,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:26,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:27,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:57:29,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1016666.6666666666, ans=0.125 2023-10-02 20:57:30,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 20:57:31,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:57:34,133 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.37 vs. limit=5.0 2023-10-02 20:57:34,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:57:35,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:57:39,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:57:41,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 20:57:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:57:42,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:57:42,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1016733.3333333334, ans=0.1 2023-10-02 20:57:43,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:57:47,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 20:57:49,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. 
Duration: 0.82 2023-10-02 20:57:51,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:57:53,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:57:54,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:57:56,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1016800.0, ans=0.2 2023-10-02 20:57:57,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:57:57,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1016800.0, ans=0.125 2023-10-02 20:57:58,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:58:01,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 20:58:04,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:07,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:58:07,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:58:11,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 20:58:16,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:58:16,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1016866.6666666666, ans=22.5 2023-10-02 20:58:17,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:58:18,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:58:20,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:58:23,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:58:27,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=12.0 2023-10-02 20:58:27,706 INFO [train.py:1046] (1/4) Epoch 29, batch 3800, loss[loss=0.1675, simple_loss=0.2405, pruned_loss=0.04724, over 23873.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2446, pruned_loss=0.04378, over 4707399.84 frames. ], batch size: 195, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:58:29,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:58:33,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:58:33,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 20:58:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 20:58:35,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1016933.3333333334, ans=0.125 2023-10-02 20:58:36,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:39,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:58:40,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:58:44,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 20:58:44,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:58:44,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:58:45,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:45,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:58:47,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:58:48,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-10-02 20:58:50,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 20:58:51,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:58:54,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:58:54,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1017000.0, ans=0.0 2023-10-02 20:58:56,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:58:57,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 20:58:57,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:58:57,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:59:00,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:01,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:59:06,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:59:06,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. 
Duration: 0.7 2023-10-02 20:59:07,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:59:15,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:59:19,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:59:22,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 20:59:24,764 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.72 vs. limit=6.0 2023-10-02 20:59:25,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 20:59:26,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:59:29,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:59:29,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:30,958 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.830e+02 2.115e+02 2.392e+02 3.412e+02, threshold=4.230e+02, percent-clipped=0.0 2023-10-02 20:59:31,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-10-02 20:59:32,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1017200.0, ans=0.0 2023-10-02 20:59:33,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 20:59:33,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 20:59:33,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:35,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:59:38,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1017200.0, ans=0.2 2023-10-02 20:59:41,234 INFO [train.py:1046] (1/4) Epoch 29, batch 3850, loss[loss=0.1478, simple_loss=0.1983, pruned_loss=0.04868, over 19633.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2425, pruned_loss=0.04355, over 4697126.89 frames. ], batch size: 388, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:59:41,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:59:43,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:59:48,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:59:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. 
Duration: 0.81 2023-10-02 20:59:50,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:59:52,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:52,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-02 20:59:54,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:59:56,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:59:59,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:59:59,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 21:00:05,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:06,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:00:08,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:08,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 21:00:13,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.whiten.whitening_limit, batch_count=1017400.0, ans=15.0 2023-10-02 21:00:13,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:14,463 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.07 vs. limit=22.5 2023-10-02 21:00:15,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:00:15,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:15,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:00:16,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:17,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1017400.0, ans=0.025 2023-10-02 21:00:18,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:19,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:19,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 21:00:19,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 21:00:21,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 21:00:21,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:22,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:23,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:23,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:23,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 21:00:27,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 21:00:28,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:31,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 21:00:33,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 21:00:38,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:00:38,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:43,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:44,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 21:00:46,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 21:00:46,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1017533.3333333334, ans=0.0 2023-10-02 21:00:46,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1017533.3333333334, ans=0.1 2023-10-02 21:00:48,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1017533.3333333334, ans=0.125 2023-10-02 21:00:49,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:49,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:52,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:00:52,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:00:52,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:00:54,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:54,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:00:54,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 21:00:55,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:56,969 INFO [train.py:1046] (1/4) Epoch 29, batch 3900, loss[loss=0.1699, simple_loss=0.2544, pruned_loss=0.04272, over 23801.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2421, pruned_loss=0.04334, over 4708105.74 frames. ], batch size: 85, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:00:57,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 21:00:57,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:57,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:58,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:00:59,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.93 vs. limit=15.0 2023-10-02 21:00:59,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:00:59,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1017600.0, ans=0.125 2023-10-02 21:01:01,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:01:01,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:01:01,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:01:02,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:01:02,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 21:01:03,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:06,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:01:06,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:01:06,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:01:08,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:01:10,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 21:01:10,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:01:14,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 21:01:14,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:01:16,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 21:01:17,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:19,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 21:01:19,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 21:01:22,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1017666.6666666666, ans=0.125 2023-10-02 21:01:23,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:01:23,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1017666.6666666666, ans=10.0 2023-10-02 21:01:25,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:01:25,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:01:26,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:01:28,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.whiten.whitening_limit, batch_count=1017733.3333333334, ans=12.0 2023-10-02 21:01:32,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:01:33,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:01:35,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1017733.3333333334, ans=0.125 2023-10-02 21:01:36,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:01:36,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:01:37,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:01:41,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:01:42,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 21:01:48,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1017800.0, ans=0.125 2023-10-02 21:01:49,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:01:52,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:01:57,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1017866.6666666666, ans=0.125 2023-10-02 21:02:00,140 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.908e+02 2.090e+02 2.279e+02 3.319e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 21:02:01,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:02:04,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:02:04,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 21:02:04,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 21:02:04,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:02:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 21:02:07,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 21:02:07,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 21:02:10,061 INFO [train.py:1046] (1/4) Epoch 29, batch 3950, loss[loss=0.1645, simple_loss=0.2446, pruned_loss=0.04225, over 23273.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2421, pruned_loss=0.04329, over 4721389.93 frames. ], batch size: 93, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:02:13,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1017933.3333333334, ans=0.125 2023-10-02 21:02:14,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:02:14,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 21:02:15,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:02:18,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:02:20,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:02:26,550 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 21:02:26,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:02:26,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. 
Duration: 0.72 2023-10-02 21:02:28,050 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 21:02:28,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:02:30,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:02:31,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:02:31,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:02:32,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 21:02:35,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:02:35,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:02:35,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:02:36,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:02:37,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 21:02:38,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1018066.6666666666, ans=0.125 2023-10-02 21:02:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:02:48,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:02:54,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 21:02:59,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 21:02:59,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 21:03:00,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:03:02,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:03:07,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:03:07,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:03:09,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:03:09,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 21:03:10,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 21:03:17,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:03:17,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:03:20,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 21:03:23,930 INFO [train.py:1046] (1/4) Epoch 29, batch 4000, loss[loss=0.1639, simple_loss=0.2573, pruned_loss=0.03522, over 24682.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2428, pruned_loss=0.0435, over 4718160.75 frames. ], batch size: 73, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 21:03:28,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:37,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:38,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1018333.3333333334, ans=10.0 2023-10-02 21:03:41,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:03:41,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:03:43,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 21:03:43,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 21:03:43,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:03:44,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 21:03:44,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:03:46,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 21:03:47,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:03:50,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:03:50,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:03:50,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:03:50,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:03:50,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:03:52,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 21:03:53,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 21:03:55,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:03:55,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:03:58,547 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 21:03:59,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:03:59,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:04:01,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1018400.0, ans=0.1 2023-10-02 21:04:07,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 21:04:07,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:04:10,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:04:10,109 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 21:04:12,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 21:04:13,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 21:04:13,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:04:14,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:04:16,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:04:17,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:04:19,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:04:19,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:04:20,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 21:04:20,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:04:22,008 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 21:04:25,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 21:04:26,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1018533.3333333334, ans=0.125 2023-10-02 21:04:28,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 21:04:30,111 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.895e+02 2.160e+02 2.522e+02 3.518e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-02 21:04:30,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:04:30,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:04:30,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:04:33,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:04:37,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:04:38,760 INFO [train.py:1046] (1/4) Epoch 29, batch 4050, loss[loss=0.1726, simple_loss=0.2512, pruned_loss=0.04703, over 23307.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2439, pruned_loss=0.0434, over 4710037.56 frames. ], batch size: 105, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:04:40,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:04:41,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. 
Duration: 0.72 2023-10-02 21:04:43,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:04:43,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:04:43,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:04:45,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:04:46,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:04:50,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:04:52,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:04:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 21:04:55,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:04:55,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:04:58,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:05:01,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-10-02 21:05:04,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 21:05:07,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 21:05:07,568 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 21:05:10,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:05:16,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 21:05:17,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:05:19,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:05:22,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:05:24,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:05:24,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:05:26,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:05:27,366 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.56 vs. 
limit=15.0 2023-10-02 21:05:29,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 21:05:29,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:05:31,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:05:32,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 21:05:37,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:05:37,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1018866.6666666666, ans=0.0 2023-10-02 21:05:45,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 21:05:45,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:05:45,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:05:49,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 21:05:49,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 21:05:49,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:05:50,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:05:52,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:05:52,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:05:53,894 INFO [train.py:1046] (1/4) Epoch 29, batch 4100, loss[loss=0.1641, simple_loss=0.2547, pruned_loss=0.03672, over 24450.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2445, pruned_loss=0.04379, over 4709945.72 frames. ], batch size: 69, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:05:58,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1018933.3333333334, ans=10.0 2023-10-02 21:05:58,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1018933.3333333334, ans=0.0 2023-10-02 21:05:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 21:05:59,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 21:06:01,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1018933.3333333334, ans=0.2 2023-10-02 21:06:02,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 21:06:05,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-10-02 21:06:05,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:05,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:05,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:06:07,111 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 21:06:07,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1019000.0, ans=0.125 2023-10-02 21:06:07,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.80 vs. limit=22.5 2023-10-02 21:06:11,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:06:11,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:06:13,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:13,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:06:14,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 21:06:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:06:17,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:06:17,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 21:06:19,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:19,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:06:19,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:06:19,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:06:20,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 21:06:22,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:06:24,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 21:06:25,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:06:27,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 21:06:27,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 21:06:28,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:06:28,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:06:28,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:06:31,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1019066.6666666666, ans=0.0 2023-10-02 21:06:32,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 21:06:32,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:06:32,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:06:35,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 21:06:35,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:35,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:06:39,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:06:39,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1019133.3333333334, ans=0.125 2023-10-02 21:06:43,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:06:46,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:06:47,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:56,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:06:56,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:07:00,154 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.365e+02 1.918e+02 2.139e+02 2.602e+02 3.703e+02, threshold=4.278e+02, percent-clipped=0.0 2023-10-02 21:07:00,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:07:01,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:07:06,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:07:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 21:07:08,936 INFO [train.py:1046] (1/4) Epoch 29, batch 4150, loss[loss=0.1615, simple_loss=0.2429, pruned_loss=0.04007, over 24681.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2449, pruned_loss=0.04422, over 4699370.46 frames. ], batch size: 65, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:07:08,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:07:08,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:07:10,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 21:07:12,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:07:12,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 21:07:12,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 21:07:12,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 21:07:12,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1019266.6666666666, ans=0.125 2023-10-02 21:07:13,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:07:15,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1019266.6666666666, ans=0.0 2023-10-02 21:07:19,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:07:19,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:07:24,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:07:25,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:07:25,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:07:28,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:07:28,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:07:29,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:07:30,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1019333.3333333334, ans=0.125 2023-10-02 21:07:32,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 21:07:38,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:07:38,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 21:07:42,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 21:07:42,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:07:42,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 21:07:42,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:07:42,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:07:45,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:07:46,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:07:50,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 21:07:53,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:07:55,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 21:07:57,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 21:07:57,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:07:58,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 21:08:01,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1019466.6666666666, ans=0.125 2023-10-02 21:08:02,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:08:02,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:08:03,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:05,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 21:08:05,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:05,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:08:06,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 21:08:06,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1019533.3333333334, ans=0.2 2023-10-02 21:08:09,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 21:08:10,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.29 vs. limit=10.0 2023-10-02 21:08:11,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:11,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:08:11,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:08:11,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 21:08:13,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:08:13,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:08:14,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:08:16,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:16,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. 
Duration: 0.72 2023-10-02 21:08:16,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:08:21,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:08:22,982 INFO [train.py:1046] (1/4) Epoch 29, batch 4200, loss[loss=0.1641, simple_loss=0.2213, pruned_loss=0.05346, over 22721.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.244, pruned_loss=0.04389, over 4704036.01 frames. ], batch size: 322, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:08:23,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 21:08:23,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:08:27,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:08:28,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:08:28,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:08:28,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:08:29,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 21:08:33,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. 
Duration: 0.84 2023-10-02 21:08:34,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:36,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:08:40,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:08:43,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:08:44,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:08:44,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:46,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 21:08:46,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:08:47,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:47,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:08:47,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:08:49,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 21:08:51,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 21:08:51,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:55,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:08:55,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1019733.3333333334, ans=0.125 2023-10-02 21:08:56,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:08:58,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:08:59,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:09:02,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:09:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 21:09:04,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:09:05,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 21:09:07,232 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:09:09,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:09:12,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1019800.0, ans=15.0 2023-10-02 21:09:13,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:09:18,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:09:21,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 21:09:23,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:09:24,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1019866.6666666666, ans=0.125 2023-10-02 21:09:27,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1019866.6666666666, ans=0.0 2023-10-02 21:09:28,929 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.836e+02 2.050e+02 2.312e+02 3.476e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 21:09:29,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 21:09:29,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:31,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 21:09:36,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:09:38,218 INFO [train.py:1046] (1/4) Epoch 29, batch 4250, loss[loss=0.1711, simple_loss=0.2422, pruned_loss=0.04999, over 23786.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2437, pruned_loss=0.04368, over 4718814.84 frames. ], batch size: 179, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:09:39,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:09:39,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:09:42,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:47,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:09:47,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 21:09:48,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:09:50,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:09:50,759 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.59 vs. limit=10.0 2023-10-02 21:09:51,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:09:57,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:09:57,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:09:59,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:09:59,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:10:01,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:02,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:04,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:05,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:10:08,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:10:10,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 21:10:13,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 21:10:13,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:14,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:10:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:15,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:10:15,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:17,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:20,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:10:21,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:10:24,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1020133.3333333334, ans=0.1 2023-10-02 21:10:25,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:10:28,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:28,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 21:10:28,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:10:30,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 21:10:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:10:31,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:10:35,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:35,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:10:37,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 21:10:39,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:10:40,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:10:43,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:10:47,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:48,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:10:49,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:10:49,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:10:51,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:10:52,706 INFO [train.py:1046] (1/4) Epoch 29, batch 4300, loss[loss=0.1655, simple_loss=0.2369, pruned_loss=0.04701, over 20844.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2427, pruned_loss=0.04325, over 4727210.47 frames. ], batch size: 45, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:10:52,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:10:52,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 21:10:54,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-10-02 21:10:55,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:10:58,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 21:10:58,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:11:04,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:11:11,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:11:11,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 21:11:13,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:11:14,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:11:14,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:11:14,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 21:11:17,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:11:19,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:11:22,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 21:11:22,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 21:11:22,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 21:11:23,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:11:25,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:11:29,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:11:29,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:11:31,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:11:32,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:11:32,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:11:32,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 21:11:34,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 21:11:36,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:11:37,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:11:37,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:11:37,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:38,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:11:38,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 21:11:38,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 21:11:38,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 21:11:40,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:11:40,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 21:11:40,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 21:11:44,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:11:46,168 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 21:11:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 21:11:49,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:11:49,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:11:50,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 21:11:52,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:11:52,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:53,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:11:53,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:11:53,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:11:55,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:11:56,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 21:11:57,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.918e+02 2.229e+02 2.683e+02 3.939e+02, threshold=4.458e+02, percent-clipped=0.0 2023-10-02 21:11:57,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1020533.3333333334, ans=0.2 2023-10-02 21:11:59,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:59,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:12:04,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 21:12:04,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1020533.3333333334, ans=0.035 2023-10-02 21:12:05,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:12:07,101 INFO [train.py:1046] (1/4) Epoch 29, batch 4350, loss[loss=0.1713, simple_loss=0.2561, pruned_loss=0.04327, over 24544.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2435, pruned_loss=0.0436, over 4722905.17 frames. ], batch size: 71, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:12:08,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:11,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:12:14,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 21:12:14,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:12:19,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:12:21,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.12 vs. limit=22.5 2023-10-02 21:12:22,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:12:26,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:12:26,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:12:28,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:12:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:12:32,126 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.32 vs. limit=15.0 2023-10-02 21:12:33,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:12:37,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-10-02 21:12:38,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:39,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:12:43,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:12:45,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 21:12:46,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=15.0 2023-10-02 21:12:48,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:12:50,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:12:54,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 21:12:56,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:12:56,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:12:57,789 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 21:12:57,858 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-10-02 21:12:57,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:12:57,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:59,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:12:59,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1020800.0, ans=0.125 2023-10-02 21:13:01,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:01,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:13:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:13:05,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 21:13:05,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:05,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:13:05,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:05,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. 
Duration: 0.99 2023-10-02 21:13:07,307 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 21:13:07,311 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 21:13:07,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 21:13:11,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:13:11,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:13:11,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:12,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:13:14,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 21:13:17,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 21:13:17,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:21,808 INFO [train.py:1046] (1/4) Epoch 29, batch 4400, loss[loss=0.1755, simple_loss=0.252, pruned_loss=0.04957, over 23404.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2442, pruned_loss=0.04364, over 4732968.60 frames. ], batch size: 106, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 21:13:21,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:13:21,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:23,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:13:25,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 21:13:26,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 21:13:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 21:13:27,358 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 21:13:27,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.92 vs. limit=15.0 2023-10-02 21:13:28,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:13:28,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:13:30,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 21:13:33,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:34,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:13:34,675 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 21:13:36,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1021000.0, ans=0.05 2023-10-02 21:13:37,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:37,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 21:13:39,189 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 21:13:41,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 21:13:41,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 21:13:42,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 21:13:42,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.63 vs. limit=15.0 2023-10-02 21:13:43,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:44,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:44,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:13:44,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:13:46,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 21:13:46,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 21:13:47,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:50,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:13:50,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:52,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:52,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:52,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 21:13:55,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 21:13:59,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:04,744 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.74 vs. 
limit=15.0 2023-10-02 21:14:05,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=12.0 2023-10-02 21:14:05,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:14:08,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 21:14:11,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:14:11,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:14:14,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:14:15,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 21:14:15,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:14:16,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:14:16,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:14:16,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:14:18,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-10-02 21:14:22,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 21:14:23,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 21:14:23,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:14:23,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 21:14:24,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:14:26,533 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.930e+02 2.073e+02 2.466e+02 3.633e+02, threshold=4.147e+02, percent-clipped=0.0 2023-10-02 21:14:28,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:14:29,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 21:14:29,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1021200.0, ans=0.0 2023-10-02 21:14:33,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:14:35,495 INFO [train.py:1046] (1/4) Epoch 29, batch 4450, loss[loss=0.1586, simple_loss=0.2522, pruned_loss=0.03249, over 24296.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2451, pruned_loss=0.04444, over 4716227.05 frames. 
], batch size: 74, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:14:35,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:37,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:14:37,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1021266.6666666666, ans=0.0 2023-10-02 21:14:41,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:14:41,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:14:46,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:48,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:14:52,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1021333.3333333334, ans=0.125 2023-10-02 21:14:53,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:14:53,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:14:54,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-10-02 21:14:54,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:14:54,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:54,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:14:54,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:14:57,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:15:02,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:02,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:03,141 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.27 vs. limit=15.0 2023-10-02 21:15:05,242 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.48 vs. limit=15.0 2023-10-02 21:15:06,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:15:06,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:15:06,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:15:10,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 21:15:10,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 21:15:10,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 21:15:10,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:15:14,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:15:17,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 21:15:20,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:15:22,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:24,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 21:15:26,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:15:26,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:15:26,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:15:26,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:15:27,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:31,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:15:33,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 21:15:34,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:15:36,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:15:37,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:15:39,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:15:39,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:15:42,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-10-02 21:15:44,813 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.51 vs. limit=22.5 2023-10-02 21:15:45,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 21:15:46,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:15:49,637 INFO [train.py:1046] (1/4) Epoch 29, batch 4500, loss[loss=0.1397, simple_loss=0.2216, pruned_loss=0.02891, over 17729.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2453, pruned_loss=0.04428, over 4720826.25 frames. ], batch size: 38, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:15:52,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:15:53,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 21:15:53,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 21:15:55,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:16:00,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:16:01,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:16:01,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 21:16:02,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:16:02,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:04,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:16,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:16:16,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:16:19,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:16:19,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:16:20,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:16:26,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:16:31,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:16:34,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-02 21:16:36,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1021800.0, ans=0.125 2023-10-02 21:16:37,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:16:38,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 21:16:39,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:39,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:16:40,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:16:40,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:16:42,685 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=15.0 2023-10-02 21:16:44,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:44,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 21:16:44,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 21:16:44,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:50,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:16:50,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:16:53,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:54,476 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.911e+02 2.155e+02 2.344e+02 3.258e+02, threshold=4.309e+02, percent-clipped=0.0 2023-10-02 21:16:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:16:54,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:16:56,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 21:16:58,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 21:16:58,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 21:17:02,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 21:17:03,823 INFO [train.py:1046] (1/4) Epoch 29, batch 4550, loss[loss=0.1568, simple_loss=0.2186, pruned_loss=0.04755, over 23651.00 frames. 
], tot_loss[loss=0.1659, simple_loss=0.244, pruned_loss=0.04387, over 4732688.76 frames. ], batch size: 256, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:17:03,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 21:17:05,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:17:06,112 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.65 vs. limit=22.5 2023-10-02 21:17:07,447 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=15.0 2023-10-02 21:17:09,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:17:10,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:17:12,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:18,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:17:18,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:17:19,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:19,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 21:17:19,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:19,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1022000.0, ans=0.1 2023-10-02 21:17:22,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1022000.0, ans=0.2 2023-10-02 21:17:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:23,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:17:26,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:17:29,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 21:17:29,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 21:17:31,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:17:33,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 21:17:35,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 21:17:37,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:17:40,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 21:17:41,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:17:44,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:45,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:46,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:17:47,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 21:17:48,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1022133.3333333334, ans=10.0 2023-10-02 21:17:49,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:17:52,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:52,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:17:52,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:54,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. 
Duration: 0.74 2023-10-02 21:17:54,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 21:17:54,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:17:55,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 21:17:56,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 21:17:58,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:58,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:59,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:17:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:59,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:18:01,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:18:02,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 21:18:05,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:18:05,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 21:18:05,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 21:18:05,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:18:05,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 21:18:08,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:18:08,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:18:08,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1022200.0, ans=0.5 2023-10-02 21:18:10,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:18:11,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:18:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:18:15,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:18:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 21:18:17,925 INFO [train.py:1046] (1/4) Epoch 29, batch 4600, loss[loss=0.1776, simple_loss=0.2498, pruned_loss=0.05276, over 23821.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2425, pruned_loss=0.0432, over 4726815.36 frames. ], batch size: 179, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:18:19,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:19,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:18:19,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1022266.6666666666, ans=0.2 2023-10-02 21:18:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:18:22,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:18:22,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1022266.6666666666, ans=0.1 2023-10-02 21:18:23,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:24,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 21:18:26,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:18:30,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.58 vs. 
limit=15.0 2023-10-02 21:18:30,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:18:32,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:34,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 21:18:44,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:45,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:49,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:18:49,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:49,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1022400.0, ans=0.125 2023-10-02 21:18:53,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 21:18:53,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 21:18:53,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1022400.0, ans=0.125 2023-10-02 21:18:54,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:18:59,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:00,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:19:02,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:19:06,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 21:19:07,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:19:09,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1022466.6666666666, ans=0.0 2023-10-02 21:19:11,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:12,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:19:17,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:19:17,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 21:19:17,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:17,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 21:19:17,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:19,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:20,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:20,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:19:21,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-10-02 21:19:22,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:23,160 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.896e+02 2.092e+02 2.608e+02 4.694e+02, threshold=4.185e+02, percent-clipped=1.0 2023-10-02 21:19:23,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. 
Duration: 0.87 2023-10-02 21:19:23,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 21:19:23,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 21:19:23,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:24,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:19:26,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:27,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:31,865 INFO [train.py:1046] (1/4) Epoch 29, batch 4650, loss[loss=0.1629, simple_loss=0.2523, pruned_loss=0.03679, over 24450.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2429, pruned_loss=0.0428, over 4737264.00 frames. ], batch size: 69, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:19:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:19:40,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:19:41,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:41,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 21:19:43,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:43,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:19:43,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:46,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 21:19:49,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:19:51,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 21:19:51,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:19:53,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 21:19:53,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:19:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 21:19:54,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 21:19:54,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:19:54,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:19:57,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:19:59,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:00,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 21:20:02,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:02,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1022733.3333333334, ans=0.1 2023-10-02 21:20:03,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 21:20:07,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:07,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:20:07,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 21:20:08,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:20:11,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 21:20:14,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:18,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:21,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:22,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:22,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:20:26,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 21:20:26,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 21:20:26,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 21:20:26,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 21:20:29,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:20:35,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:20:36,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:20:36,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 21:20:36,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:38,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:20:38,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:20:39,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:20:41,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1022866.6666666666, ans=0.1 2023-10-02 21:20:42,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:20:42,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:20:43,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:45,713 INFO [train.py:1046] (1/4) Epoch 29, batch 4700, loss[loss=0.1421, simple_loss=0.2184, pruned_loss=0.03293, over 24348.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2431, pruned_loss=0.04313, over 4741616.04 frames. 
], batch size: 56, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:20:46,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1022933.3333333334, ans=0.125 2023-10-02 21:20:47,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:20:49,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:20:49,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:20:50,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 21:20:51,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:20:51,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 21:20:59,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:59,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:21:01,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:01,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:21:02,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:21:07,842 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-10-02 21:21:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 21:21:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 21:21:11,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:11,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:21:12,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:21:14,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:20,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:21:21,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:21:25,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:21:29,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 21:21:29,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:21:30,155 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.83 vs. limit=15.0 2023-10-02 21:21:33,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:36,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 21:21:38,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:21:43,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:21:43,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 21:21:44,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:44,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:48,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:48,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 21:21:48,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 21:21:50,423 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 21:21:51,670 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.909e+02 2.162e+02 2.551e+02 3.062e+02, threshold=4.324e+02, percent-clipped=0.0 2023-10-02 21:21:51,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:55,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:55,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:55,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 21:21:55,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:59,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 21:22:00,439 INFO [train.py:1046] (1/4) Epoch 29, batch 4750, loss[loss=0.1726, simple_loss=0.2467, pruned_loss=0.04926, over 23298.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2439, pruned_loss=0.04355, over 4727441.66 frames. ], batch size: 105, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:22:01,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:22:01,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:06,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:06,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:22:07,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 21:22:09,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:11,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 21:22:13,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:22:14,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:22:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:22:18,043 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:22:20,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 21:22:25,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 21:22:27,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 21:22:28,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:22:32,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:22:32,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:22:32,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:33,906 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 21:22:33,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 21:22:38,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 21:22:41,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:42,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:22:45,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:22:45,602 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. 
Duration: 0.512 2023-10-02 21:22:45,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:22:46,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.96 vs. limit=22.5 2023-10-02 21:22:47,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:22:48,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:22:50,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 21:22:50,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 21:22:52,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:52,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:22:52,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:55,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:22:55,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 21:22:58,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. 
Duration: 0.65 2023-10-02 21:22:59,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:01,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:23:01,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 21:23:02,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:23:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:05,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:23:05,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:06,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1023533.3333333334, ans=0.125 2023-10-02 21:23:07,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:23:10,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:11,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 21:23:11,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. 
Duration: 0.96 2023-10-02 21:23:11,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 21:23:14,251 INFO [train.py:1046] (1/4) Epoch 29, batch 4800, loss[loss=0.1677, simple_loss=0.2403, pruned_loss=0.04752, over 23346.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2446, pruned_loss=0.04373, over 4723513.38 frames. ], batch size: 119, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:23:15,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:23:15,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:17,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 21:23:20,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:27,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:23:28,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:23:28,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:29,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. 
Duration: 0.86 2023-10-02 21:23:30,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:23:30,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:23:31,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:23:34,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:23:37,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:37,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:23:37,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:37,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 21:23:37,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:38,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:23:41,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:44,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:23:44,886 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.48 vs. limit=22.5 2023-10-02 21:23:45,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:45,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:23:47,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:23:47,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:48,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 21:23:48,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 21:23:50,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:50,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:23:52,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:23:52,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:23:52,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 21:23:54,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:23:54,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.82 vs. limit=22.5 2023-10-02 21:23:55,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:23:58,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:59,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:02,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:07,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 21:24:07,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:24:08,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:08,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:24:09,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:24:14,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:24:14,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:24:14,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:15,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:24:16,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:24:16,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:24:21,851 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.889e+02 2.092e+02 2.309e+02 3.142e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 21:24:21,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:21,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:21,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:24:23,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 21:24:26,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. 
Duration: 0.69 2023-10-02 21:24:26,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:24:26,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:24:26,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:24:26,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:29,475 INFO [train.py:1046] (1/4) Epoch 29, batch 4850, loss[loss=0.1464, simple_loss=0.2285, pruned_loss=0.03219, over 24478.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2443, pruned_loss=0.04364, over 4718975.54 frames. ], batch size: 63, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:24:30,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:24:38,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 21:24:39,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:40,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=1023933.3333333334, ans=0.1 2023-10-02 21:24:43,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.83 vs. limit=15.0 2023-10-02 21:24:44,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 21:24:45,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:24:45,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:48,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:48,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:24:49,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:24:49,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 21:24:53,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:24:56,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:24:56,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:24:57,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:24:57,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 21:25:00,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 21:25:01,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:06,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 21:25:06,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 21:25:07,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:25:13,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:25:13,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 21:25:15,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:25:15,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:25:15,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:25:16,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 21:25:16,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:25:16,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1024133.3333333334, ans=0.125 2023-10-02 21:25:18,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 21:25:18,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:25:20,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:25:20,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 21:25:29,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1024200.0, ans=0.2 2023-10-02 21:25:29,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1024200.0, ans=0.125 2023-10-02 21:25:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:37,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:25:37,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:25:41,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 21:25:41,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 21:25:42,575 INFO [train.py:1046] (1/4) Epoch 29, batch 4900, loss[loss=0.1683, simple_loss=0.255, pruned_loss=0.04081, over 24294.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2429, pruned_loss=0.04326, over 4714563.15 frames. ], batch size: 74, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:25:42,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1024266.6666666666, ans=0.1 2023-10-02 21:25:45,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:25:46,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:25:48,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:25:51,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 21:25:58,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 21:26:03,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 21:26:04,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 21:26:04,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:26:04,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:26:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:26:05,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:26:05,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:26:05,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 21:26:06,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1024333.3333333334, ans=0.125 2023-10-02 21:26:08,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 21:26:08,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:26:10,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:26:12,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:26:13,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:26:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:26:16,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:16,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 21:26:17,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:26:18,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:26:19,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 21:26:19,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 21:26:19,598 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.33 vs. limit=10.0 2023-10-02 21:26:22,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1024400.0, ans=0.0 2023-10-02 21:26:23,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 21:26:25,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:26:25,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:26:26,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 21:26:26,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:26:26,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 21:26:26,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:26:28,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 21:26:29,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:29,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1024466.6666666666, ans=0.0 2023-10-02 21:26:31,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:26:32,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:26:37,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 21:26:37,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:26:39,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 21:26:39,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-10-02 21:26:41,345 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.48 vs. limit=15.0 2023-10-02 21:26:45,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:26:47,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:26:48,528 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.893e+02 2.056e+02 2.305e+02 3.001e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-02 21:26:48,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 21:26:48,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:26:48,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:26:50,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:50,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1024533.3333333334, ans=0.125 2023-10-02 21:26:54,162 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.18 vs. limit=15.0 2023-10-02 21:26:54,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:26:54,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:26:54,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:26:54,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 21:26:56,267 INFO [train.py:1046] (1/4) Epoch 29, batch 4950, loss[loss=0.1367, simple_loss=0.216, pruned_loss=0.02874, over 24443.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2417, pruned_loss=0.043, over 4723418.28 frames. ], batch size: 58, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:26:56,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:26:59,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:26:59,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:27:03,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 21:27:03,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 21:27:03,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:27:05,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. 
Duration: 0.78 2023-10-02 21:27:05,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:05,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:27:07,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:27:07,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:10,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:27:10,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:27:11,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:27:13,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:27:14,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:14,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:27:18,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:27:22,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:27:23,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:27:26,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:26,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:27,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:27:30,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 21:27:32,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 21:27:35,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:36,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:27:36,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:27:37,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:27:37,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:27:39,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 21:27:41,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:27:42,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:27:43,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.02 vs. limit=10.0 2023-10-02 21:27:45,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:27:45,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.65 vs. limit=15.0 2023-10-02 21:27:46,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:48,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:48,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 21:27:48,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:27:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:27:52,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:27:54,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:27:54,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:27:54,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:56,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:27:56,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:27:57,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:27:58,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:27:58,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:28:00,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 21:28:00,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1024866.6666666666, ans=0.0 2023-10-02 21:28:03,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:28:06,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1024866.6666666666, ans=0.125 2023-10-02 21:28:08,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 21:28:08,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 21:28:11,014 INFO [train.py:1046] (1/4) Epoch 29, batch 5000, loss[loss=0.1529, simple_loss=0.2361, pruned_loss=0.03486, over 23656.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2421, pruned_loss=0.04274, over 4734492.83 frames. ], batch size: 149, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:28:12,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1024933.3333333334, ans=0.0 2023-10-02 21:28:15,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:28:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:28:16,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 21:28:17,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 21:28:21,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:28:23,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-02 21:28:23,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:28:23,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:28:25,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 21:28:25,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:26,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:28:26,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 21:28:26,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:27,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:28:29,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 21:28:29,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 21:28:29,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:28:30,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-02 21:28:30,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:28:30,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:32,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:28:32,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 21:28:32,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 21:28:33,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 21:28:33,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:35,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:36,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 21:28:36,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:28:38,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:40,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:28:42,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 21:28:43,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 21:28:43,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:28:45,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:28:47,914 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 21:28:51,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:28:53,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:53,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:28:58,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 21:28:58,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:58,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:28:58,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:28:59,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 21:29:01,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:29:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:29:05,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:11,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 21:29:14,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:17,602 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.798e+02 1.947e+02 2.210e+02 2.765e+02, threshold=3.894e+02, percent-clipped=0.0 2023-10-02 21:29:25,082 INFO [train.py:1046] (1/4) Epoch 29, batch 5050, loss[loss=0.1818, simple_loss=0.2633, pruned_loss=0.05018, over 23952.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2423, pruned_loss=0.04283, over 4725826.28 frames. ], batch size: 86, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:29:25,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:29:26,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:26,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 21:29:26,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:29:26,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:29:26,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:29:28,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:30,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:30,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 21:29:32,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:29:33,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:29:35,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:29:35,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 21:29:35,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:29:35,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1025266.6666666666, ans=0.125 2023-10-02 21:29:36,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:29:39,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:29:41,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:29:41,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:29:43,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.47 vs. limit=15.0 2023-10-02 21:29:51,414 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.83 vs. limit=10.0 2023-10-02 21:29:53,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 21:29:53,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:29:53,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:29:54,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-02 21:29:54,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:29:56,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:29:56,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:56,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:29:56,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 21:29:56,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1025400.0, ans=0.1 2023-10-02 21:29:58,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 21:29:59,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:30:02,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:02,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1025400.0, ans=0.09899494936611666 2023-10-02 21:30:05,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:30:05,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. 
Duration: 0.87 2023-10-02 21:30:06,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:30:09,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 21:30:11,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:30:11,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:30:11,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:30:13,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:30:15,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:30:17,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:30:19,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:30:19,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.50 vs. 
limit=15.0 2023-10-02 21:30:20,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:30:20,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 21:30:20,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:30:20,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1025466.6666666666, ans=0.0 2023-10-02 21:30:23,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:30:26,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:30:26,529 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 21:30:26,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:30:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:30:29,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:29,284 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-10-02 21:30:29,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1025533.3333333334, ans=0.1 2023-10-02 21:30:32,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:32,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 21:30:32,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:36,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:30:36,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:36,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 21:30:38,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 21:30:39,564 INFO [train.py:1046] (1/4) Epoch 29, batch 5100, loss[loss=0.1605, simple_loss=0.2468, pruned_loss=0.03715, over 24562.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2438, pruned_loss=0.04368, over 4713023.01 frames. ], batch size: 71, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:30:40,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:30:40,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:30:41,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:30:43,739 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 21:30:46,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:47,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 21:30:47,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1025600.0, ans=0.1 2023-10-02 21:30:49,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 21:30:51,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:30:51,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:30:54,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:30:54,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 21:30:55,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 21:31:00,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 21:31:01,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:31:04,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:31:08,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 21:31:08,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:31:10,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:31:10,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:31:11,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:11,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:11,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 21:31:14,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 21:31:14,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:16,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. 
Duration: 0.67 2023-10-02 21:31:16,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 21:31:19,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:31:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:31:28,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 21:31:28,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 21:31:28,790 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 21:31:31,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 21:31:31,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:32,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 21:31:37,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 21:31:39,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1025866.6666666666, ans=0.0 2023-10-02 21:31:40,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-02 21:31:41,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:31:42,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.80 vs. limit=15.0 2023-10-02 21:31:44,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 21:31:46,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.819e+02 1.985e+02 2.208e+02 2.966e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-02 21:31:47,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:31:47,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 21:31:52,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:31:52,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:31:52,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:31:53,620 INFO [train.py:1046] (1/4) Epoch 29, batch 5150, loss[loss=0.1647, simple_loss=0.254, pruned_loss=0.03767, over 24532.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2451, pruned_loss=0.04431, over 4714253.83 frames. ], batch size: 71, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:31:53,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 21:31:53,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:31:55,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:31:56,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 21:31:56,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 21:31:56,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1025933.3333333334, ans=0.1 2023-10-02 21:31:56,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1025933.3333333334, ans=0.0 2023-10-02 21:31:57,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 21:31:57,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:31:57,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 21:31:59,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:00,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 21:32:02,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:32:05,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:08,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1026000.0, ans=0.1 2023-10-02 21:32:09,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:32:09,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 21:32:10,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:10,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:32:12,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:32:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:32:12,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:32:12,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:32:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:32:13,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. 
Duration: 0.84 2023-10-02 21:32:15,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:32:15,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:32:18,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:32:20,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 21:32:23,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:32:25,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1026066.6666666666, ans=0.95 2023-10-02 21:32:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:32:29,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 21:32:33,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:32:36,883 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:32:38,526 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.33 vs. limit=22.5 2023-10-02 21:32:39,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:32:40,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:43,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:32:44,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:32:47,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 21:32:48,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1026133.3333333334, ans=0.0 2023-10-02 21:32:50,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:51,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:32:51,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:32:52,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1026200.0, ans=0.05 2023-10-02 21:32:54,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:32:55,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:32:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-10-02 21:33:01,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:33:03,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:33:05,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:33:05,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:33:06,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:33:06,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:33:07,677 INFO [train.py:1046] (1/4) Epoch 29, batch 5200, loss[loss=0.1578, simple_loss=0.2328, pruned_loss=0.04143, over 23573.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2446, pruned_loss=0.04411, over 4724224.74 frames. ], batch size: 149, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:33:07,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:33:07,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:33:10,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:33:12,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 21:33:14,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:15,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1026266.6666666666, ans=0.125 2023-10-02 21:33:19,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 21:33:20,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:33:22,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:24,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:25,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:33:25,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:25,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1026333.3333333334, ans=10.0 2023-10-02 21:33:28,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 21:33:31,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 21:33:31,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:33:32,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 21:33:34,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:33:37,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:33:37,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 21:33:37,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 21:33:37,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1026400.0, ans=0.0 2023-10-02 21:33:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 21:33:39,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1026400.0, ans=0.125 2023-10-02 21:33:39,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.84 vs. limit=6.0 2023-10-02 21:33:40,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:33:40,519 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. 
Duration: 0.997 2023-10-02 21:33:40,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:41,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:33:43,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:33:43,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 21:33:44,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:33:46,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:50,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 21:33:51,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 21:33:51,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 21:33:56,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 21:33:56,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 21:34:01,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1026466.6666666666, ans=0.2 2023-10-02 21:34:02,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:34:03,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:03,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 21:34:04,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:34:04,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 21:34:04,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:34:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:34:11,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:34:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:34:12,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:34:12,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:14,167 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.881e+02 2.113e+02 2.405e+02 3.374e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-02 21:34:17,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:19,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 21:34:20,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:34:20,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:34:20,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:22,026 INFO [train.py:1046] (1/4) Epoch 29, batch 5250, loss[loss=0.1569, simple_loss=0.2445, pruned_loss=0.03467, over 24553.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2443, pruned_loss=0.04407, over 4719073.59 frames. ], batch size: 71, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:34:22,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:34:23,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:34:25,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:34:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:26,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:34:29,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:34:33,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:37,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:34:37,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1026666.6666666666, ans=0.0 2023-10-02 21:34:38,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:34:39,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:34:42,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 21:34:42,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:43,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:34:45,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1026666.6666666666, ans=0.125 2023-10-02 21:35:04,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1026800.0, ans=0.0 2023-10-02 21:35:22,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1026866.6666666666, ans=0.125 2023-10-02 21:35:27,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1026866.6666666666, ans=0.125 2023-10-02 21:35:30,809 INFO [train.py:1046] (1/4) Epoch 29, batch 5300, loss[loss=0.1728, simple_loss=0.2512, pruned_loss=0.0472, over 23282.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2438, pruned_loss=0.04383, over 4721549.59 frames. ], batch size: 119, lr: 3.49e-03, grad_scale: 32.0 2023-10-02 21:35:36,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1026933.3333333334, ans=0.05 2023-10-02 21:35:45,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:35:45,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 21:35:45,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 21:35:45,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:35:45,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:45,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:45,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:45,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:45,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:35:45,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:45,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:35:46,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:35:46,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 21:35:46,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 21:35:46,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 21:35:46,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 21:35:46,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 21:35:46,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 21:35:46,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:47,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:47,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:47,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:35:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:35:47,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:35:47,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:47,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:48,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:48,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:35:48,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:35:48,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:48,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:35:48,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 21:35:48,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:35:49,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:49,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 21:35:49,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 21:35:49,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:35:49,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:35:49,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 21:35:49,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. 
Duration: 0.71 2023-10-02 21:35:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:35:49,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:35:49,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:35:49,995 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 21:35:50,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 21:35:50,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:35:50,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:50,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 21:35:50,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 21:35:50,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 21:35:51,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. 
Duration: 0.563625 2023-10-02 21:35:52,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1027013.3333333334, ans=0.09899494936611666 2023-10-02 21:35:57,020 INFO [train.py:1046] (1/4) Epoch 30, batch 0, loss[loss=0.1632, simple_loss=0.2397, pruned_loss=0.04336, over 23466.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2397, pruned_loss=0.04336, over 23466.00 frames. ], batch size: 285, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:35:57,021 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 21:36:04,150 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.1475, 1.9064, 2.4866, 2.6643, 2.5156, 2.6705, 2.7166, 2.7806], device='cuda:1') 2023-10-02 21:36:06,840 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.1257, 2.7526, 3.0500, 3.3537, 3.0010, 3.4563, 3.3650, 3.6706], device='cuda:1') 2023-10-02 21:36:08,958 INFO [train.py:1078] (1/4) Epoch 30, validation: loss=0.3201, simple_loss=0.2693, pruned_loss=0.1854, over 1125622.00 frames. 2023-10-02 21:36:08,959 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 21:36:10,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 21:36:11,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:36:14,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 21:36:16,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1027013.3333333334, ans=0.0 2023-10-02 21:36:20,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:20,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:36:22,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:23,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 21:36:24,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 21:36:26,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:27,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.95 vs. limit=22.5 2023-10-02 21:36:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:31,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:31,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:33,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 21:36:33,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:36:33,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 21:36:36,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:36:38,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1027146.6666666666, ans=0.0 2023-10-02 21:36:41,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:36:41,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:42,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 21:36:46,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:36:46,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:36:47,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:36:50,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:36:52,973 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.39 vs. 
limit=10.0 2023-10-02 21:36:54,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:36:58,860 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 2.060e+02 2.425e+02 2.930e+02 5.326e+02, threshold=4.849e+02, percent-clipped=3.0 2023-10-02 21:37:00,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 21:37:04,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 21:37:05,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:37:05,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:06,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:37:06,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:37:09,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 21:37:10,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:11,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:37:13,210 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.78 vs. limit=15.0 2023-10-02 21:37:17,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:37:19,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 21:37:20,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1027280.0, ans=0.09899494936611666 2023-10-02 21:37:21,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:37:22,490 INFO [train.py:1046] (1/4) Epoch 30, batch 50, loss[loss=0.1759, simple_loss=0.2479, pruned_loss=0.05191, over 23797.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2465, pruned_loss=0.04509, over 1057869.91 frames. ], batch size: 212, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:37:23,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:37:25,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:37:25,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 21:37:26,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:37:26,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 21:37:29,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:37:30,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:37:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:37:33,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1027346.6666666666, ans=0.0 2023-10-02 21:37:34,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 21:37:36,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:42,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:37:43,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 21:37:45,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 21:37:47,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:37:50,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:37:50,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:37:52,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:37:52,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:37:52,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1027480.0, ans=0.125 2023-10-02 21:37:53,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:37:53,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:58,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.15 vs. limit=22.5 2023-10-02 21:38:00,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:38:00,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:01,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:38:03,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 21:38:04,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:38:04,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 21:38:04,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 21:38:06,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:38:06,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.97 vs. limit=22.5 2023-10-02 21:38:08,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 21:38:13,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:38:13,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:38:15,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:18,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:38:18,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:38:21,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 21:38:22,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 21:38:23,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:38:23,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:38:24,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:38:24,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:38:26,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 21:38:26,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 21:38:27,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 21:38:30,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:30,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:38:31,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 21:38:31,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 21:38:31,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 21:38:34,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:38:34,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:38:36,015 INFO [train.py:1046] (1/4) Epoch 30, batch 100, loss[loss=0.1839, simple_loss=0.2576, pruned_loss=0.05513, over 23298.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2456, pruned_loss=0.04464, over 1866432.72 frames. ], batch size: 93, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:38:37,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:38:40,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:38:40,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1027680.0, ans=0.0 2023-10-02 21:38:43,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:38:44,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 21:38:44,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:44,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1027680.0, ans=10.0 2023-10-02 21:38:49,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 21:38:49,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:38:49,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:38:49,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:38:51,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 21:38:52,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:38:52,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:52,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:38:52,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:38:54,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1027746.6666666666, ans=0.125 2023-10-02 21:38:57,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 21:38:58,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 21:38:58,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:00,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:39:01,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:39:05,862 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 21:39:05,878 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 21:39:06,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:06,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:39:11,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:39:13,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:39:14,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:19,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:19,449 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-10-02 21:39:22,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 21:39:25,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:39:25,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.51 vs. limit=22.5 2023-10-02 21:39:26,744 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.810e+02 1.965e+02 2.263e+02 3.377e+02, threshold=3.931e+02, percent-clipped=0.0 2023-10-02 21:39:26,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:39:29,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:32,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:33,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:39:35,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:39:36,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1027946.6666666666, ans=0.1 2023-10-02 21:39:37,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 21:39:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:40,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:40,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:39:40,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:40,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 21:39:42,666 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 21:39:42,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:42,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:39:42,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:42,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:44,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 21:39:44,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-02 21:39:45,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:39:45,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:45,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:47,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:47,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:39:49,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:39:50,455 INFO [train.py:1046] (1/4) Epoch 30, batch 150, loss[loss=0.1716, simple_loss=0.2442, pruned_loss=0.04946, over 22718.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2449, pruned_loss=0.0444, over 2496090.20 frames. ], batch size: 322, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:39:50,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:53,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:39:53,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:39:53,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:39:56,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:59,814 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.96 vs. limit=15.0 2023-10-02 21:40:00,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:40:00,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:00,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1028013.3333333334, ans=0.2 2023-10-02 21:40:04,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 21:40:04,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 21:40:04,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 21:40:07,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:40:07,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:40:08,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 21:40:08,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:40:08,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:10,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:12,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:13,588 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 21:40:14,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:21,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:40:25,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:40:27,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 21:40:30,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:40:30,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:40:30,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 21:40:31,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:40:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:40:34,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:40:34,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:36,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 21:40:41,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:43,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:40:43,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:40:43,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:40:46,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:47,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 21:40:49,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 21:40:50,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:40:52,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:40:52,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1028280.0, ans=0.125 2023-10-02 21:40:53,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.25 vs. limit=15.0 2023-10-02 21:40:53,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:40:53,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 21:40:53,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:40:53,797 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 21:40:54,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1028280.0, ans=0.125 2023-10-02 21:40:57,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:58,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.86 vs. limit=15.0 2023-10-02 21:40:59,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:40:59,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1028280.0, ans=0.125 2023-10-02 21:41:00,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:41:02,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 21:41:03,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:41:04,559 INFO [train.py:1046] (1/4) Epoch 30, batch 200, loss[loss=0.2069, simple_loss=0.2756, pruned_loss=0.06916, over 19689.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2459, pruned_loss=0.04469, over 2992022.40 frames. ], batch size: 389, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:41:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:06,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 21:41:08,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:41:09,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1028346.6666666666, ans=0.0 2023-10-02 21:41:10,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:41:10,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1028346.6666666666, ans=0.0 2023-10-02 21:41:11,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:41:16,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:41:17,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:41:17,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:22,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1028413.3333333334, ans=0.125 2023-10-02 21:41:22,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1028413.3333333334, ans=0.125 2023-10-02 21:41:35,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:41:37,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:41:37,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:41:37,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:41:39,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 21:41:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:41:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:41:42,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:41:43,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:41:43,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:41:45,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 21:41:46,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:41:46,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:51,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:41:53,897 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.828e+02 1.977e+02 2.173e+02 2.870e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-02 21:41:55,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:42:00,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:00,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:42:07,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:10,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 21:42:10,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:42:11,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:42:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:42:13,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:42:13,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1028613.3333333334, ans=0.125 2023-10-02 21:42:14,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 21:42:14,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:42:15,823 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-02 21:42:15,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1028680.0, ans=0.125 2023-10-02 21:42:17,148 INFO [train.py:1046] (1/4) Epoch 30, batch 250, loss[loss=0.1831, simple_loss=0.2641, pruned_loss=0.05109, over 23744.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2452, pruned_loss=0.04373, over 3379008.40 frames. ], batch size: 85, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:42:17,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:19,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:42:19,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:19,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:42:22,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:42:23,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:24,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:42:28,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:42:38,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:42:41,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:42:41,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:42:46,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1028813.3333333334, ans=0.2 2023-10-02 21:42:47,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:42:48,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:42:48,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1028813.3333333334, ans=0.0 2023-10-02 21:42:49,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:42:49,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:42:51,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:42:51,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:42:51,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:42:54,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 21:42:56,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 21:42:57,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:42:59,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:42:59,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:42:59,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:43:00,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:43:03,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:43:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:43:04,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:05,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:43:05,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:10,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-10-02 21:43:12,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:16,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:43:19,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:21,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1028946.6666666666, ans=0.1 2023-10-02 21:43:22,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:43:28,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 21:43:30,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:43:30,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:43:31,452 INFO [train.py:1046] (1/4) Epoch 30, batch 300, loss[loss=0.1478, simple_loss=0.2262, pruned_loss=0.03466, over 24269.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2432, pruned_loss=0.04336, over 3678760.74 frames. ], batch size: 56, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:43:31,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 21:43:31,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 21:43:33,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:43:34,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 21:43:37,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:39,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:43:42,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:43:42,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1029013.3333333334, ans=0.1 2023-10-02 21:43:43,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 21:43:44,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:44,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:43:46,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 21:43:46,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 21:43:46,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1029080.0, ans=0.125 2023-10-02 21:43:49,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1029080.0, ans=0.125 2023-10-02 21:43:50,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:43:51,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1029080.0, ans=0.125 2023-10-02 21:43:54,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:43:54,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 21:43:57,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 21:43:57,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:01,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:02,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:02,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 21:44:02,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 21:44:04,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1029146.6666666666, ans=0.125 2023-10-02 21:44:06,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:44:07,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:44:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:44:09,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1029146.6666666666, ans=0.125 2023-10-02 21:44:11,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 21:44:11,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 21:44:11,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:44:14,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:15,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 21:44:16,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:44:21,388 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.890e+02 2.076e+02 2.319e+02 2.784e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 21:44:22,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:44:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:44:24,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 21:44:25,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1029213.3333333334, ans=0.1 2023-10-02 21:44:27,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1029213.3333333334, ans=0.2 2023-10-02 21:44:28,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:28,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:44:31,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.99 vs. limit=15.0 2023-10-02 21:44:32,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:33,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 21:44:33,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 21:44:35,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:44:35,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:44:37,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 21:44:38,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:38,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:40,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1029280.0, ans=0.125 2023-10-02 21:44:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:41,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:41,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:44:43,205 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:44:43,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1029280.0, ans=0.0 2023-10-02 21:44:45,538 INFO [train.py:1046] (1/4) Epoch 30, batch 350, loss[loss=0.1608, simple_loss=0.2279, pruned_loss=0.04687, over 23784.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2409, pruned_loss=0.04253, over 3906447.93 frames. ], batch size: 195, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:44:46,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:44:46,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 21:44:49,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:55,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:56,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:58,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:59,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 21:45:01,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:45:01,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1029413.3333333334, ans=0.125 2023-10-02 21:45:02,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 21:45:05,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:05,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 21:45:07,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:45:10,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 21:45:11,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:45:13,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:45:14,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:45:14,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:14,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1029480.0, ans=0.2 2023-10-02 21:45:16,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 21:45:16,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:45:16,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:17,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:45:18,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:45:18,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:24,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:45:24,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:45:25,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:45:27,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:31,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 21:45:31,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:45:34,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1029546.6666666666, ans=0.2 2023-10-02 21:45:37,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1029546.6666666666, ans=0.2 2023-10-02 21:45:38,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:38,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:38,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:45:38,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 21:45:40,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1029546.6666666666, ans=0.0 2023-10-02 21:45:41,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:43,188 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 21:45:43,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 21:45:44,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:46,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 21:45:46,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 21:45:47,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:50,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:45:51,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:52,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:52,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:54,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:56,523 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.97 vs. limit=15.0 2023-10-02 21:45:57,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:45:58,322 INFO [train.py:1046] (1/4) Epoch 30, batch 400, loss[loss=0.1584, simple_loss=0.2468, pruned_loss=0.03495, over 24661.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.24, pruned_loss=0.04244, over 4075030.72 frames. 
], batch size: 68, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:45:59,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1029680.0, ans=0.0 2023-10-02 21:46:00,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:46:01,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 21:46:01,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:46:01,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:03,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:46:05,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:06,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:46:06,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:10,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 21:46:11,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 21:46:11,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:46:12,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 21:46:14,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:18,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:46:18,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:19,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 21:46:19,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:46:19,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:19,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:21,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:46:22,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 21:46:22,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 21:46:27,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:46:29,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:46:29,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 21:46:31,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 21:46:31,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1029813.3333333334, ans=0.0 2023-10-02 21:46:35,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:46:37,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:46:44,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 21:46:46,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1029880.0, ans=0.0 2023-10-02 21:46:48,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:46:49,948 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.876e+02 2.050e+02 2.332e+02 4.522e+02, threshold=4.101e+02, percent-clipped=1.0 2023-10-02 21:46:50,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 21:46:52,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:46:55,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:46:55,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 21:46:55,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1029880.0, ans=0.1 2023-10-02 21:46:59,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:47:01,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:47:02,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:47:03,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:05,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 21:47:06,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:47:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 21:47:10,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:47:10,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 21:47:11,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 21:47:13,227 INFO [train.py:1046] (1/4) Epoch 30, batch 450, loss[loss=0.1627, simple_loss=0.2565, pruned_loss=0.03444, over 24577.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2414, pruned_loss=0.04246, over 4215834.35 frames. ], batch size: 71, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:47:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:47:15,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:47:15,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:47:16,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 21:47:16,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:47:18,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:47:18,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:47:18,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 21:47:19,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 21:47:20,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:47:23,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:47:32,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:47:35,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 21:47:35,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 21:47:36,609 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.62 vs. limit=15.0 2023-10-02 21:47:39,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:47:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:45,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:47:48,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:47:50,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:47:50,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1030146.6666666666, ans=0.2 2023-10-02 21:47:51,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 21:47:52,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 21:47:53,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 21:47:53,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:47:55,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:47:55,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:47:56,429 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-10-02 21:47:57,236 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 21:47:57,246 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 21:47:57,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 21:47:58,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:47:59,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:48:02,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:48:04,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:48:04,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 21:48:04,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 21:48:07,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:48:09,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:48:10,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:48:12,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 21:48:14,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:48:16,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-10-02 21:48:16,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 21:48:18,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:48:22,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:48:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:48:25,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:48:25,117 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 21:48:26,421 INFO [train.py:1046] (1/4) Epoch 30, batch 500, loss[loss=0.1617, simple_loss=0.2532, pruned_loss=0.03516, over 24623.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2428, pruned_loss=0.04381, over 4323145.58 frames. ], batch size: 68, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:48:29,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:48:30,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:48:30,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:48:30,885 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. 
Duration: 0.896 2023-10-02 21:48:32,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 21:48:32,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:48:32,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1030346.6666666666, ans=0.0 2023-10-02 21:48:35,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:48:40,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:48:41,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:48:44,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:48:44,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:48:44,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:48:52,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1030413.3333333334, ans=0.0 2023-10-02 21:48:55,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:48:55,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-10-02 21:48:56,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:48:57,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:48:57,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 21:48:57,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:49:00,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:49:01,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:49:01,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:49:01,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:01,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1030480.0, ans=0.0 2023-10-02 21:49:02,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-02 21:49:03,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1030480.0, ans=0.1 2023-10-02 21:49:04,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=15.0 2023-10-02 21:49:07,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 21:49:08,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:11,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:49:15,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 21:49:16,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:49:18,450 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.804e+02 1.945e+02 2.150e+02 2.589e+02, threshold=3.890e+02, percent-clipped=0.0 2023-10-02 21:49:18,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:49:19,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1030546.6666666666, ans=15.0 2023-10-02 21:49:21,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:23,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1030546.6666666666, ans=0.125 2023-10-02 21:49:25,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:29,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 21:49:31,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:31,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:35,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 21:49:35,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:49:38,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 21:49:39,880 INFO [train.py:1046] (1/4) Epoch 30, batch 550, loss[loss=0.1786, simple_loss=0.2526, pruned_loss=0.05233, over 23755.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2444, pruned_loss=0.04419, over 4413251.87 frames. ], batch size: 179, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:49:42,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 21:49:44,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 21:49:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:44,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 21:49:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:49:44,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:46,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:46,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:47,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:49:47,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 21:49:49,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1030680.0, ans=0.1 2023-10-02 21:49:50,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:52,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 21:49:52,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:49:56,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:49:56,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:58,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1030746.6666666666, ans=0.2 2023-10-02 21:49:59,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:50:00,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:50:05,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 21:50:07,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 21:50:08,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 21:50:08,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1030813.3333333334, ans=0.1 2023-10-02 21:50:14,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:50:15,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:50:16,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:50:18,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:18,929 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 21:50:20,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:50:22,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 21:50:24,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:50:24,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:50:24,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:50:25,974 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.74 vs. 
limit=15.0 2023-10-02 21:50:26,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:27,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 21:50:29,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 21:50:30,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:50:30,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:50:30,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:50:30,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:50:30,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1030880.0, ans=0.125 2023-10-02 21:50:32,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:50:33,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:50:36,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:50:36,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:50:38,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.05 vs. limit=6.0 2023-10-02 21:50:38,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:50:39,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:50:40,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:50:40,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:50:42,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:44,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:50:44,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 21:50:47,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1030946.6666666666, ans=0.125 2023-10-02 21:50:50,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. 
Duration: 0.59 2023-10-02 21:50:50,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1030946.6666666666, ans=0.1 2023-10-02 21:50:52,105 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.55 vs. limit=15.0 2023-10-02 21:50:54,248 INFO [train.py:1046] (1/4) Epoch 30, batch 600, loss[loss=0.1293, simple_loss=0.2046, pruned_loss=0.02706, over 24293.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2445, pruned_loss=0.04399, over 4485548.50 frames. ], batch size: 56, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:50:54,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 21:50:54,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:50:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:50:55,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:00,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1031013.3333333334, ans=0.125 2023-10-02 21:51:02,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:51:05,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:51:05,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-10-02 21:51:06,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:51:09,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:51:11,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:14,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 21:51:14,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:51:20,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 21:51:24,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:51:24,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:26,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:51:30,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:51:30,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:51:31,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:51:34,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1031146.6666666666, ans=0.0 2023-10-02 21:51:37,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:51:41,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:41,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:51:42,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:44,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.50 vs. limit=10.0 2023-10-02 21:51:44,815 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.850e+02 2.032e+02 2.229e+02 3.048e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-02 21:51:49,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 21:51:55,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:51:55,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:51:58,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. 
Duration: 0.89 2023-10-02 21:51:59,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:52:01,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 21:52:02,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:52:02,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:52:06,830 INFO [train.py:1046] (1/4) Epoch 30, batch 650, loss[loss=0.1409, simple_loss=0.1968, pruned_loss=0.04253, over 19600.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.243, pruned_loss=0.04371, over 4534030.08 frames. ], batch size: 388, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:52:07,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 21:52:08,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:52:10,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:52:11,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:52:13,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:16,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. 
Duration: 0.72 2023-10-02 21:52:17,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:52:22,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:52:22,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:52:26,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:29,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 21:52:30,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:52:32,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:52:34,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:52:35,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 21:52:38,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:38,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:40,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-02 21:52:41,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:42,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:52:44,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1031480.0, ans=0.0 2023-10-02 21:52:45,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:52:45,499 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 21:52:45,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:45,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:52:47,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:47,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1031480.0, ans=0.025 2023-10-02 21:52:48,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:52:48,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:52:50,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 21:52:50,936 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.33 vs. limit=12.0 2023-10-02 21:52:51,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 21:52:53,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:52:53,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:52:53,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.62 vs. limit=15.0 2023-10-02 21:52:54,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:52:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:52:56,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:52:56,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 21:52:56,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1031546.6666666666, ans=0.07 2023-10-02 21:52:57,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 21:52:58,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:52:58,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:52:59,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:52:59,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:53:01,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:53:06,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:06,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:53:08,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:53:11,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:53:11,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 21:53:12,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:53:19,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-02 21:53:19,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:53:19,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:53:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:53:21,522 INFO [train.py:1046] (1/4) Epoch 30, batch 700, loss[loss=0.177, simple_loss=0.266, pruned_loss=0.04398, over 24068.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.0434, over 4571047.06 frames. ], batch size: 80, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:53:24,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 21:53:24,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 21:53:27,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 21:53:28,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:29,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:53:31,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 21:53:35,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 21:53:39,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:53:41,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:43,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:53:44,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:53:44,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1031746.6666666666, ans=0.0 2023-10-02 21:53:46,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 21:53:48,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:53:48,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1031813.3333333334, ans=0.2 2023-10-02 21:53:52,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 21:53:54,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 21:53:58,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 21:53:58,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:53:59,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:53:59,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1031813.3333333334, ans=0.09899494936611666 2023-10-02 21:54:03,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1031880.0, ans=0.0 2023-10-02 21:54:04,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:54:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 21:54:08,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1031880.0, ans=0.04949747468305833 2023-10-02 21:54:09,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:09,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:54:10,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-10-02 21:54:12,184 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.810e+02 2.003e+02 2.229e+02 3.158e+02, threshold=4.006e+02, percent-clipped=0.0 2023-10-02 21:54:14,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:54:15,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:18,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:54:22,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:54:22,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 21:54:24,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 21:54:26,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 21:54:27,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:29,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 21:54:29,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1031946.6666666666, ans=0.125 2023-10-02 21:54:29,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1031946.6666666666, ans=0.2 2023-10-02 21:54:30,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:54:33,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:33,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 21:54:34,948 INFO [train.py:1046] (1/4) Epoch 30, batch 750, loss[loss=0.1657, simple_loss=0.2254, pruned_loss=0.05298, over 19298.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2415, pruned_loss=0.04252, over 4610670.39 frames. ], batch size: 388, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:54:37,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.41 vs. limit=22.5 2023-10-02 21:54:38,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 21:54:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 21:54:39,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 21:54:41,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. 
Duration: 0.91 2023-10-02 21:54:42,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 21:54:43,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:54:43,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 21:54:45,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:45,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:54:46,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1032013.3333333334, ans=0.125 2023-10-02 21:54:47,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:54:49,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=22.5 2023-10-02 21:54:50,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:50,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:54:51,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:54:52,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 21:54:53,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:54:55,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:54:58,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:54:58,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:55:00,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 21:55:01,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:55:01,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:55:03,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:55:04,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:55:06,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 21:55:06,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:55:07,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-10-02 21:55:07,476 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 21:55:08,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 21:55:08,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:55:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:55:10,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:55:16,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:55:18,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:18,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:55:20,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:55:22,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:55:22,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 21:55:24,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 21:55:24,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 21:55:25,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:55:28,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:55:28,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 21:55:29,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:34,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:55:35,182 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.31 vs. limit=6.0 2023-10-02 21:55:35,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:55:37,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:55:38,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:55:43,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. 
Duration: 0.75 2023-10-02 21:55:43,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:55:44,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:55:47,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:55:47,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:55:48,921 INFO [train.py:1046] (1/4) Epoch 30, batch 800, loss[loss=0.1721, simple_loss=0.264, pruned_loss=0.04015, over 24590.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2424, pruned_loss=0.0425, over 4642417.97 frames. ], batch size: 71, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:55:49,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:49,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:55:58,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:58,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:00,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:56:00,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 21:56:00,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:02,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:02,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1032413.3333333334, ans=10.0 2023-10-02 21:56:04,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:08,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:08,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:56:12,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 21:56:12,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:13,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:56:13,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:56:15,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:56:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. 
Duration: 0.85 2023-10-02 21:56:15,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:16,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 21:56:19,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:21,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:24,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:56:24,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:56:27,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:27,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:56:31,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:56:31,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 21:56:32,826 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-02 21:56:32,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 21:56:32,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:56:32,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:56:35,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:35,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:56:40,117 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.899e+02 2.201e+02 2.718e+02 4.038e+02, threshold=4.403e+02, percent-clipped=2.0 2023-10-02 21:56:40,293 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 21:56:41,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 21:56:43,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:56:44,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:56:49,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:56:51,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:56:51,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 21:56:52,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:56:56,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 21:57:00,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:57:01,367 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.53 vs. limit=22.5 2023-10-02 21:57:02,201 INFO [train.py:1046] (1/4) Epoch 30, batch 850, loss[loss=0.1713, simple_loss=0.2585, pruned_loss=0.04203, over 24545.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2436, pruned_loss=0.0431, over 4663927.90 frames. ], batch size: 71, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:57:02,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:57:03,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 21:57:04,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:57:05,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:57:05,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. 
Duration: 0.94 2023-10-02 21:57:06,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:07,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:57:08,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:09,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:57:10,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1032680.0, ans=0.0 2023-10-02 21:57:11,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:57:12,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 21:57:12,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 21:57:12,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 21:57:15,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:57:15,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:57:19,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 21:57:19,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:57:19,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:57:22,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1032746.6666666666, ans=0.125 2023-10-02 21:57:24,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:24,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:24,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 21:57:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 21:57:28,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:29,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 21:57:33,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 21:57:35,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 21:57:35,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-10-02 21:57:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:57:36,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=12.0 2023-10-02 21:57:37,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:57:37,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 21:57:38,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:39,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:39,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 21:57:41,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1032813.3333333334, ans=0.125 2023-10-02 21:57:43,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:57:43,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:46,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 21:57:46,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:57:47,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:57:49,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:57:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 21:57:53,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:57:53,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:57:54,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:57:55,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:57:55,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:56,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:59,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:58:00,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 21:58:02,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:02,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:58:09,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:58:09,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:58:10,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 21:58:10,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:58:10,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:58:13,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 21:58:17,200 INFO [train.py:1046] (1/4) Epoch 30, batch 900, loss[loss=0.1613, simple_loss=0.2404, pruned_loss=0.0411, over 24439.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2447, pruned_loss=0.04382, over 4672661.76 frames. ], batch size: 58, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 21:58:22,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:58:24,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 21:58:24,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 21:58:26,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1033013.3333333334, ans=0.0 2023-10-02 21:58:27,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:58:27,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 21:58:28,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:58:30,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:58:30,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:58:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:58:31,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:58:33,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1033080.0, ans=0.125 2023-10-02 21:58:41,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.81 vs. 
limit=6.0 2023-10-02 21:58:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:58:41,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:41,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:58:42,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1033080.0, ans=0.0 2023-10-02 21:58:45,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:58:49,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 21:58:52,058 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.97 vs. limit=15.0 2023-10-02 21:58:54,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:58:58,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:58:58,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:58:58,460 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 21:58:59,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-10-02 21:59:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:59:05,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:59:06,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:59:11,419 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.769e+02 1.935e+02 2.161e+02 3.184e+02, threshold=3.870e+02, percent-clipped=0.0 2023-10-02 21:59:12,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:12,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:13,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.02 vs. limit=6.0 2023-10-02 21:59:14,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 21:59:14,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:59:17,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. 
Duration: 0.88 2023-10-02 21:59:17,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1033280.0, ans=0.125 2023-10-02 21:59:19,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1033280.0, ans=0.125 2023-10-02 21:59:20,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:59:20,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:22,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:59:22,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:25,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 21:59:25,476 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 21:59:25,952 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.47 vs. limit=22.5 2023-10-02 21:59:27,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1033280.0, ans=0.1 2023-10-02 21:59:28,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 21:59:28,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. 
Duration: 0.64 2023-10-02 21:59:30,957 INFO [train.py:1046] (1/4) Epoch 30, batch 950, loss[loss=0.2168, simple_loss=0.2796, pruned_loss=0.07702, over 19713.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2448, pruned_loss=0.04372, over 4680899.35 frames. ], batch size: 389, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 21:59:31,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:31,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1033346.6666666666, ans=0.0 2023-10-02 21:59:31,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1033346.6666666666, ans=0.0 2023-10-02 21:59:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 21:59:39,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:59:41,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:42,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:42,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:59:46,062 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 21:59:47,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 21:59:48,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:59:50,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:59:50,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:59:50,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 21:59:51,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1033413.3333333334, ans=0.125 2023-10-02 21:59:52,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 21:59:53,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:54,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 21:59:56,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:58,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:58,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:58,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 21:59:58,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 21:59:59,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:00:01,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:00:02,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:00:07,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:00:07,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:00:11,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 22:00:14,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.00 vs. limit=15.0 2023-10-02 22:00:14,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 22:00:14,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:00:14,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:00:14,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1033546.6666666666, ans=0.5 2023-10-02 22:00:16,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:16,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:00:16,624 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.17 vs. limit=15.0 2023-10-02 22:00:19,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1033546.6666666666, ans=0.125 2023-10-02 22:00:22,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 22:00:23,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:00:26,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:00:26,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:28,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 22:00:28,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:00:28,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:00:28,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 22:00:32,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:00:33,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:00:39,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:00:39,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 22:00:40,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 22:00:43,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:45,204 INFO [train.py:1046] (1/4) Epoch 30, batch 1000, loss[loss=0.1588, simple_loss=0.2353, pruned_loss=0.0411, over 23429.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2436, pruned_loss=0.04374, over 4677485.17 frames. ], batch size: 105, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:00:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 22:00:48,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 22:00:53,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:00:55,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 22:00:55,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 22:00:59,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:00,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:01:00,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1033746.6666666666, ans=0.125 2023-10-02 22:01:00,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1033746.6666666666, ans=0.0 2023-10-02 22:01:01,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:02,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.92 vs. limit=15.0 2023-10-02 22:01:03,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 22:01:06,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 22:01:07,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-10-02 22:01:07,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:01:09,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 22:01:10,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:01:11,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 22:01:13,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:14,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:24,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:26,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:01:26,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:27,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:27,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 22:01:27,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:01:27,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1033813.3333333334, ans=0.5 2023-10-02 22:01:29,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:01:30,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:31,781 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 22:01:32,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1033880.0, ans=0.0 2023-10-02 22:01:33,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 22:01:36,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 22:01:36,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 22:01:38,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:01:40,105 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.798e+02 1.994e+02 2.237e+02 2.934e+02, threshold=3.988e+02, percent-clipped=0.0 2023-10-02 22:01:40,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1033880.0, ans=0.1 2023-10-02 22:01:44,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 22:01:44,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:01:45,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:45,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:01:48,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 22:01:51,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:01:52,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 22:01:52,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 22:01:54,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:01:54,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:56,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:01:59,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:02:00,386 INFO [train.py:1046] (1/4) Epoch 30, batch 1050, loss[loss=0.1696, simple_loss=0.2342, pruned_loss=0.05254, over 23410.00 frames. 
], tot_loss[loss=0.1641, simple_loss=0.2419, pruned_loss=0.0432, over 4677724.36 frames. ], batch size: 285, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:02:00,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:02:03,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:02:03,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:02:04,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:02:06,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:02:07,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:02:10,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:02:11,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:02:13,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:02:14,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:02:14,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 22:02:15,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:02:17,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 22:02:17,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:02:17,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 22:02:20,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:02:20,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 22:02:20,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:02:21,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1034080.0, ans=0.07 2023-10-02 22:02:28,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:02:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:02:28,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:02:31,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. 
Duration: 0.81 2023-10-02 22:02:31,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 22:02:32,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:02:33,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 22:02:36,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 22:02:38,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:02:40,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 22:02:42,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:02:42,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:02:43,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:02:46,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:02:53,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 22:02:54,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-10-02 22:02:56,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 22:02:56,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:02:56,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:02:57,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 22:03:00,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:03:02,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:03:02,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:03:03,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:03:03,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:07,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:07,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 22:03:09,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 22:03:09,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 22:03:09,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 22:03:10,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-10-02 22:03:10,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:03:10,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1034280.0, ans=0.09899494936611666 2023-10-02 22:03:13,110 INFO [train.py:1046] (1/4) Epoch 30, batch 1100, loss[loss=0.1654, simple_loss=0.2496, pruned_loss=0.04057, over 24655.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2415, pruned_loss=0.04268, over 4703789.63 frames. ], batch size: 73, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:03:13,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:03:19,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:03:22,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1034346.6666666666, ans=0.0 2023-10-02 22:03:23,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:03:25,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 22:03:26,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:03:27,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 22:03:29,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:03:31,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 22:03:32,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:03:35,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:03:36,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 22:03:38,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:03:38,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:03:39,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:03:40,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:03:43,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 22:03:43,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1034480.0, ans=0.09899494936611666 2023-10-02 22:03:47,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:03:48,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1034480.0, ans=0.125 2023-10-02 22:03:51,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 22:03:52,411 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 22:03:52,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:52,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1034480.0, ans=0.2 2023-10-02 22:03:55,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:57,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:03:57,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:03:57,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 22:03:59,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 22:04:00,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:04:00,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:04:00,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:00,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 22:04:06,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:04:07,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 22:04:08,726 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.832e+02 2.001e+02 2.230e+02 3.107e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-02 22:04:08,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:04:12,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:04:15,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 22:04:15,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:04:18,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:04:20,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:04:20,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:04:21,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 22:04:23,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:04:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:04:25,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 22:04:25,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:04:26,788 INFO [train.py:1046] (1/4) Epoch 30, batch 1150, loss[loss=0.1681, simple_loss=0.2442, pruned_loss=0.04607, over 23652.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2422, pruned_loss=0.04279, over 4701650.89 frames. ], batch size: 149, lr: 3.42e-03, grad_scale: 4.0 2023-10-02 22:04:26,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 22:04:27,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1034680.0, ans=0.025 2023-10-02 22:04:28,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:04:28,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:04:29,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:04:34,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:35,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:04:37,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:04:37,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:04:38,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 22:04:39,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:04:40,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1034746.6666666666, ans=0.125 2023-10-02 22:04:41,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 22:04:42,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:42,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 22:04:44,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1034746.6666666666, ans=0.2 2023-10-02 22:04:47,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1034746.6666666666, ans=0.0 2023-10-02 22:04:48,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 22:04:49,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:54,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:54,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:04:55,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 22:04:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:04:55,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:04:59,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 22:05:01,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:05:01,753 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=15.0 2023-10-02 22:05:02,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:05:04,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1034813.3333333334, ans=0.125 2023-10-02 22:05:06,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1034813.3333333334, ans=0.125 2023-10-02 22:05:12,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:05:16,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:05:17,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 22:05:17,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:19,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:23,564 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 22:05:24,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:33,475 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-10-02 22:05:36,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:05:37,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:05:37,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:05:39,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:05:40,420 INFO [train.py:1046] (1/4) Epoch 30, batch 1200, loss[loss=0.1618, simple_loss=0.2406, pruned_loss=0.04151, over 23303.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2427, pruned_loss=0.04288, over 4706229.48 frames. ], batch size: 105, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:05:40,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:05:44,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:05:44,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:05:46,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:05:46,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:05:46,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 22:05:47,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:05:48,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:05:50,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:05:52,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:53,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 22:05:58,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 22:06:02,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:06:04,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:06:06,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:06:07,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:06:07,707 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 22:06:09,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:06:16,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:06:16,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:06:16,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 22:06:17,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:06:19,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1035146.6666666666, ans=0.125 2023-10-02 22:06:20,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 22:06:26,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 22:06:26,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:06:26,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:06:27,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.77 vs. limit=15.0 2023-10-02 22:06:28,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:06:29,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:06:29,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:06:31,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:06:31,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:06:31,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 22:06:33,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:06:33,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:06:33,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:06:35,870 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.50 vs. limit=15.0 2023-10-02 22:06:36,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:06:36,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:06:37,824 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.850e+02 2.052e+02 2.209e+02 3.387e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 22:06:40,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:06:42,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:06:44,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 22:06:47,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 22:06:50,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:06:52,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:06:53,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:06:55,169 INFO [train.py:1046] (1/4) Epoch 30, batch 1250, loss[loss=0.1712, simple_loss=0.2629, pruned_loss=0.03979, over 24350.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2424, pruned_loss=0.04308, over 4701425.92 frames. ], batch size: 74, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:06:55,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:06:57,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-10-02 22:06:58,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1035346.6666666666, ans=0.0 2023-10-02 22:07:00,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:07:01,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:02,281 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.50 vs. limit=12.0 2023-10-02 22:07:02,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 22:07:04,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:07:04,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:07:10,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:07:10,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:11,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:07:11,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:07:14,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 22:07:18,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:07:18,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:07:18,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:07:21,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:07:21,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:23,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:24,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:07:27,921 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:07:30,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 22:07:32,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:07:32,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:07:34,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. 
Duration: 0.85 2023-10-02 22:07:34,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:34,330 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 22:07:35,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:35,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:40,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:44,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:45,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:07:46,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 22:07:46,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 22:07:46,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 22:07:50,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:07:51,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. 
Duration: 0.88 2023-10-02 22:07:51,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:55,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 22:07:55,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:07:56,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 22:07:56,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:07:56,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:07:56,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:07:58,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:07:59,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 22:08:02,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:08:04,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 22:08:04,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1035613.3333333334, ans=0.125 2023-10-02 22:08:06,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:08:08,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:08:09,415 INFO [train.py:1046] (1/4) Epoch 30, batch 1300, loss[loss=0.1803, simple_loss=0.2408, pruned_loss=0.05987, over 22708.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2436, pruned_loss=0.04385, over 4693493.81 frames. ], batch size: 322, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:08:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:08:12,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 22:08:15,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1035680.0, ans=0.1 2023-10-02 22:08:16,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:08:17,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:08:19,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:08:20,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:08:20,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:08:21,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 22:08:25,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:08:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:08:27,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1035746.6666666666, ans=0.0 2023-10-02 22:08:28,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 22:08:32,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:08:34,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:08:36,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:08:38,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:08:38,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:08:39,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 22:08:40,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-10-02 22:08:40,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:08:41,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 22:08:45,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:08:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:08:47,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 22:08:47,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:08:50,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:08:53,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:08:53,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 22:08:53,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:08:55,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-10-02 22:08:56,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:09:00,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:09:00,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:09:04,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 22:09:05,531 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.946e+02 2.321e+02 2.829e+02 4.906e+02, threshold=4.642e+02, percent-clipped=3.0 2023-10-02 22:09:05,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 22:09:07,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.89 vs. limit=6.0 2023-10-02 22:09:07,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 22:09:08,037 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.76 vs. limit=15.0 2023-10-02 22:09:13,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. 
Duration: 0.8 2023-10-02 22:09:13,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1035946.6666666666, ans=0.125 2023-10-02 22:09:14,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 22:09:14,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1035946.6666666666, ans=0.09899494936611666 2023-10-02 22:09:17,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:09:23,159 INFO [train.py:1046] (1/4) Epoch 30, batch 1350, loss[loss=0.1617, simple_loss=0.2261, pruned_loss=0.04867, over 23789.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2425, pruned_loss=0.04356, over 4702238.45 frames. ], batch size: 164, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:09:23,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 22:09:27,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:09:29,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:09:33,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:09:33,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:09:34,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 22:09:36,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:09:40,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:09:41,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 22:09:43,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:09:43,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:09:46,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 22:09:46,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:09:47,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:09:47,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 22:09:48,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 22:09:51,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 22:09:51,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:09:53,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 22:09:53,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-02 22:09:59,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1036146.6666666666, ans=0.125 2023-10-02 22:10:04,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:10:11,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1036213.3333333334, ans=0.125 2023-10-02 22:10:14,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:10:15,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:15,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 22:10:18,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:19,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 22:10:19,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:10:21,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:10:24,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:10:25,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 22:10:27,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:10:30,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 22:10:31,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 22:10:37,767 INFO [train.py:1046] (1/4) Epoch 30, batch 1400, loss[loss=0.1714, simple_loss=0.2631, pruned_loss=0.03984, over 24539.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2424, pruned_loss=0.04361, over 4711013.56 frames. ], batch size: 71, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:10:39,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 22:10:41,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:43,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:10:44,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:10:50,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-10-02 22:10:50,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1036346.6666666666, ans=0.125 2023-10-02 22:10:51,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 22:11:01,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:11:03,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:11:04,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:11:04,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:11:07,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:11:09,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 22:11:18,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:19,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:11:20,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1036480.0, ans=0.2 2023-10-02 22:11:22,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1036546.6666666666, ans=0.0 2023-10-02 22:11:23,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 22:11:25,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:11:26,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:11:26,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:11:26,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:11:28,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:11:28,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:11:29,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:11:30,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1036546.6666666666, ans=0.125 2023-10-02 22:11:31,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. 
Duration: 0.83 2023-10-02 22:11:32,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:11:33,815 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.837e+02 2.123e+02 2.555e+02 3.782e+02, threshold=4.246e+02, percent-clipped=0.0 2023-10-02 22:11:35,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:38,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:11:41,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1036613.3333333334, ans=0.1 2023-10-02 22:11:44,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 22:11:46,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:11:47,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:11:49,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1036613.3333333334, ans=10.0 2023-10-02 22:11:50,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 22:11:51,969 INFO [train.py:1046] (1/4) Epoch 30, batch 1450, loss[loss=0.1539, simple_loss=0.2401, pruned_loss=0.03385, over 24515.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2423, pruned_loss=0.04314, over 4722731.91 frames. 
], batch size: 63, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:11:52,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:11:53,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:11:54,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:11:56,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:11:56,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:56,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 22:12:02,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:12:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:12:03,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:12:03,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-10-02 22:12:03,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1036680.0, ans=0.0 2023-10-02 22:12:04,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:12:05,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 22:12:06,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:06,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:06,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 22:12:09,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:12:09,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:12:11,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 22:12:11,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:12,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 22:12:12,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1036746.6666666666, ans=0.2 2023-10-02 22:12:13,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:17,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:20,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:12:20,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:12:22,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:12:22,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:26,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:26,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:12:26,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:27,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:28,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. 
Duration: 0.73 2023-10-02 22:12:31,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:12:33,077 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 22:12:35,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:12:37,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:12:39,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:12:39,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 22:12:44,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:44,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 22:12:48,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 22:12:48,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:12:52,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:12:53,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 22:12:54,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 22:12:55,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 22:12:57,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 22:12:57,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1036946.6666666666, ans=0.2 2023-10-02 22:12:58,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:59,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:13:05,509 INFO [train.py:1046] (1/4) Epoch 30, batch 1500, loss[loss=0.1376, simple_loss=0.2164, pruned_loss=0.02942, over 24444.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2424, pruned_loss=0.04302, over 4728368.05 frames. ], batch size: 58, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:13:05,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1037013.3333333334, ans=0.0 2023-10-02 22:13:08,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 22:13:10,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:13:10,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 22:13:11,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:13:12,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:13:12,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:13:13,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.85 vs. limit=22.5 2023-10-02 22:13:14,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 22:13:14,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:13:15,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:13:15,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:13:15,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:13:17,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:13:19,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:13:20,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.42 vs. limit=6.0 2023-10-02 22:13:24,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:13:24,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 22:13:24,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1037080.0, ans=0.125 2023-10-02 22:13:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:13:26,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:13:28,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:13:29,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 22:13:31,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1037080.0, ans=0.2 2023-10-02 22:13:32,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 22:13:35,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:13:35,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-10-02 22:13:37,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1037146.6666666666, ans=0.0 2023-10-02 22:13:38,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:13:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:13:41,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:13:41,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:13:41,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.08 vs. limit=6.0 2023-10-02 22:13:42,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 22:13:43,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:13:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:13:44,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 22:13:45,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:13:51,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 22:13:51,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 22:13:59,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:14:00,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:14:02,134 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.861e+02 2.140e+02 2.435e+02 4.119e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-02 22:14:04,976 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 22:14:05,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:05,042 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 22:14:06,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:07,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:14:09,040 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 22:14:09,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:14:11,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. 
Duration: 0.65 2023-10-02 22:14:15,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:14:18,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:14:18,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1037346.6666666666, ans=0.1 2023-10-02 22:14:19,536 INFO [train.py:1046] (1/4) Epoch 30, batch 1550, loss[loss=0.1771, simple_loss=0.2625, pruned_loss=0.0458, over 24662.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2429, pruned_loss=0.04302, over 4736336.44 frames. ], batch size: 73, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:14:19,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:14:19,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 22:14:21,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 22:14:21,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 22:14:22,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 22:14:24,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 22:14:26,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:14:27,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:27,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:14:27,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:14:27,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:29,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:31,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 22:14:31,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:31,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:14:33,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 22:14:35,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:14:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 22:14:37,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:14:38,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 22:14:38,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 22:14:38,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 22:14:38,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:40,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:14:44,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:14:46,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 22:14:46,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-02 22:14:46,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1037413.3333333334, ans=0.1 2023-10-02 22:14:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:00,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:15:00,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:15:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:15:01,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 22:15:03,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1037546.6666666666, ans=0.125 2023-10-02 22:15:04,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:15:05,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:07,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:15:08,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:15:10,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 22:15:10,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 22:15:10,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:15:13,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:15:13,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:14,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 22:15:16,037 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 22:15:18,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:15:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 22:15:27,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1037613.3333333334, ans=0.1 2023-10-02 22:15:28,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:15:30,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:30,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. 
Duration: 0.42 2023-10-02 22:15:32,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:15:32,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:15:32,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:15:32,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:15:34,336 INFO [train.py:1046] (1/4) Epoch 30, batch 1600, loss[loss=0.1764, simple_loss=0.2656, pruned_loss=0.04361, over 24567.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2433, pruned_loss=0.04305, over 4738635.10 frames. ], batch size: 71, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:15:34,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:15:37,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:15:37,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 22:15:39,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 22:15:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 22:15:43,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:15:45,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 22:15:45,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1037680.0, ans=0.1 2023-10-02 22:15:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:15:49,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:15:54,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:57,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 22:16:00,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:16:00,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 22:16:02,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:02,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 22:16:07,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 22:16:14,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:16:16,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 22:16:16,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:16:16,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:16:16,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:16:19,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 22:16:24,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:16:25,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:16:27,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:27,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:27,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:16:28,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. 
Duration: 0.736375 2023-10-02 22:16:30,398 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.857e+02 2.011e+02 2.284e+02 3.695e+02, threshold=4.022e+02, percent-clipped=0.0 2023-10-02 22:16:31,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:16:33,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:16:38,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:39,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:16:39,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1037946.6666666666, ans=0.125 2023-10-02 22:16:40,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 22:16:40,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:16:42,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 22:16:46,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:16:48,078 INFO [train.py:1046] (1/4) Epoch 30, batch 1650, loss[loss=0.1692, simple_loss=0.2587, pruned_loss=0.03981, over 24531.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2438, pruned_loss=0.0432, over 4724039.01 frames. 
], batch size: 71, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:16:49,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:16:49,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:16:49,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 22:16:50,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 22:16:50,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 22:16:50,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 22:16:51,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1038013.3333333334, ans=0.0 2023-10-02 22:16:56,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:16:58,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:16:58,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 22:16:59,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:17:01,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 22:17:03,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:17:05,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:17:05,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:17:05,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:17:05,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 22:17:05,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1038080.0, ans=0.95 2023-10-02 22:17:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 22:17:12,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:17:14,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 22:17:19,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1038146.6666666666, ans=0.125 2023-10-02 22:17:23,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 22:17:23,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:25,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 22:17:28,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:17:31,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:17:32,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:17:32,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:17:32,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:36,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:17:36,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 22:17:37,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:17:38,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:17:38,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:17:38,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1038213.3333333334, ans=0.1 2023-10-02 22:17:40,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:17:44,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:17:44,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 22:17:44,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1038213.3333333334, ans=0.125 2023-10-02 22:17:45,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:17:46,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 22:17:47,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 22:17:47,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. 
Duration: 0.7 2023-10-02 22:17:47,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:17:48,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:17:48,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:50,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:50,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 22:17:53,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:56,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:17:56,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:17:59,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 22:18:01,897 INFO [train.py:1046] (1/4) Epoch 30, batch 1700, loss[loss=0.1666, simple_loss=0.2222, pruned_loss=0.05549, over 22702.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2433, pruned_loss=0.04336, over 4720416.35 frames. ], batch size: 322, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:18:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 22:18:03,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:18:04,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 22:18:05,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:18:05,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:18:05,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:18:07,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:18:07,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:18:07,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 22:18:11,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:18:19,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:18:22,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:18:25,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1038413.3333333334, ans=0.125 2023-10-02 22:18:27,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:18:27,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:18:28,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:18:28,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1038413.3333333334, ans=0.0 2023-10-02 22:18:29,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:18:31,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 22:18:32,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:18:32,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:34,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:18:35,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. 
Duration: 0.7 2023-10-02 22:18:36,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1038480.0, ans=0.125 2023-10-02 22:18:39,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 22:18:39,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 22:18:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:43,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 22:18:43,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:18:47,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1038546.6666666666, ans=0.1 2023-10-02 22:18:50,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:18:51,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:18:53,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:18:54,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:18:54,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. 
Duration: 0.363625 2023-10-02 22:18:54,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:18:57,825 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.894e+02 2.111e+02 2.412e+02 3.601e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-02 22:18:57,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:57,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 22:18:58,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:18:58,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:18:59,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:59,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:00,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:19:00,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:19:02,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:19:02,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:19:02,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:06,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:08,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 22:19:10,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:14,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 22:19:15,700 INFO [train.py:1046] (1/4) Epoch 30, batch 1750, loss[loss=0.1525, simple_loss=0.1999, pruned_loss=0.05249, over 18939.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2422, pruned_loss=0.04293, over 4724617.42 frames. ], batch size: 389, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:19:21,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:22,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:22,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 22:19:24,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 22:19:24,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:19:26,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:19:27,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:31,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 22:19:34,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:36,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 22:19:36,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:19:38,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:19:39,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1038746.6666666666, ans=0.2 2023-10-02 22:19:40,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:19:40,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. 
Duration: 0.43 2023-10-02 22:19:41,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1038746.6666666666, ans=10.0 2023-10-02 22:19:43,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:19:43,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 22:19:51,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:19:52,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:19:52,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:55,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:55,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:58,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:58,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:01,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 22:20:01,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:20:01,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 22:20:02,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:20:05,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=22.5 2023-10-02 22:20:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 22:20:06,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:20:06,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1038880.0, ans=0.0 2023-10-02 22:20:08,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:20:08,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:20:13,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:20:13,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 22:20:15,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:20:16,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:20:19,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:20:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:20:24,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:20:25,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 22:20:25,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:20:27,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:20:27,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:27,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:20:27,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:20:28,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:20:30,047 INFO [train.py:1046] (1/4) Epoch 30, batch 1800, loss[loss=0.1524, simple_loss=0.2315, pruned_loss=0.03666, over 24247.00 frames. 
], tot_loss[loss=0.164, simple_loss=0.2423, pruned_loss=0.04289, over 4734517.72 frames. ], batch size: 56, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:20:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:20:31,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:32,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:20:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:20:42,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:20:43,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:20:44,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:20:46,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1039080.0, ans=0.125 2023-10-02 22:20:48,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:48,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:49,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 22:20:51,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:20:52,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 22:20:53,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:20:56,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:02,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 22:21:02,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1039146.6666666666, ans=0.125 2023-10-02 22:21:03,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 22:21:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 22:21:03,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:06,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:21:06,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:21:06,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 22:21:07,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1039146.6666666666, ans=0.125 2023-10-02 22:21:12,803 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 22:21:14,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:21:17,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:19,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 22:21:19,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 22:21:20,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.67 vs. limit=10.0 2023-10-02 22:21:20,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:21:22,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:21:22,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 22:21:23,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1039213.3333333334, ans=0.1 2023-10-02 22:21:26,132 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.833e+02 1.992e+02 2.212e+02 2.839e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 22:21:26,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 22:21:31,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:21:32,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1039280.0, ans=0.0 2023-10-02 22:21:33,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 22:21:33,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:21:33,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:33,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:21:35,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 22:21:38,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:21:38,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:21:40,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 22:21:40,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:43,967 INFO [train.py:1046] (1/4) Epoch 30, batch 1850, loss[loss=0.1608, simple_loss=0.2382, pruned_loss=0.04164, over 23307.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2432, pruned_loss=0.04303, over 4739673.62 frames. ], batch size: 119, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:21:44,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:21:44,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:21:44,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:46,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:46,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:21:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:21:48,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:21:52,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 22:21:53,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:21:57,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.83 vs. limit=15.0 2023-10-02 22:22:00,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:22:00,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 22:22:00,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1039413.3333333334, ans=0.125 2023-10-02 22:22:02,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 22:22:05,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 22:22:07,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:22:07,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 22:22:07,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 22:22:19,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:22:21,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. 
Duration: 0.98 2023-10-02 22:22:23,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1039480.0, ans=0.1 2023-10-02 22:22:24,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:22:25,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:22:28,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 22:22:30,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:30,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:22:31,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:22:34,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:22:37,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:22:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:22:40,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:41,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-02 22:22:41,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:22:42,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1039613.3333333334, ans=0.1 2023-10-02 22:22:43,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:22:44,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:22:46,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 22:22:46,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:22:50,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:22:52,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:22:52,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 22:22:52,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 22:22:53,667 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 22:22:55,579 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-02 22:22:55,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:22:56,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:22:56,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:22:56,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:58,253 INFO [train.py:1046] (1/4) Epoch 30, batch 1900, loss[loss=0.1664, simple_loss=0.2398, pruned_loss=0.04648, over 23740.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2442, pruned_loss=0.04335, over 4735399.01 frames. ], batch size: 179, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:22:58,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 22:22:58,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:22:58,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:58,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1039680.0, ans=0.0 2023-10-02 22:22:59,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:23:01,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-02 22:23:01,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:23:02,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 22:23:06,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:23:06,027 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 22:23:06,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:23:07,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:23:07,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1039680.0, ans=0.05 2023-10-02 22:23:10,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:23:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:23:13,003 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 22:23:13,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1039746.6666666666, ans=0.0 2023-10-02 22:23:14,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-10-02 22:23:14,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=1039746.6666666666, ans=10.0 2023-10-02 22:23:15,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:23:17,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:23:17,750 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 22:23:17,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 22:23:21,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 22:23:24,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:23:28,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 22:23:30,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 22:23:37,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1039813.3333333334, ans=0.0 2023-10-02 22:23:39,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 22:23:41,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-10-02 22:23:41,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:23:41,202 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 22:23:41,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 22:23:42,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 22:23:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 22:23:42,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:23:45,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 22:23:50,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:23:51,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:23:51,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 22:23:54,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 22:23:56,206 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.917e+02 2.185e+02 2.667e+02 3.803e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-02 22:23:57,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 22:23:57,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:24:01,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1039946.6666666666, ans=0.1 2023-10-02 22:24:04,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1039946.6666666666, ans=0.0 2023-10-02 22:24:05,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:24:05,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:24:05,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:24:07,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:24:07,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:24:08,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-10-02 22:24:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:24:14,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:24:14,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:24:15,630 INFO [train.py:1046] (1/4) Epoch 30, batch 1950, loss[loss=0.1755, simple_loss=0.2613, pruned_loss=0.04483, over 24397.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2451, pruned_loss=0.04367, over 4721163.99 frames. ], batch size: 77, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:24:15,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:24:15,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:24:17,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:24:19,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:24:20,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.87 vs. limit=15.0 2023-10-02 22:24:21,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:24:24,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 22:24:24,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:24,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:24:27,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 22:24:28,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 22:24:28,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:29,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:32,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:24:32,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:24:32,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:35,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:24:35,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1040080.0, ans=0.125 2023-10-02 22:24:38,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 22:24:38,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:24:40,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:24:40,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:42,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:44,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:24:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:24:44,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:24:44,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 22:24:45,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:24:45,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:24:47,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:49,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:24:51,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:24:53,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1040146.6666666666, ans=15.0 2023-10-02 22:24:54,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:24:54,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1040146.6666666666, ans=0.1 2023-10-02 22:24:57,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:24:59,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:24:59,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 22:24:59,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:25:02,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:25:04,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:25:05,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-10-02 22:25:13,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:13,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:16,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:18,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:25:20,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:25:22,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:25:22,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 22:25:22,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:25:23,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:25:25,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 22:25:26,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:25:29,861 INFO [train.py:1046] (1/4) Epoch 30, batch 2000, loss[loss=0.205, simple_loss=0.2729, pruned_loss=0.06856, over 19720.00 frames. 
], tot_loss[loss=0.1669, simple_loss=0.2459, pruned_loss=0.04401, over 4723152.48 frames. ], batch size: 389, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:25:30,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:25:31,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:25:31,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:25:33,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:25:36,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:36,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-02 22:25:40,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 22:25:40,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:25:42,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1040346.6666666666, ans=0.1 2023-10-02 22:25:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:25:44,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. 
Duration: 0.73 2023-10-02 22:25:46,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:25:47,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:25:49,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:25:51,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 22:25:53,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:53,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1040413.3333333334, ans=0.0 2023-10-02 22:25:54,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:55,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:55,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 22:25:57,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:25:59,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1040480.0, ans=0.125 2023-10-02 22:26:00,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. 
Duration: 0.93 2023-10-02 22:26:00,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:26:01,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:05,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:26:05,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:05,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:06,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:26:08,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 22:26:09,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 22:26:11,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:26:11,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:15,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:16,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:26:16,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:26:18,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:26:18,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:18,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1040546.6666666666, ans=0.125 2023-10-02 22:26:19,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:19,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:26:19,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:22,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:25,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:26:26,527 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.011e+02 2.210e+02 2.489e+02 4.189e+02, threshold=4.420e+02, percent-clipped=0.0 2023-10-02 22:26:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-10-02 22:26:31,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:26:32,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:34,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:34,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:26:38,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1040613.3333333334, ans=0.2 2023-10-02 22:26:39,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:41,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:41,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:41,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:26:41,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:26:43,990 INFO [train.py:1046] (1/4) Epoch 30, batch 2050, loss[loss=0.1476, simple_loss=0.2109, pruned_loss=0.04214, over 23596.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2447, pruned_loss=0.04337, over 4727793.72 frames. 
], batch size: 256, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:26:44,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:45,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:48,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:49,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:52,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:53,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:26:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:55,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:26:58,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 22:26:58,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:26:59,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:27:00,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:27:10,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:27:11,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:27:13,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 22:27:15,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:27:15,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 22:27:16,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:27:16,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1040813.3333333334, ans=0.125 2023-10-02 22:27:17,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1040813.3333333334, ans=0.125 2023-10-02 22:27:18,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:27:20,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:21,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. 
Duration: 0.6 2023-10-02 22:27:23,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:27:24,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:27:25,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:27:25,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:27:29,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:32,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:27:33,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:27:34,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:27:36,339 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.57 vs. limit=15.0 2023-10-02 22:27:38,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:27:42,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:27:44,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-10-02 22:27:48,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:27:49,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:27:52,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:27:53,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 22:27:55,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1041013.3333333334, ans=0.0 2023-10-02 22:27:56,625 INFO [train.py:1046] (1/4) Epoch 30, batch 2100, loss[loss=0.1439, simple_loss=0.2262, pruned_loss=0.03082, over 24448.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2429, pruned_loss=0.04298, over 4713932.81 frames. ], batch size: 58, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:27:56,719 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 22:27:56,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:27:57,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:58,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:28:00,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:28:00,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 22:28:00,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 22:28:01,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:28:04,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:28:06,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:28:09,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:10,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:28:10,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 22:28:12,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:28:12,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 22:28:12,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 22:28:13,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:28:13,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:28:13,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 22:28:13,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 22:28:19,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 22:28:19,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:28:22,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:28:23,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:28:26,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:28:26,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 22:28:28,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:28,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 22:28:29,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. 
Duration: 0.87 2023-10-02 22:28:29,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:29,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 22:28:29,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1041146.6666666666, ans=0.125 2023-10-02 22:28:31,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 22:28:31,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 22:28:34,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:28:35,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:28:38,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:28:40,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:28:41,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:42,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. 
Duration: 0.84 2023-10-02 22:28:43,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:43,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:43,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:43,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 22:28:44,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 22:28:46,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 22:28:50,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:28:50,727 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.71 vs. limit=10.0 2023-10-02 22:28:51,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1041213.3333333334, ans=0.0 2023-10-02 22:28:52,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:28:52,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-10-02 22:28:53,986 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.874e+02 2.097e+02 2.519e+02 4.862e+02, threshold=4.194e+02, percent-clipped=1.0 2023-10-02 22:28:57,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:00,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1041280.0, ans=0.2 2023-10-02 22:29:01,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:29:01,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:01,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:29:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 22:29:03,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:29:04,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:06,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:29:06,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:29:06,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:29:07,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=9.80 vs. limit=12.0 2023-10-02 22:29:08,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 22:29:11,604 INFO [train.py:1046] (1/4) Epoch 30, batch 2150, loss[loss=0.1635, simple_loss=0.2506, pruned_loss=0.0382, over 24667.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2421, pruned_loss=0.04275, over 4702545.77 frames. ], batch size: 68, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:29:11,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 22:29:11,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:14,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:29:14,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:29:15,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:29:15,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:29:21,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 22:29:22,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:29:24,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:26,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:29:26,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:26,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:29:28,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:29,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:29:29,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:29:31,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1041413.3333333334, ans=0.0 2023-10-02 22:29:32,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:32,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 22:29:37,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:39,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 22:29:40,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:41,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-10-02 22:29:42,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:42,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:43,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:29:43,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:44,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:29:45,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:46,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 22:29:48,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:29:48,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:29:49,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:49,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:29:51,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:29:53,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:53,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:29:56,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:56,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 22:29:56,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:29:56,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1041546.6666666666, ans=0.0 2023-10-02 22:29:59,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:59,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:00,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:30:00,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:30:02,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:02,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:02,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 22:30:05,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 22:30:05,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:30:05,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 22:30:05,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:05,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:30:05,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1041546.6666666666, ans=0.1 2023-10-02 22:30:07,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 22:30:07,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 22:30:07,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 22:30:07,132 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 22:30:07,132 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 22:30:09,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 22:30:09,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1041613.3333333334, ans=0.125 2023-10-02 22:30:10,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:10,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:30:10,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:30:12,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:13,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:30:14,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:15,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:30:16,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1041613.3333333334, ans=0.0 2023-10-02 22:30:23,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:30:23,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 22:30:24,671 INFO [train.py:1046] (1/4) Epoch 30, batch 2200, loss[loss=0.1573, simple_loss=0.2422, pruned_loss=0.03621, over 24475.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2425, pruned_loss=0.04265, over 4708759.13 frames. ], batch size: 66, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:30:27,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:30:30,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:32,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:30:32,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:30:34,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:30:38,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:38,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 22:30:38,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 22:30:38,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1041746.6666666666, ans=0.125 2023-10-02 22:30:42,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 22:30:44,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:30:44,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1041746.6666666666, ans=0.125 2023-10-02 22:30:49,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 22:30:52,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:54,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:30:54,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:30:54,985 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=15.0 2023-10-02 22:30:57,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:30:57,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. 
Duration: 0.94 2023-10-02 22:31:00,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:31:01,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:03,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 22:31:05,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:31:07,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:31:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:31:12,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:14,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 22:31:15,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:15,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 22:31:18,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:19,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 22:31:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:21,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:31:22,477 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.843e+02 2.023e+02 2.325e+02 3.252e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-02 22:31:22,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:31:22,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:22,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:24,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:31:25,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:31:28,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:31:30,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:31:30,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:31:33,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1041946.6666666666, ans=0.2 2023-10-02 22:31:35,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:31:35,503 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 22:31:38,059 INFO [train.py:1046] (1/4) Epoch 30, batch 2250, loss[loss=0.1788, simple_loss=0.254, pruned_loss=0.05184, over 23672.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2428, pruned_loss=0.04316, over 4710569.52 frames. ], batch size: 232, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:31:38,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:31:38,170 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 22:31:39,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:31:40,899 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 22:31:42,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:42,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:31:44,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:46,175 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. 
Duration: 0.9045625 2023-10-02 22:31:46,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1042013.3333333334, ans=0.125 2023-10-02 22:31:48,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:31:50,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:31:55,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:31:56,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:31:59,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:31:59,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:32:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:32:02,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 22:32:02,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:32:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 22:32:02,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1042080.0, ans=0.125 2023-10-02 22:32:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 22:32:06,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:32:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:32:07,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:32:08,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1042146.6666666666, ans=0.05 2023-10-02 22:32:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:32:15,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:32:15,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:32:16,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 22:32:18,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:32:20,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:32:24,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:32:25,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:32:26,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:32:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:32:29,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:32:31,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:32:35,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:32:37,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:32:37,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1042280.0, ans=0.125 2023-10-02 22:32:41,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:32:41,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 22:32:42,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:32:47,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:32:49,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:32:49,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 22:32:50,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:32:50,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:32:52,154 INFO [train.py:1046] (1/4) Epoch 30, batch 2300, loss[loss=0.1517, simple_loss=0.2344, pruned_loss=0.0345, over 24665.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2441, pruned_loss=0.04336, over 4720038.87 frames. ], batch size: 65, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:32:53,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 22:32:55,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:32:56,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:32:58,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1042346.6666666666, ans=0.125 2023-10-02 22:33:01,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:01,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:33:03,419 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 22:33:06,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:13,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:33:13,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:33:15,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:15,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:15,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 22:33:16,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:33:19,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 22:33:19,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:33:24,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:33:27,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:33:28,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:33:32,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:33:32,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:35,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:33:38,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:42,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:33:42,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:33:42,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:33:44,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. 
Duration: 0.47 2023-10-02 22:33:49,142 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.877e+02 2.187e+02 2.420e+02 3.762e+02, threshold=4.375e+02, percent-clipped=0.0 2023-10-02 22:33:49,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:33:49,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:49,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:33:49,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:33:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:33:50,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 22:33:50,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:33:50,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 22:33:50,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:33:52,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:53,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-10-02 22:33:58,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:34:02,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:34:05,043 INFO [train.py:1046] (1/4) Epoch 30, batch 2350, loss[loss=0.169, simple_loss=0.2369, pruned_loss=0.05056, over 23485.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2448, pruned_loss=0.04367, over 4718808.70 frames. ], batch size: 134, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:34:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:34:05,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:34:05,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1042680.0, ans=0.125 2023-10-02 22:34:06,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:34:06,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:34:06,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:34:07,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:34:08,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-10-02 22:34:16,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:34:16,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 22:34:16,463 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:34:19,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 22:34:21,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1042746.6666666666, ans=15.0 2023-10-02 22:34:24,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:34:25,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:25,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:25,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:34:26,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:34:28,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 22:34:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:34:35,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 22:34:36,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1042813.3333333334, ans=10.0 2023-10-02 22:34:37,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:34:41,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:34:41,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:34:42,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:34:44,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 22:34:44,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:34:48,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:34:48,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:34:48,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:34:50,313 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-10-02 22:34:52,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:34:53,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 22:34:55,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:34:56,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:56,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:34:57,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1042880.0, ans=0.0 2023-10-02 22:34:58,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 22:34:59,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:35:02,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 22:35:02,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 22:35:02,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1042880.0, ans=0.125 2023-10-02 22:35:06,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 22:35:11,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 22:35:12,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:35:12,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 22:35:12,700 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 22:35:12,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 22:35:15,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 22:35:17,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:35:19,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1043013.3333333334, ans=0.1 2023-10-02 22:35:20,521 INFO [train.py:1046] (1/4) Epoch 30, batch 2400, loss[loss=0.1651, simple_loss=0.2405, pruned_loss=0.04489, over 23392.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2444, pruned_loss=0.04354, over 4711776.05 frames. 
], batch size: 93, lr: 3.41e-03, grad_scale: 32.0 2023-10-02 22:35:23,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:35:26,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:35:27,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:35:27,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 22:35:29,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 22:35:36,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:35:36,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:35:38,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 22:35:38,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:35:39,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:35:39,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 22:35:45,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:35:46,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 22:35:47,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1043080.0, ans=0.125 2023-10-02 22:35:51,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:35:54,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 22:35:57,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:35:59,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:00,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1043146.6666666666, ans=0.125 2023-10-02 22:36:03,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:36:03,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 22:36:03,906 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.80 vs. limit=22.5 2023-10-02 22:36:04,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 22:36:04,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1043213.3333333334, ans=0.125 2023-10-02 22:36:11,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1043213.3333333334, ans=0.0 2023-10-02 22:36:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:14,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:36:16,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:18,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:36:18,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:36:18,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:36:18,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:19,280 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.363e+02 1.815e+02 2.119e+02 2.399e+02 3.814e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 22:36:19,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:36:19,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:36:25,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:36:25,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:36:25,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 22:36:26,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 22:36:29,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:36:29,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:29,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 22:36:29,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 22:36:31,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 22:36:31,509 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 22:36:31,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. 
Duration: 0.57 2023-10-02 22:36:32,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:36:34,345 INFO [train.py:1046] (1/4) Epoch 30, batch 2450, loss[loss=0.1526, simple_loss=0.2315, pruned_loss=0.03682, over 19069.00 frames. ], tot_loss[loss=0.164, simple_loss=0.242, pruned_loss=0.04296, over 4704357.71 frames. ], batch size: 41, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:36:34,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:34,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:36:35,872 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 22:36:36,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1043346.6666666666, ans=0.125 2023-10-02 22:36:37,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:38,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:36:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:36:41,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:36:45,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 22:36:45,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:36:46,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 22:36:52,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:36:52,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:55,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:36:55,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:36:55,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:36:55,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 22:36:55,865 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:36:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:37:01,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:37:02,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 22:37:05,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:37:05,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:07,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:07,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:37:07,953 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-10-02 22:37:10,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 22:37:10,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:37:17,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:19,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:37:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:37:20,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 22:37:20,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:22,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:37:22,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 22:37:24,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1043546.6666666666, ans=0.125 2023-10-02 22:37:25,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:26,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:37:29,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:37:29,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:37:34,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:37:34,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. 
Duration: 0.92 2023-10-02 22:37:34,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1043613.3333333334, ans=0.1 2023-10-02 22:37:36,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:37:36,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:37:36,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 22:37:37,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:37:37,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:37:41,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:37:43,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:44,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:37:45,713 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.30 vs. limit=22.5 2023-10-02 22:37:47,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. 
Duration: 0.83 2023-10-02 22:37:47,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:37:48,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.09 vs. limit=22.5 2023-10-02 22:37:49,179 INFO [train.py:1046] (1/4) Epoch 30, batch 2500, loss[loss=0.1571, simple_loss=0.2412, pruned_loss=0.03649, over 24473.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2414, pruned_loss=0.04255, over 4709008.22 frames. ], batch size: 63, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:37:53,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1043680.0, ans=0.0 2023-10-02 22:37:55,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:37:57,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-02 22:38:03,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:38:05,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:38:05,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:38:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-10-02 22:38:05,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1043746.6666666666, ans=0.0 2023-10-02 22:38:09,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1043746.6666666666, ans=0.125 2023-10-02 22:38:09,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1043746.6666666666, ans=0.2 2023-10-02 22:38:12,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:38:12,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:38:13,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 22:38:13,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:38:15,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 22:38:15,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:16,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:38:16,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 22:38:18,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 22:38:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 22:38:18,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:24,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:38:25,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:38:28,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:38:28,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 22:38:28,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:38:30,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:34,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:38,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:41,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 22:38:45,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1043880.0, ans=0.0 2023-10-02 22:38:46,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:38:48,328 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.895e+02 2.033e+02 2.318e+02 3.238e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 22:38:48,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1043946.6666666666, ans=0.0 2023-10-02 22:38:49,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 22:38:49,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:38:49,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:38:50,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=13.10 vs. limit=15.0 2023-10-02 22:38:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:38:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:38:53,825 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 22:38:53,825 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. 
Duration: 0.768 2023-10-02 22:38:53,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 22:38:56,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:58,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 22:38:58,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 22:39:00,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:39:00,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 22:39:02,812 INFO [train.py:1046] (1/4) Epoch 30, batch 2550, loss[loss=0.1832, simple_loss=0.2508, pruned_loss=0.05781, over 23823.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2422, pruned_loss=0.04284, over 4714741.87 frames. ], batch size: 212, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:39:04,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 22:39:06,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:39:07,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:39:09,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 22:39:11,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:39:11,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 22:39:13,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:39:18,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 22:39:18,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:39:21,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:21,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1044080.0, ans=0.125 2023-10-02 22:39:23,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:39:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 22:39:25,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:39:25,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:39:25,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 22:39:26,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:39:26,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 22:39:28,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:39:28,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:28,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 22:39:40,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:39:43,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:39:44,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:44,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:39:46,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:39:52,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:39:56,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 22:39:56,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:39:56,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:39:56,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:39:56,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1044213.3333333334, ans=0.1 2023-10-02 22:39:57,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:40:02,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:40:02,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:40:05,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:40:05,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 22:40:05,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:40:05,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:40:07,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:40:08,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:40:11,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:16,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:40:17,449 INFO [train.py:1046] (1/4) Epoch 30, batch 2600, loss[loss=0.1634, simple_loss=0.2454, pruned_loss=0.04069, over 24661.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2422, pruned_loss=0.0429, over 4703650.02 frames. ], batch size: 65, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:40:19,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:22,087 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 22:40:23,488 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 22:40:23,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:40:23,542 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 22:40:24,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 22:40:24,894 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. 
Duration: 0.768 2023-10-02 22:40:27,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:40:29,108 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 22:40:29,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 22:40:31,104 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 22:40:31,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1044413.3333333334, ans=0.125 2023-10-02 22:40:31,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1044413.3333333334, ans=0.125 2023-10-02 22:40:32,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:40:33,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 22:40:36,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 22:40:37,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:40:39,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 22:40:41,269 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 22:40:41,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. 
Duration: 0.55 2023-10-02 22:40:47,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:40:47,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:47,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:40:47,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 22:40:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:40:52,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 22:40:56,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1044480.0, ans=0.125 2023-10-02 22:40:59,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:59,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:00,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 22:41:00,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:41:00,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:41:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 22:41:04,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:41:04,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:41:06,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1044546.6666666666, ans=0.125 2023-10-02 22:41:07,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:12,274 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 22:41:12,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:12,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:41:16,856 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.854e+02 2.030e+02 2.260e+02 3.001e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 22:41:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:41:20,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:41:20,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-10-02 22:41:20,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:41:23,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:41:23,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:41:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 22:41:29,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:31,752 INFO [train.py:1046] (1/4) Epoch 30, batch 2650, loss[loss=0.1632, simple_loss=0.2517, pruned_loss=0.03739, over 24446.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2425, pruned_loss=0.04291, over 4715267.88 frames. ], batch size: 69, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:41:31,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:41:36,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 22:41:36,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:37,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:41:37,484 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. 
Duration: 0.768 2023-10-02 22:41:37,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:41:37,811 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:41:40,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:42,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:41:42,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:41:44,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:46,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 22:41:46,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:41:48,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:41:48,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1044746.6666666666, ans=0.0 2023-10-02 22:41:49,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 22:41:51,671 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-10-02 22:41:54,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:41:57,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 22:41:57,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:41:57,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 22:41:58,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.85 vs. limit=22.5 2023-10-02 22:42:00,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:01,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:42:01,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:07,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 22:42:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 22:42:11,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-10-02 22:42:16,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 22:42:16,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:17,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:17,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:42:17,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1044880.0, ans=0.07 2023-10-02 22:42:18,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:42:18,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:42:20,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:42:22,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:42:22,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1044880.0, ans=0.125 2023-10-02 22:42:23,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:42:23,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. 
Duration: 0.5 2023-10-02 22:42:25,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:42:26,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:28,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:42:29,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:29,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:42:29,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:42:29,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1044946.6666666666, ans=0.125 2023-10-02 22:42:33,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:33,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:42:33,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:35,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 22:42:38,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:42:40,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:42,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:43,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:44,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:42:44,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:46,081 INFO [train.py:1046] (1/4) Epoch 30, batch 2700, loss[loss=0.162, simple_loss=0.2305, pruned_loss=0.04679, over 23389.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2426, pruned_loss=0.04292, over 4723856.92 frames. ], batch size: 285, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:42:47,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:42:47,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 22:42:48,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.76 vs. limit=15.0 2023-10-02 22:42:48,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.83 vs. 
limit=12.0 2023-10-02 22:42:49,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1045013.3333333334, ans=0.2 2023-10-02 22:42:50,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:42:52,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 22:42:53,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:53,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:53,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:55,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:42:56,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:56,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:42:57,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:42:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 22:42:58,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 22:43:01,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:43:01,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:43:02,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:43:05,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:43:06,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 22:43:06,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:43:11,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:43:11,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:18,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:43:18,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:43:18,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:43:18,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 22:43:21,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:43:21,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1045146.6666666666, ans=0.125 2023-10-02 22:43:26,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:43:26,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:43:26,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:43:29,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:29,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:43:36,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:43:36,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:43:38,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.80 vs. limit=15.0 2023-10-02 22:43:41,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-02 22:43:41,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:43:45,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:45,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:43:46,892 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.911e+02 2.096e+02 2.404e+02 3.352e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-02 22:43:46,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:43:48,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:43:49,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:49,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:43:52,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:43:53,791 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.75 vs. limit=15.0 2023-10-02 22:43:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 22:43:54,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:57,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 22:43:59,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:43:59,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1045346.6666666666, ans=0.0 2023-10-02 22:44:00,639 INFO [train.py:1046] (1/4) Epoch 30, batch 2750, loss[loss=0.1572, simple_loss=0.246, pruned_loss=0.03421, over 24436.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2426, pruned_loss=0.04269, over 4735299.41 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:44:00,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:44:00,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 22:44:03,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 22:44:03,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:04,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:04,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:44:08,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:09,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:44:09,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:11,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:12,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:44:12,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:44:12,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:12,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 22:44:12,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:44:12,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:18,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 22:44:21,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:44:21,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:21,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:44:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:44:22,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:44:26,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:44:26,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:26,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:30,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:44:30,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:44:30,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-02 22:44:31,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1045480.0, ans=0.1 2023-10-02 22:44:32,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:32,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:44:32,385 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:44:39,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:41,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:44:42,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:46,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:46,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:44:47,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:44:49,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1045546.6666666666, ans=0.2 2023-10-02 22:44:52,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 22:44:52,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:44:52,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 22:44:56,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:58,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 22:45:01,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 22:45:05,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:45:05,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 22:45:07,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:45:08,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:45:09,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 22:45:09,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:45:12,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-10-02 22:45:12,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:12,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:45:14,074 INFO [train.py:1046] (1/4) Epoch 30, batch 2800, loss[loss=0.1471, simple_loss=0.2258, pruned_loss=0.0342, over 22008.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2411, pruned_loss=0.04212, over 4720189.62 frames. ], batch size: 48, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:45:14,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 22:45:14,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:14,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:15,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:17,086 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 22:45:17,089 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 22:45:21,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:22,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.02 vs. 
limit=22.5 2023-10-02 22:45:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:45:23,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:45:28,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:45:29,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 22:45:31,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 22:45:33,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 22:45:34,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:34,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:45:34,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:45:38,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:45:39,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.86 vs. limit=22.5 2023-10-02 22:45:40,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 22:45:40,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:45:41,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:45:41,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1045746.6666666666, ans=0.125 2023-10-02 22:45:48,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:45:51,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:54,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:54,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:45:55,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:45:59,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:45:59,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 22:46:00,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:46:02,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:46:02,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:46:06,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:06,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:46:12,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:46:13,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:13,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:46:13,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:46:13,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 22:46:14,908 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.953e+02 2.195e+02 2.519e+02 3.830e+02, threshold=4.390e+02, percent-clipped=0.0 2023-10-02 22:46:14,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:46:15,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 22:46:15,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:17,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:46:17,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:18,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.94 vs. limit=8.0 2023-10-02 22:46:19,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 22:46:20,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:46:20,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:46:20,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:46:22,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-10-02 22:46:28,649 INFO [train.py:1046] (1/4) Epoch 30, batch 2850, loss[loss=0.1599, simple_loss=0.2293, pruned_loss=0.04529, over 23568.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2409, pruned_loss=0.0419, over 4720017.31 frames. ], batch size: 256, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:46:28,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:46:30,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:46:30,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:46:32,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:46:35,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:46:35,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:46:35,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:38,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:46:39,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:42,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 22:46:42,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 22:46:46,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1046080.0, ans=0.125 2023-10-02 22:46:49,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 22:46:49,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:46:50,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 22:46:50,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:53,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 22:46:53,844 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:46:54,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 22:46:56,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:07,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:47:09,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 22:47:09,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:47:09,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:47:10,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:47:10,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:47:10,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1046146.6666666666, ans=0.0 2023-10-02 22:47:11,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:47:12,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 22:47:14,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:47:14,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:47:16,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:47:17,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:18,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:47:18,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:20,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:24,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:47:24,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:47:25,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:27,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:29,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:47:34,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:47:36,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 22:47:36,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 22:47:39,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:47:40,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:47:40,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 22:47:40,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:47:41,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:47:41,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:47:41,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:47:41,956 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 22:47:41,993 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 22:47:41,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:47:43,298 INFO [train.py:1046] (1/4) Epoch 30, batch 2900, loss[loss=0.1545, simple_loss=0.2265, pruned_loss=0.04122, over 23453.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2413, pruned_loss=0.04201, over 4731008.72 frames. ], batch size: 134, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:47:43,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:47,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-02 22:47:47,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:47:47,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:47:49,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 22:47:54,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:54,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 22:47:55,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 22:47:57,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:47:57,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:47:58,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:48:03,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 22:48:03,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1046413.3333333334, ans=0.125 2023-10-02 22:48:05,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:48:06,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:48:07,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 22:48:07,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:48:10,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.35 vs. limit=15.0 2023-10-02 22:48:12,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:13,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 22:48:13,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 22:48:15,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1046480.0, ans=0.1 2023-10-02 22:48:16,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:48:16,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. 
Duration: 0.48 2023-10-02 22:48:16,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:48:18,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:48:18,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:48:20,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:48:22,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:24,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:48:26,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:48:27,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 22:48:29,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 22:48:29,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:48:33,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:48:36,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. 
Duration: 0.79 2023-10-02 22:48:36,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1046546.6666666666, ans=0.0 2023-10-02 22:48:37,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:48:43,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:45,104 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.984e+02 2.334e+02 2.809e+02 4.390e+02, threshold=4.669e+02, percent-clipped=1.0 2023-10-02 22:48:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:48:50,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:48:52,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 22:48:54,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:48:54,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 22:48:55,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:48:55,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. 
Duration: 0.82225 2023-10-02 22:48:55,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1046680.0, ans=0.0 2023-10-02 22:48:56,368 INFO [train.py:1046] (1/4) Epoch 30, batch 2950, loss[loss=0.1809, simple_loss=0.2676, pruned_loss=0.04707, over 23991.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2422, pruned_loss=0.04242, over 4719641.13 frames. ], batch size: 86, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:48:56,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1046680.0, ans=0.0 2023-10-02 22:49:02,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:49:02,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1046680.0, ans=0.1 2023-10-02 22:49:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 22:49:05,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:49:05,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:07,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:07,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:49:10,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. 
Duration: 0.94 2023-10-02 22:49:11,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 22:49:11,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:49:11,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:49:17,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:49:19,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:49:20,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:49:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:49:24,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:49:24,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:49:24,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:26,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:26,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-02 22:49:28,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 22:49:35,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 22:49:35,188 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 22:49:35,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:49:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 22:49:39,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 22:49:39,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:49:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:49:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 22:49:41,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:49:43,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 22:49:45,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 22:49:45,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:49:47,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:48,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:49:48,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:49:50,022 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 22:49:50,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:51,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 22:49:55,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:49:57,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:49:57,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 22:49:57,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:49:58,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. 
Duration: 0.49 2023-10-02 22:49:59,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.48 vs. limit=15.0 2023-10-02 22:50:01,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:50:03,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:50:03,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:50:06,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:50:06,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:50:06,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:50:07,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:07,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:50:07,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:50:08,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:50:09,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:50:10,829 INFO [train.py:1046] (1/4) Epoch 30, batch 3000, loss[loss=0.1396, simple_loss=0.2215, pruned_loss=0.02881, over 24474.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2435, pruned_loss=0.04316, over 4712747.02 frames. ], batch size: 58, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:50:10,830 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 22:50:22,558 INFO [train.py:1078] (1/4) Epoch 30, validation: loss=0.3782, simple_loss=0.2831, pruned_loss=0.2366, over 1125622.00 frames. 2023-10-02 22:50:22,559 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 22:50:22,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:22,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 22:50:24,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:25,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:50:25,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:50:30,162 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 22:50:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. 
Duration: 0.98 2023-10-02 22:50:33,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:50:33,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:50:33,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 22:50:34,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:50:40,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:50:48,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:50:55,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 22:50:56,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:50:59,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:50:59,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:51:01,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:51:02,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 22:51:02,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 22:51:06,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 22:51:07,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:51:07,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:51:10,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:51:10,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:51:12,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:12,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:51:16,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:51:17,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:51:17,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:51:19,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 22:51:20,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 22:51:20,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:51:22,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:22,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:51:24,945 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.821e+02 2.047e+02 2.427e+02 4.890e+02, threshold=4.095e+02, percent-clipped=1.0 2023-10-02 22:51:26,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:26,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:27,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 22:51:29,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 22:51:29,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:51:30,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 22:51:30,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 22:51:31,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 22:51:33,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:51:35,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 22:51:35,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 22:51:35,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 22:51:35,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:51:35,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:51:37,315 INFO [train.py:1046] (1/4) Epoch 30, batch 3050, loss[loss=0.1655, simple_loss=0.2543, pruned_loss=0.03841, over 24016.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2445, pruned_loss=0.04339, over 4715682.55 frames. ], batch size: 80, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:51:38,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:38,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:51:38,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 22:51:38,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:51:38,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1047346.6666666666, ans=0.125 2023-10-02 22:51:41,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 22:51:44,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:51:47,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:51:47,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:51:50,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:52,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 22:51:53,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.62 vs. limit=15.0 2023-10-02 22:51:58,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 22:51:58,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 22:51:58,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:51:58,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1047413.3333333334, ans=0.1 2023-10-02 22:52:02,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:52:06,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:52:06,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:06,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1047480.0, ans=0.125 2023-10-02 22:52:11,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:52:11,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:52:12,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:12,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:52:12,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:52:13,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:15,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:18,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:18,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 22:52:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:19,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:52:20,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:52:21,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:52:22,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:52:23,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:25,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1047546.6666666666, ans=0.2 2023-10-02 22:52:29,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 22:52:29,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:30,639 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:52:34,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:52:34,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:52:38,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:52:38,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:52:39,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 22:52:41,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:52:42,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:43,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. 
Duration: 0.98 2023-10-02 22:52:44,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:49,793 INFO [train.py:1046] (1/4) Epoch 30, batch 3100, loss[loss=0.1731, simple_loss=0.2584, pruned_loss=0.04392, over 23373.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2438, pruned_loss=0.0431, over 4723806.67 frames. ], batch size: 93, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:52:49,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:51,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:52:52,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:52:54,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 22:52:56,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 22:52:58,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 22:53:00,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:53:04,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:53:04,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:53:05,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:53:10,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:15,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 22:53:18,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1047813.3333333334, ans=0.2 2023-10-02 22:53:19,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 22:53:19,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:20,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:53:20,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:53:22,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:53:22,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:53:22,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 22:53:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 22:53:23,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:25,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 22:53:26,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:53:26,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1047813.3333333334, ans=0.04949747468305833 2023-10-02 22:53:27,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1047813.3333333334, ans=0.05 2023-10-02 22:53:30,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:53:32,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 22:53:32,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 22:53:34,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:34,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:36,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 22:53:36,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:36,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:53:39,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:53:39,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:53:41,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:53:41,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:53:41,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:41,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 22:53:45,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:53:46,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 22:53:48,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:53:48,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-10-02 22:53:49,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:53:49,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:49,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 22:53:51,051 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.847e+02 2.082e+02 2.389e+02 3.344e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 22:53:54,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1047946.6666666666, ans=0.2 2023-10-02 22:53:58,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.34 vs. limit=22.5 2023-10-02 22:54:02,411 INFO [train.py:1046] (1/4) Epoch 30, batch 3150, loss[loss=0.1587, simple_loss=0.2373, pruned_loss=0.04005, over 24482.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2423, pruned_loss=0.04308, over 4721389.06 frames. ], batch size: 63, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:54:03,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 22:54:03,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1048013.3333333334, ans=0.0 2023-10-02 22:54:04,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:04,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 22:54:07,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:54:07,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:54:07,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 22:54:09,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:09,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:54:10,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 22:54:12,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:14,216 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 22:54:18,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 22:54:18,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:54:18,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1048080.0, ans=0.125 2023-10-02 22:54:19,652 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. 
Duration: 0.992 2023-10-02 22:54:19,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 22:54:19,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1048080.0, ans=0.125 2023-10-02 22:54:22,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 22:54:22,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 22:54:22,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 22:54:22,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:22,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:54:23,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:25,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 22:54:25,399 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:54:26,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 22:54:26,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1048080.0, ans=0.0 2023-10-02 22:54:27,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:29,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:54:32,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:54:35,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 22:54:36,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:54:39,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:54:39,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:54:39,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 22:54:43,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 22:54:43,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:54:44,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-02 22:54:44,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 22:54:44,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:54:44,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:54:46,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:54:46,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:54:48,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 22:54:48,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:54:49,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:54:50,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:54:50,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:54:51,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 22:54:53,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:54:54,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 22:54:54,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:54:56,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 22:54:57,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 22:54:58,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:54:58,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:55:01,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 22:55:01,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 22:55:03,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:55:04,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:55:06,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 22:55:11,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:55:11,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:14,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 22:55:17,443 INFO [train.py:1046] (1/4) Epoch 30, batch 3200, loss[loss=0.1471, simple_loss=0.2292, pruned_loss=0.03247, over 24645.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2418, pruned_loss=0.04274, over 4718448.93 frames. ], batch size: 65, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:55:18,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:55:18,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:55:22,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:23,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1048346.6666666666, ans=0.0 2023-10-02 22:55:24,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:55:24,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 22:55:25,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:55:29,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:55:34,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:36,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1048413.3333333334, ans=0.1 2023-10-02 22:55:39,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1048413.3333333334, ans=0.125 2023-10-02 22:55:42,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:55:52,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 22:55:52,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:55:55,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 22:55:57,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:55:59,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:56:01,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:56:01,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-02 22:56:06,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 22:56:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:56:08,637 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.58 vs. limit=10.0 2023-10-02 22:56:10,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 22:56:14,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 22:56:14,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:56:19,390 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.819e+02 2.030e+02 2.429e+02 3.151e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 22:56:19,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:21,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:56:21,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:21,412 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 22:56:21,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 22:56:23,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1048613.3333333333, ans=0.0 2023-10-02 22:56:24,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:56:26,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 22:56:26,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 22:56:28,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 22:56:30,792 INFO [train.py:1046] (1/4) Epoch 30, batch 3250, loss[loss=0.1516, simple_loss=0.2341, pruned_loss=0.03458, over 24303.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2413, pruned_loss=0.04259, over 4714380.12 frames. ], batch size: 61, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:56:30,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 22:56:32,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:56:34,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:56:34,281 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 22:56:35,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 22:56:35,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:37,711 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 22:56:41,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:56:43,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:56:46,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1048746.6666666667, ans=0.0 2023-10-02 22:56:50,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:56:50,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 22:56:52,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:56:53,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:53,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:56:55,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:56:55,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 22:56:58,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:58,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:56:59,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:56:59,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:59,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:57:00,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:57:02,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:02,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:57:05,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:57:05,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:57:06,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:57:06,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 22:57:06,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:57:07,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1048813.3333333333, ans=0.125 2023-10-02 22:57:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 22:57:13,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:57:13,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:57:14,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:14,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:57:21,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:57:26,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:57:26,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:26,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 22:57:26,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 22:57:26,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:57:28,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:30,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 22:57:30,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 22:57:30,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:57:32,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:32,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1048946.6666666667, ans=0.1 2023-10-02 22:57:34,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:57:34,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:57:35,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:57:35,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1048946.6666666667, ans=0.1 2023-10-02 22:57:38,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:57:38,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:57:41,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 22:57:41,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:57:44,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:57:44,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 22:57:45,605 INFO [train.py:1046] (1/4) Epoch 30, batch 3300, loss[loss=0.1624, simple_loss=0.2367, pruned_loss=0.04403, over 23600.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2427, pruned_loss=0.04288, over 4713557.97 frames. ], batch size: 256, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:57:47,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:57:47,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 22:57:48,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. 
Duration: 0.84 2023-10-02 22:57:49,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 22:57:49,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:53,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1049013.3333333333, ans=0.125 2023-10-02 22:57:55,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:57:56,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:57:56,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:59,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:57:59,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:57:59,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1049080.0, ans=0.125 2023-10-02 22:58:01,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:02,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.96 vs. 
limit=15.0 2023-10-02 22:58:03,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:58:06,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 22:58:08,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:58:09,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:10,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:10,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 22:58:12,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:58:13,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:58:15,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:58:15,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:58:15,212 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 22:58:19,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 22:58:19,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:58:20,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:20,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 22:58:22,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 22:58:22,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:24,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:58:26,896 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 22:58:27,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 22:58:27,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:58:30,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 22:58:32,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:58:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. 
Duration: 0.62725 2023-10-02 22:58:36,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:58:38,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:58:40,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:40,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:58:40,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:58:41,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:58:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:43,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:58:45,112 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 22:58:46,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 22:58:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. 
Duration: 0.463625 2023-10-02 22:58:49,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.469e+02 1.888e+02 2.006e+02 2.224e+02 2.991e+02, threshold=4.012e+02, percent-clipped=0.0 2023-10-02 22:58:49,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:58:49,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:58:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:49,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:58:49,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1049280.0, ans=0.0 2023-10-02 22:58:50,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:58:50,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:58:50,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:58:52,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:52,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1049280.0, ans=0.0 2023-10-02 22:58:54,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-02 22:58:58,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 22:58:58,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:00,262 INFO [train.py:1046] (1/4) Epoch 30, batch 3350, loss[loss=0.1646, simple_loss=0.2567, pruned_loss=0.03623, over 24463.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2435, pruned_loss=0.04281, over 4719287.21 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:59:00,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:01,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:59:01,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:59:03,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:04,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:59:04,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:08,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:59:10,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 22:59:10,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1049346.6666666667, ans=0.125 2023-10-02 22:59:12,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:59:14,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:14,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:59:16,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:18,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:59:19,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 22:59:19,771 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 22:59:19,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1049413.3333333333, ans=0.0 2023-10-02 22:59:21,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:24,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 22:59:24,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. 
Duration: 0.73 2023-10-02 22:59:24,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:59:24,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:59:25,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:25,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 22:59:25,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:59:29,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:31,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:31,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:31,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:59:33,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1049480.0, ans=0.125 2023-10-02 22:59:34,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 22:59:36,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:36,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:40,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:59:42,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:43,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:43,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:43,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1049546.6666666667, ans=0.07 2023-10-02 22:59:44,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:46,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 22:59:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:59:48,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 22:59:48,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 22:59:49,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 22:59:50,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:51,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1049546.6666666667, ans=0.125 2023-10-02 22:59:52,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:53,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.11 vs. limit=15.0 2023-10-02 22:59:54,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1049546.6666666667, ans=0.125 2023-10-02 23:00:00,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:00:00,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 23:00:01,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:00:03,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:00:04,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 23:00:08,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1049613.3333333333, ans=0.1 2023-10-02 23:00:10,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:00:11,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.40 vs. limit=6.0 2023-10-02 23:00:11,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 23:00:12,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:00:13,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:00:13,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1049680.0, ans=0.1 2023-10-02 23:00:13,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1049680.0, ans=0.125 2023-10-02 23:00:14,783 INFO [train.py:1046] (1/4) Epoch 30, batch 3400, loss[loss=0.1581, simple_loss=0.2432, pruned_loss=0.03647, over 24480.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2433, pruned_loss=0.04272, over 4720036.00 frames. ], batch size: 66, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:00:14,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:00:14,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. 
Duration: 0.94 2023-10-02 23:00:16,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:00:16,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 23:00:17,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:00:17,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:00:18,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:00:20,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:00:20,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 23:00:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 23:00:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 23:00:26,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:00:31,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:00:31,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-02 23:00:32,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:00:33,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1049746.6666666667, ans=0.0 2023-10-02 23:00:34,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:00:38,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:00:39,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.72 vs. limit=22.5 2023-10-02 23:00:39,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 23:00:44,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:00:44,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1049813.3333333333, ans=0.0 2023-10-02 23:00:46,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:00:47,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:00:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:00:53,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. 
Duration: 0.92225 2023-10-02 23:00:54,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1049813.3333333333, ans=0.2 2023-10-02 23:00:55,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 23:01:01,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:01:01,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:01:03,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 23:01:03,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:01:04,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:04,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:01:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:01:07,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:01:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:01:11,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 23:01:17,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:01:19,053 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 1.969e+02 2.278e+02 2.618e+02 4.115e+02, threshold=4.556e+02, percent-clipped=1.0 2023-10-02 23:01:19,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 23:01:25,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:01:28,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1050013.3333333333, ans=0.125 2023-10-02 23:01:29,863 INFO [train.py:1046] (1/4) Epoch 30, batch 3450, loss[loss=0.1644, simple_loss=0.2574, pruned_loss=0.03567, over 24550.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2435, pruned_loss=0.0432, over 4706527.80 frames. ], batch size: 71, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:01:29,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 23:01:30,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1050013.3333333333, ans=0.0 2023-10-02 23:01:33,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 23:01:33,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:01:34,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 23:01:34,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 23:01:35,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:01:38,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:01:44,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:01:46,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:01:47,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:01:47,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:47,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1050080.0, ans=0.025 2023-10-02 23:01:49,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:01:49,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1050080.0, ans=0.1 2023-10-02 23:01:51,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1050080.0, ans=0.1 2023-10-02 23:01:53,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 23:01:57,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1050080.0, ans=0.1 2023-10-02 23:01:59,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 23:01:59,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:01:59,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:02:01,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.04 vs. limit=15.0 2023-10-02 23:02:02,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:07,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 23:02:07,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-02 23:02:11,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:02:11,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:02:12,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:02:15,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:02:17,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 23:02:17,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:02:19,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:02:21,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:02:24,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 23:02:27,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:02:32,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:02:32,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:02:36,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:41,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:41,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:02:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:02:43,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:02:44,603 INFO [train.py:1046] (1/4) Epoch 30, batch 3500, loss[loss=0.1422, simple_loss=0.2011, pruned_loss=0.04164, over 22723.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2422, pruned_loss=0.0427, over 4717783.69 frames. ], batch size: 322, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:02:45,588 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.29 vs. limit=15.0 2023-10-02 23:02:47,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:47,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1050346.6666666667, ans=0.1 2023-10-02 23:02:48,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:02:50,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. 
Duration: 0.68 2023-10-02 23:02:50,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1050346.6666666667, ans=0.2 2023-10-02 23:02:52,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1050346.6666666667, ans=0.2 2023-10-02 23:02:53,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:02:53,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1050346.6666666667, ans=0.125 2023-10-02 23:02:55,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1050346.6666666667, ans=0.125 2023-10-02 23:02:57,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:02:57,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1050413.3333333333, ans=0.0 2023-10-02 23:02:59,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:59,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 23:03:02,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:03:02,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 23:03:02,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1050413.3333333333, ans=0.2 2023-10-02 23:03:03,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:03:03,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:03,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:03:05,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:05,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:03:05,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 23:03:08,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:10,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:03:10,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:03:10,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1050413.3333333333, ans=0.0 2023-10-02 23:03:13,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:03:14,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 23:03:14,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:03:18,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:03:19,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:03:21,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:22,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:03:24,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:03:26,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 23:03:27,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 23:03:27,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 23:03:27,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 23:03:29,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1050546.6666666667, ans=0.0 2023-10-02 23:03:30,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:30,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:30,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:03:33,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:03:33,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:03:37,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:03:37,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 23:03:37,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 23:03:37,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:03:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:03:42,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-10-02 23:03:43,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:44,624 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.57 vs. limit=6.0 2023-10-02 23:03:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 23:03:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:03:47,844 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.822e+02 2.025e+02 2.259e+02 3.457e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 23:03:48,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:48,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 23:03:51,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 23:03:54,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:56,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:03:56,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:03:57,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:03:58,752 INFO [train.py:1046] (1/4) Epoch 30, batch 3550, loss[loss=0.1607, simple_loss=0.2449, pruned_loss=0.03825, over 24453.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.241, pruned_loss=0.04214, over 4727117.15 frames. ], batch size: 63, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:04:00,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:04:00,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1050680.0, ans=0.0 2023-10-02 23:04:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:10,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 23:04:11,325 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-02 23:04:13,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:04:13,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:04:15,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:04:15,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1050746.6666666667, ans=0.025 2023-10-02 23:04:16,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:04:16,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:04:18,513 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.46 vs. limit=15.0 2023-10-02 23:04:19,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:04:19,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:04:20,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:20,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:04:21,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:04:26,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:04:27,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 23:04:29,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:04:29,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:29,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:04:29,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 23:04:29,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:31,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:32,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 23:04:34,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.62 vs. limit=12.0 2023-10-02 23:04:38,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:04:38,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:04:40,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:04:42,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. 
Duration: 0.94 2023-10-02 23:04:42,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:04:43,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 23:04:44,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:04:47,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:04:47,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:04:51,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 23:04:51,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:04:56,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:04:58,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 23:04:58,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:05:01,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1050946.6666666667, ans=0.0 2023-10-02 23:05:02,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 23:05:07,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.20 vs. limit=15.0 2023-10-02 23:05:08,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 23:05:08,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:05:08,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1050946.6666666667, ans=0.0 2023-10-02 23:05:09,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:05:09,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1050946.6666666667, ans=0.125 2023-10-02 23:05:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:12,512 INFO [train.py:1046] (1/4) Epoch 30, batch 3600, loss[loss=0.1642, simple_loss=0.2334, pruned_loss=0.04753, over 23749.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2414, pruned_loss=0.04238, over 4713812.55 frames. ], batch size: 179, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 23:05:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 23:05:12,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:05:17,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:05:17,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:19,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:05:19,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:05:20,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:20,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 23:05:22,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:05:23,041 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.59 vs. limit=22.5 2023-10-02 23:05:24,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:26,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-02 23:05:29,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:05:30,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:05:30,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:05:30,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 23:05:32,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:05:33,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:35,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:05:36,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:05:38,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1051080.0, ans=0.0 2023-10-02 23:05:39,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:05:41,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:05:42,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-10-02 23:05:45,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten.whitening_limit, batch_count=1051146.6666666667, ans=15.0 2023-10-02 23:05:49,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:05:50,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1051146.6666666667, ans=0.0 2023-10-02 23:05:51,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:05:53,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 23:05:56,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:05:56,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1051213.3333333333, ans=0.125 2023-10-02 23:05:59,803 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-02 23:06:00,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:03,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:08,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-10-02 23:06:08,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:06:08,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 23:06:10,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 23:06:12,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 23:06:13,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:06:15,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:06:16,666 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.870e+02 2.077e+02 2.507e+02 3.555e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 23:06:16,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 23:06:16,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:06:18,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:06:18,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:06:20,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. 
Duration: 0.87 2023-10-02 23:06:21,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 23:06:24,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1051280.0, ans=0.0 2023-10-02 23:06:25,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:25,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 23:06:27,011 INFO [train.py:1046] (1/4) Epoch 30, batch 3650, loss[loss=0.1604, simple_loss=0.2466, pruned_loss=0.03705, over 24035.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2426, pruned_loss=0.04302, over 4705101.13 frames. ], batch size: 80, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 23:06:30,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 23:06:32,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:06:34,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 23:06:36,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 23:06:39,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:06:39,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 23:06:39,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:06:42,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 23:06:42,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:06:43,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 23:06:43,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:06:43,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1051413.3333333333, ans=0.0 2023-10-02 23:06:45,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:06:45,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 23:06:45,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1051413.3333333333, ans=0.0 2023-10-02 23:06:47,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:06:47,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:06:47,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:06:50,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:06:51,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 23:06:52,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1051413.3333333333, ans=0.125 2023-10-02 23:06:53,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 23:06:54,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:06:55,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 23:06:59,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:06:59,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:07:00,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1051480.0, ans=0.125 2023-10-02 23:07:00,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1051480.0, ans=0.125 2023-10-02 23:07:04,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:07:07,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:07:07,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:07:08,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:07:10,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:07:11,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:07:14,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:07:14,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1051546.6666666667, ans=0.0 2023-10-02 23:07:15,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:15,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:07:17,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:07:19,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:07:21,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:07:22,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.67 vs. limit=10.0 2023-10-02 23:07:25,501 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 23:07:28,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:07:28,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:07:29,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:07:31,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:32,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:07:34,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:34,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1051613.3333333333, ans=0.125 2023-10-02 23:07:34,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1051613.3333333333, ans=0.0 2023-10-02 23:07:35,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 23:07:35,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:07:37,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1051613.3333333333, ans=0.0 2023-10-02 23:07:38,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:07:39,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:07:41,130 INFO [train.py:1046] (1/4) Epoch 30, batch 3700, loss[loss=0.1727, simple_loss=0.2532, pruned_loss=0.04612, over 24053.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2428, pruned_loss=0.04333, over 4706309.57 frames. ], batch size: 86, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:07:41,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:07:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 23:07:42,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:42,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1051680.0, ans=0.1 2023-10-02 23:07:43,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:07:43,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-02 23:07:47,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:07:51,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:07:51,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:07:51,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:07:53,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:53,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:07:55,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:07:55,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1051746.6666666667, ans=0.125 2023-10-02 23:07:58,090 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 23:08:04,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:08:04,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:08:05,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-02 23:08:06,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 23:08:06,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:08:09,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:10,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 23:08:12,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:13,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:08:15,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1051813.3333333333, ans=0.2 2023-10-02 23:08:16,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:16,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:08:18,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:08:18,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1051813.3333333333, ans=0.0 2023-10-02 23:08:23,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 23:08:23,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 23:08:23,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:08:25,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 23:08:28,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.23 vs. limit=12.0 2023-10-02 23:08:29,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:08:30,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:08:32,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:08:33,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 23:08:34,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:08:35,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:08:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:08:35,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 23:08:38,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:08:39,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 23:08:39,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 23:08:39,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1051946.6666666667, ans=0.125 2023-10-02 23:08:40,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:08:40,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:08:42,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:08:42,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:08:44,988 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.850e+02 2.059e+02 2.330e+02 3.629e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 23:08:45,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:08:45,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1051946.6666666667, ans=0.0 2023-10-02 23:08:46,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:08:47,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:08:50,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 23:08:50,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 23:08:52,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1051946.6666666667, ans=0.125 2023-10-02 23:08:53,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:08:53,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 23:08:55,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:08:56,512 INFO [train.py:1046] (1/4) Epoch 30, batch 3750, loss[loss=0.1656, simple_loss=0.2383, pruned_loss=0.04648, over 23517.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2437, pruned_loss=0.04364, over 4711156.48 frames. ], batch size: 256, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:08:56,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:08:59,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:09:00,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:09:03,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:09:06,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.49 vs. limit=15.0 2023-10-02 23:09:08,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:09:09,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:09:10,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:09:15,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:09:16,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 23:09:17,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:09:19,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 23:09:19,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:09:21,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 23:09:21,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1052080.0, ans=0.125 2023-10-02 23:09:25,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 23:09:28,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:09:29,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:09:31,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:09:37,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:09:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 23:09:40,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 23:09:44,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:09:46,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 23:09:47,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:09:49,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:09:55,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:09:57,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:09:58,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:09:59,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:10:01,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:10:09,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:10:10,386 INFO [train.py:1046] (1/4) Epoch 30, batch 3800, loss[loss=0.1552, simple_loss=0.2107, pruned_loss=0.04988, over 19487.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04332, over 4719486.24 frames. ], batch size: 388, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:10:13,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:13,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-02 23:10:14,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 23:10:14,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:10:17,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:10:20,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:10:22,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 23:10:22,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:22,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:10:25,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:10:25,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:10:26,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:26,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1052413.3333333333, ans=0.0 2023-10-02 23:10:28,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-10-02 23:10:29,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 23:10:31,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:10:34,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:10:35,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:10:37,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:10:37,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1052413.3333333333, ans=0.0 2023-10-02 23:10:38,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:10:38,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:40,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:41,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:44,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1052480.0, ans=0.0 2023-10-02 23:10:47,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-02 23:10:47,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 23:10:50,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:10:56,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:11:03,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:11:04,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 23:11:08,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 23:11:08,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:09,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:11:09,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:13,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-10-02 23:11:14,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1052613.3333333333, ans=0.125 2023-10-02 23:11:15,489 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.837e+02 2.037e+02 2.244e+02 3.093e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 23:11:16,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 23:11:16,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 23:11:16,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:17,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:11:21,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:11:23,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:11:23,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1052680.0, ans=0.1 2023-10-02 23:11:24,327 INFO [train.py:1046] (1/4) Epoch 30, batch 3850, loss[loss=0.1569, simple_loss=0.2221, pruned_loss=0.04581, over 23731.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2417, pruned_loss=0.04277, over 4716253.22 frames. ], batch size: 232, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:11:30,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-02 23:11:30,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 23:11:31,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:11:32,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:34,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:11:35,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1052680.0, ans=0.125 2023-10-02 23:11:36,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:38,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:11:40,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 23:11:47,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:11:48,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:51,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:11:51,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 23:11:55,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:11:55,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:11:57,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:57,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:11:58,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:00,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:01,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:01,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:12:01,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 23:12:01,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 23:12:03,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:12:03,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:12:05,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:06,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:07,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 23:12:09,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 23:12:12,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:14,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 23:12:16,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:12:19,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:21,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:25,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:25,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 23:12:28,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-10-02 23:12:29,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:30,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:33,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:12:33,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:12:34,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:35,071 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:12:36,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:36,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:12:36,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 23:12:37,009 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.82 vs. limit=22.5 2023-10-02 23:12:37,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:12:39,443 INFO [train.py:1046] (1/4) Epoch 30, batch 3900, loss[loss=0.1652, simple_loss=0.2325, pruned_loss=0.04889, over 23812.00 frames. 
], tot_loss[loss=0.1626, simple_loss=0.2404, pruned_loss=0.04245, over 4705270.08 frames. ], batch size: 212, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:12:39,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 23:12:39,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:39,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:40,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:12:40,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:42,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1053013.3333333333, ans=0.2 2023-10-02 23:12:44,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:12:45,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:45,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:46,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:12:46,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-10-02 23:12:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:51,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:12:51,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:12:51,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:12:52,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:12:55,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:12:55,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:56,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:12:58,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 23:12:58,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:12:59,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 23:13:01,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:13:03,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 23:13:04,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 23:13:08,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:13:08,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:13:10,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:13:10,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:10,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1053146.6666666667, ans=0.0 2023-10-02 23:13:13,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:13:15,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1053146.6666666667, ans=0.1 2023-10-02 23:13:16,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:13:18,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:13:18,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:13:18,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:13:23,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:13:24,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:13:31,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:13:34,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:13:44,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:13:45,997 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.921e+02 2.166e+02 2.503e+02 3.662e+02, threshold=4.332e+02, percent-clipped=0.0 2023-10-02 23:13:47,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:47,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 23:13:47,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 23:13:47,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 23:13:47,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1053280.0, ans=0.125 2023-10-02 23:13:50,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 23:13:52,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:13:53,562 INFO [train.py:1046] (1/4) Epoch 30, batch 3950, loss[loss=0.167, simple_loss=0.2443, pruned_loss=0.04479, over 24497.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2392, pruned_loss=0.04211, over 4707178.25 frames. ], batch size: 66, lr: 3.39e-03, grad_scale: 4.0 2023-10-02 23:13:53,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 23:13:58,530 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-10-02 23:13:59,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:13:59,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1053346.6666666667, ans=0.0 2023-10-02 23:14:00,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 23:14:01,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:14:03,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 23:14:05,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:14:10,943 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 23:14:11,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:14:11,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 23:14:12,332 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 23:14:12,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:14:12,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1053413.3333333333, ans=0.1 2023-10-02 23:14:13,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:14:13,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:14:13,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:14:15,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1053413.3333333333, ans=0.07 2023-10-02 23:14:18,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. 
Duration: 0.95 2023-10-02 23:14:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:14:19,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:14:19,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:14:21,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:14:22,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:14:32,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:14:32,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:14:37,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 23:14:43,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 23:14:43,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 23:14:44,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:14:44,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. 
Duration: 0.9 2023-10-02 23:14:51,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:14:51,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:14:53,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:14:53,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:14:53,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 23:14:58,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:14:58,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:15:03,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 23:15:06,721 INFO [train.py:1046] (1/4) Epoch 30, batch 4000, loss[loss=0.1695, simple_loss=0.2392, pruned_loss=0.04997, over 23810.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2397, pruned_loss=0.04186, over 4717907.50 frames. ], batch size: 212, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:15:12,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:18,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:15:24,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:15:25,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:15:25,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:25,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 23:15:26,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:15:26,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 23:15:27,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:15:28,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 23:15:29,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:15:32,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:15:32,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:15:32,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 23:15:32,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:15:32,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:15:32,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1053746.6666666667, ans=0.1 2023-10-02 23:15:33,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:15:35,251 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 23:15:36,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:15:38,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:15:39,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1053813.3333333333, ans=0.0 2023-10-02 23:15:41,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 23:15:41,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:15:41,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:15:47,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-10-02 23:15:47,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1053813.3333333333, ans=0.125 2023-10-02 23:15:48,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.37 vs. limit=15.0 2023-10-02 23:15:49,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:15:49,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1053880.0, ans=0.0 2023-10-02 23:15:51,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:15:52,012 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 23:15:54,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:15:54,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 23:15:54,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:15:57,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:15:58,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:16:00,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. 
Duration: 0.663625 2023-10-02 23:16:00,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:16:00,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:16:01,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 23:16:01,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:16:03,337 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 23:16:07,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:16:12,363 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.926e+02 2.122e+02 2.388e+02 3.312e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-02 23:16:12,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 23:16:15,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:16:15,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:16:16,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:16:18,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:16:20,286 INFO [train.py:1046] (1/4) Epoch 30, batch 4050, loss[loss=0.1658, simple_loss=0.2379, pruned_loss=0.0469, over 23862.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2402, pruned_loss=0.04171, over 4722685.46 frames. ], batch size: 180, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:16:20,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-02 23:16:23,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:16:25,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:16:25,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 23:16:28,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:16:28,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:16:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:16:31,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:16:31,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:16:34,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 23:16:37,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:16:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 23:16:37,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1054080.0, ans=0.2 2023-10-02 23:16:39,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:16:39,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:16:41,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:16:44,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:16:46,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 23:16:49,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 23:16:49,360 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 23:16:51,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:16:58,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-02 23:16:59,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:17:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:17:03,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1054213.3333333333, ans=0.125 2023-10-02 23:17:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:17:05,731 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.04 vs. limit=22.5 2023-10-02 23:17:06,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:17:06,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:17:09,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:17:13,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 23:17:13,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:17:15,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:17:17,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 23:17:20,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:17:25,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1054280.0, ans=0.2 2023-10-02 23:17:28,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 23:17:28,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:17:28,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:17:29,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 23:17:30,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 23:17:30,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:33,675 INFO [train.py:1046] (1/4) Epoch 30, batch 4100, loss[loss=0.1901, simple_loss=0.2605, pruned_loss=0.05982, over 23877.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2419, pruned_loss=0.04239, over 4708241.78 frames. ], batch size: 195, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:17:33,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. 
Duration: 0.763625 2023-10-02 23:17:35,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:35,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:17:41,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 23:17:42,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 23:17:46,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 23:17:47,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 23:17:47,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:47,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:48,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:48,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:17:48,868 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. 
Duration: 0.989 2023-10-02 23:17:51,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1054413.3333333333, ans=0.125 2023-10-02 23:17:52,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:17:53,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:17:53,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:54,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:17:58,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:17:59,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:17:59,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:17:59,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 23:17:59,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:59,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. 
Duration: 0.836375 2023-10-02 23:18:00,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:18:00,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:18:02,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 23:18:05,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:06,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 23:18:08,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:18:08,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1054480.0, ans=0.125 2023-10-02 23:18:11,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:18:11,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 23:18:11,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:18:11,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:18:12,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-10-02 23:18:14,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 23:18:14,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:18:16,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:18:18,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 23:18:20,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:18:20,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:18:23,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:28,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:18:32,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:18:33,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:18:35,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1054613.3333333333, ans=0.2 2023-10-02 23:18:39,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:18:39,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:41,142 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.912e+02 2.226e+02 2.556e+02 3.636e+02, threshold=4.451e+02, percent-clipped=0.0 2023-10-02 23:18:41,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1054613.3333333333, ans=0.125 2023-10-02 23:18:43,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:18:46,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:18:48,265 INFO [train.py:1046] (1/4) Epoch 30, batch 4150, loss[loss=0.1571, simple_loss=0.246, pruned_loss=0.03413, over 24285.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.243, pruned_loss=0.04309, over 4711191.40 frames. ], batch size: 74, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:18:51,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:18:51,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:18:53,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:18:53,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:18:54,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. 
Duration: 0.86 2023-10-02 23:18:54,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:18:55,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 23:18:57,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 23:18:57,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 23:18:57,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1054680.0, ans=0.0 2023-10-02 23:18:57,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1054680.0, ans=0.1 2023-10-02 23:18:58,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:19:03,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:19:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:06,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:06,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:19:08,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. 
Duration: 0.62225 2023-10-02 23:19:10,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:19:10,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:19:11,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:19:17,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:19,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:19:21,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 23:19:24,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 23:19:24,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:19:25,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 23:19:25,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:19:25,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:19:28,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 23:19:29,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:32,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1054880.0, ans=0.125 2023-10-02 23:19:34,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 23:19:37,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:19:37,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1054880.0, ans=0.1 2023-10-02 23:19:39,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:19:39,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 23:19:39,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:19:42,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 23:19:43,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:19:43,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:19:45,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. 
Duration: 0.92725 2023-10-02 23:19:46,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.15 vs. limit=15.0 2023-10-02 23:19:46,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 23:19:46,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:19:46,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:19:49,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:19:51,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 23:19:51,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:51,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:19:51,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:19:52,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 23:19:53,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:54,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-02 23:19:54,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:57,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:57,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 23:19:57,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:20:02,503 INFO [train.py:1046] (1/4) Epoch 30, batch 4200, loss[loss=0.1585, simple_loss=0.2369, pruned_loss=0.03999, over 24610.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2422, pruned_loss=0.04278, over 4721358.55 frames. ], batch size: 60, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:20:02,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:20:04,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 23:20:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:20:07,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:20:09,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1055013.3333333333, ans=0.0 2023-10-02 23:20:10,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 23:20:10,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:20:10,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:20:12,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 23:20:16,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 23:20:17,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:18,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:20:20,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1055080.0, ans=0.0 2023-10-02 23:20:21,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:20:24,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:20:26,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:20:26,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:27,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. 
Duration: 0.81 2023-10-02 23:20:27,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:20:29,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:29,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:20:29,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:20:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:20:30,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1055146.6666666667, ans=0.125 2023-10-02 23:20:33,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 23:20:33,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:35,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1055146.6666666667, ans=0.125 2023-10-02 23:20:38,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:20:39,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 23:20:42,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:20:42,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:20:44,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:20:44,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 23:20:45,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:20:45,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:20:51,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:20:52,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:20:58,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:20:59,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1055213.3333333333, ans=0.09899494936611666 2023-10-02 23:21:01,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 23:21:03,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:21:07,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:21:09,203 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.983e+02 2.287e+02 2.650e+02 3.812e+02, threshold=4.574e+02, percent-clipped=0.0 2023-10-02 23:21:09,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:09,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1055280.0, ans=0.125 2023-10-02 23:21:11,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 23:21:13,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1055280.0, ans=0.2 2023-10-02 23:21:16,273 INFO [train.py:1046] (1/4) Epoch 30, batch 4250, loss[loss=0.1493, simple_loss=0.2288, pruned_loss=0.03493, over 23681.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2407, pruned_loss=0.04242, over 4701115.19 frames. ], batch size: 149, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:21:16,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:21:19,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:21:19,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. 
Duration: 0.888875 2023-10-02 23:21:21,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1055346.6666666667, ans=0.125 2023-10-02 23:21:22,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:26,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:21:28,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 23:21:28,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:21:29,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:30,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1055413.3333333333, ans=0.1 2023-10-02 23:21:31,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:21:37,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:37,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:39,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-02 23:21:39,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:21:41,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:42,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:42,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:45,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:21:47,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:21:48,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 23:21:52,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 23:21:52,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:52,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:21:52,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:55,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 23:21:55,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:55,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:57,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:21:57,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:22:03,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:22:04,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:05,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 23:22:05,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:22:07,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 23:22:09,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:22:11,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:22:12,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:22:12,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:22:15,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 23:22:16,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:22:18,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:22:22,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:22:25,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:26,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:22:27,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:22:29,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:22:30,519 INFO [train.py:1046] (1/4) Epoch 30, batch 4300, loss[loss=0.1679, simple_loss=0.2395, pruned_loss=0.04812, over 23964.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2406, pruned_loss=0.04239, over 4703273.12 frames. ], batch size: 196, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:22:30,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:22:31,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:22:31,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 23:22:33,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:22:38,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:22:38,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:22:42,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:22:45,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1055746.6666666667, ans=0.125 2023-10-02 23:22:50,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:50,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 23:22:51,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:22:54,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:22:54,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-02 23:22:54,636 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 23:22:58,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:23:00,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:23:02,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 23:23:02,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:23:02,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 23:23:05,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:23:06,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:23:09,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:23:09,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:23:09,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 23:23:09,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1055813.3333333333, ans=0.125 2023-10-02 23:23:12,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:23:12,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:23:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 23:23:14,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 23:23:16,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:23:20,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:20,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:23:20,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:22,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:23:22,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 23:23:22,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. 
Duration: 0.95 2023-10-02 23:23:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 23:23:23,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:23:23,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 23:23:24,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 23:23:27,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:23:27,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 23:23:29,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:23:31,953 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.57 vs. limit=15.0 2023-10-02 23:23:32,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:32,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:23:35,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-10-02 23:23:36,593 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.805e+02 2.042e+02 2.369e+02 3.407e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-02 23:23:36,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:23:36,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:36,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:23:38,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:23:38,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:23:39,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:23:42,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:42,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:42,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:23:44,285 INFO [train.py:1046] (1/4) Epoch 30, batch 4350, loss[loss=0.1795, simple_loss=0.2634, pruned_loss=0.04785, over 24173.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2421, pruned_loss=0.04266, over 4702659.71 frames. 
], batch size: 86, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:23:50,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 23:23:50,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:23:56,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:23:57,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:23:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:24:05,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:24:08,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:24:08,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.78 vs. limit=15.0 2023-10-02 23:24:09,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:24:09,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:24:12,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:24:13,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:24:14,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1056146.6666666667, ans=0.0 2023-10-02 23:24:15,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:24:20,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 23:24:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:24:23,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:26,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:28,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 23:24:33,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:24:34,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:24:38,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. 
Duration: 0.768 2023-10-02 23:24:40,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:24:40,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1056213.3333333333, ans=0.0 2023-10-02 23:24:41,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:24:41,877 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 23:24:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 23:24:41,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:24:43,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:24:43,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:24:44,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:24:46,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:24:46,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:24:50,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. 
Duration: 0.94 2023-10-02 23:24:51,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:51,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:24:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:51,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 23:24:53,116 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 23:24:53,129 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 23:24:53,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 23:24:55,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:24:55,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:24:55,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:24:57,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:24:57,715 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.01 vs. 
limit=15.0 2023-10-02 23:24:58,424 INFO [train.py:1046] (1/4) Epoch 30, batch 4400, loss[loss=0.1694, simple_loss=0.2417, pruned_loss=0.04852, over 22738.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.243, pruned_loss=0.04289, over 4708750.81 frames. ], batch size: 322, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:24:59,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 23:25:00,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1056346.6666666667, ans=0.1 2023-10-02 23:25:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 23:25:01,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:01,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1056346.6666666667, ans=0.0 2023-10-02 23:25:04,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:25:04,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:06,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:25:06,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1056346.6666666667, ans=0.07 2023-10-02 23:25:07,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-10-02 23:25:07,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 23:25:07,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 23:25:07,568 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 23:25:08,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:25:08,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:25:11,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 23:25:11,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1056413.3333333333, ans=0.125 2023-10-02 23:25:12,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:14,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:14,766 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 23:25:19,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:19,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. 
Duration: 0.71 2023-10-02 23:25:19,731 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 23:25:22,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 23:25:24,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 23:25:24,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 23:25:24,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:25,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:25:26,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:25:27,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:25:28,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 23:25:28,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 23:25:28,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1056480.0, ans=0.125 2023-10-02 23:25:29,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 23:25:30,149 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:25:32,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:25:32,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:34,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:34,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:34,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 23:25:35,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 23:25:40,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:45,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:25:45,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1056546.6666666667, ans=0.125 2023-10-02 23:25:48,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 23:25:53,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-02 23:25:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:25:58,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:25:58,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 23:25:58,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:25:58,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:25:58,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:25:59,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:26:02,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 23:26:05,411 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.891e+02 2.096e+02 2.482e+02 3.996e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 23:26:05,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 23:26:05,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1056613.3333333333, ans=10.0 2023-10-02 23:26:06,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. 
Duration: 0.88 2023-10-02 23:26:06,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:06,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 23:26:08,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:26:08,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1056613.3333333333, ans=0.0 2023-10-02 23:26:08,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1056613.3333333333, ans=0.07 2023-10-02 23:26:10,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:26:12,193 INFO [train.py:1046] (1/4) Epoch 30, batch 4450, loss[loss=0.1461, simple_loss=0.2294, pruned_loss=0.03143, over 24314.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2436, pruned_loss=0.04323, over 4712927.59 frames. ], batch size: 61, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:26:13,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 23:26:16,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:26:18,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:18,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-02 23:26:21,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1056680.0, ans=0.5 2023-10-02 23:26:27,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:26:27,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:26:31,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:33,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:26:36,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:26:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:37,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 23:26:37,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:26:39,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:39,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:26:39,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 23:26:40,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:26:47,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:26:47,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:26:47,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:26:49,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:50,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:26:54,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 23:26:55,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 23:26:55,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 23:26:55,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:26:57,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:26:59,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. 
Duration: 0.66 2023-10-02 23:27:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:27:05,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:27:05,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 23:27:06,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:06,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:27:06,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:27:06,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:27:09,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:27:10,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:27:10,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 23:27:12,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:27:14,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:27:16,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:27:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:17,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:27:18,441 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.57 vs. limit=15.0 2023-10-02 23:27:19,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:27:22,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 23:27:25,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:27:27,512 INFO [train.py:1046] (1/4) Epoch 30, batch 4500, loss[loss=0.1665, simple_loss=0.2346, pruned_loss=0.04921, over 23753.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2443, pruned_loss=0.0437, over 4701415.16 frames. ], batch size: 164, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:27:31,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:27:32,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 23:27:32,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-10-02 23:27:34,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:27:40,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:40,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:27:41,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:27:41,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:27:42,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:27:42,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:27:43,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1057080.0, ans=0.2 2023-10-02 23:27:45,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=12.0 2023-10-02 23:27:53,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:27:53,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:27:56,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:27:57,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:27:58,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:28:04,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:28:07,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:28:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:28:12,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.08 vs. limit=10.0 2023-10-02 23:28:13,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:28:13,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1057213.3333333333, ans=0.1 2023-10-02 23:28:14,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 23:28:14,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:16,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:28:17,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:17,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:28:20,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:28:21,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 23:28:21,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:28:21,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:26,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:28:26,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:28:28,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:31,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:28:31,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:28:32,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-10-02 23:28:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 23:28:35,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 23:28:37,063 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.840e+02 1.982e+02 2.300e+02 3.400e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 23:28:40,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 23:28:41,542 INFO [train.py:1046] (1/4) Epoch 30, batch 4550, loss[loss=0.162, simple_loss=0.2516, pruned_loss=0.03621, over 24356.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04332, over 4704638.54 frames. ], batch size: 74, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:28:43,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 23:28:44,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:28:47,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:28:48,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:28:51,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:28:53,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:28:57,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:58,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:28:58,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:28:58,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:01,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:01,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:29:04,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:29:06,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 23:29:07,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 23:29:07,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:29:08,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 23:29:12,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. 
Duration: 0.49 2023-10-02 23:29:14,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:29:16,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 23:29:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:29:21,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:22,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:22,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:29:24,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1057546.6666666667, ans=0.0 2023-10-02 23:29:26,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 23:29:27,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:29:30,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:30,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:29:30,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-02 23:29:32,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 23:29:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 23:29:33,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:29:34,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 23:29:36,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 23:29:36,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:29:39,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:39,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:29:39,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:40,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:29:42,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:29:42,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-10-02 23:29:45,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:29:45,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 23:29:46,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 23:29:46,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:29:46,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 23:29:49,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:29:49,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:29:50,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1057613.3333333333, ans=0.0 2023-10-02 23:29:52,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:29:52,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:52,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:29:53,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:29:54,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:29:55,747 INFO [train.py:1046] (1/4) Epoch 30, batch 4600, loss[loss=0.1521, simple_loss=0.2306, pruned_loss=0.03678, over 24331.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2412, pruned_loss=0.0427, over 4708811.33 frames. ], batch size: 56, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:29:59,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:59,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:30:01,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:30:01,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:30:01,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:03,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 23:30:03,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1057680.0, ans=0.125 2023-10-02 23:30:06,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:30:09,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-10-02 23:30:09,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:11,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 23:30:19,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:23,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:28,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:30:28,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:33,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 23:30:33,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:30:33,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:30:38,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:38,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. 
Duration: 0.988875 2023-10-02 23:30:42,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:30:43,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 23:30:45,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:30:49,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:50,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:30:53,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:53,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 23:30:53,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:54,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 23:30:55,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:55,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:30:55,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1057946.6666666667, ans=0.05 2023-10-02 23:30:57,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:58,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:58,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:30:58,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 23:30:59,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1057946.6666666667, ans=0.0 2023-10-02 23:31:00,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 23:31:00,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 23:31:00,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:00,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1057946.6666666667, ans=0.05 2023-10-02 23:31:01,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:31:01,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:03,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:31:05,783 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.399e+02 1.837e+02 2.032e+02 2.322e+02 3.938e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 23:31:10,364 INFO [train.py:1046] (1/4) Epoch 30, batch 4650, loss[loss=0.1685, simple_loss=0.252, pruned_loss=0.04253, over 23488.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2404, pruned_loss=0.04238, over 4716790.15 frames. ], batch size: 134, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:31:13,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:31:14,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:31:16,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:31:16,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:31:16,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:16,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:17,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:31:17,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1058013.3333333333, ans=0.125 2023-10-02 23:31:20,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 23:31:22,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:31:26,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 23:31:26,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:31:28,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 23:31:28,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:31:28,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 23:31:28,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 23:31:29,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:29,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:31:32,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-02 23:31:33,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:33,731 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 23:31:36,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:37,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 23:31:41,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:41,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:31:41,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 23:31:44,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:31:46,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:31:49,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:51,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1058146.6666666667, ans=0.0 2023-10-02 23:31:56,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:31:59,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:59,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:32:00,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:32:00,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1058213.3333333333, ans=0.1 2023-10-02 23:32:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 23:32:01,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 23:32:03,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 23:32:03,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 23:32:03,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:07,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1058280.0, ans=0.1 2023-10-02 23:32:10,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:32:10,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:32:10,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 23:32:11,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:32:12,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:32:15,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:32:16,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:32:16,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:32:16,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1058280.0, ans=0.125 2023-10-02 23:32:17,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:32:21,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:23,212 INFO [train.py:1046] (1/4) Epoch 30, batch 4700, loss[loss=0.1781, simple_loss=0.2496, pruned_loss=0.05335, over 22853.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2415, pruned_loss=0.04318, over 4711482.66 frames. 
], batch size: 322, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:32:23,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:32:23,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:32:23,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 23:32:24,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:32:26,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 23:32:34,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:35,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:35,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:32:36,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:32:38,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:32:42,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. 
Duration: 0.56 2023-10-02 23:32:42,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1058413.3333333333, ans=0.125 2023-10-02 23:32:43,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 23:32:45,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:45,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:32:45,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:32:47,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1058413.3333333333, ans=0.125 2023-10-02 23:32:48,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_na.min_abs, batch_count=1058413.3333333333, ans=0.02 2023-10-02 23:32:49,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:56,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:32:57,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:32:59,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:33:05,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 23:33:06,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:33:08,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:11,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 23:33:11,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1058546.6666666667, ans=0.0 2023-10-02 23:33:12,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:33:16,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:33:18,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 23:33:20,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:20,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:22,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:33:24,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-02 23:33:24,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 23:33:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 23:33:26,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.13 vs. limit=15.0 2023-10-02 23:33:27,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:28,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:28,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:28,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 23:33:28,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:32,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 23:33:34,176 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.856e+02 2.084e+02 2.252e+02 3.247e+02, threshold=4.168e+02, percent-clipped=0.0 2023-10-02 23:33:35,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:33:37,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:33:38,559 INFO [train.py:1046] (1/4) Epoch 30, batch 4750, loss[loss=0.1608, simple_loss=0.2387, pruned_loss=0.04141, over 24333.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2429, pruned_loss=0.04347, over 4717140.60 frames. ], batch size: 61, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:33:40,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:33:41,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:33:42,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 23:33:42,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:33:44,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1058680.0, ans=0.125 2023-10-02 23:33:46,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 23:33:48,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:33:48,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:50,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:33:54,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-10-02 23:33:54,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1058746.6666666667, ans=0.125 2023-10-02 23:33:58,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:34:02,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 23:34:02,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:34:05,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:34:05,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:34:06,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:34:06,513 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 23:34:06,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 23:34:11,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 23:34:13,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:34:16,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:34:19,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:34:19,446 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 23:34:19,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:34:23,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:34:24,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.53 vs. limit=22.5 2023-10-02 23:34:26,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:34:28,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 23:34:28,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 23:34:28,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:34:28,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1058880.0, ans=0.125 2023-10-02 23:34:29,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:34:29,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:34:31,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:34:31,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 23:34:35,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 23:34:36,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:34:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:34:39,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 23:34:39,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:34:40,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:34:42,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:34:42,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:34:43,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:34:45,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:34:46,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 23:34:46,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 23:34:47,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 23:34:50,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:34:50,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:34:52,582 INFO [train.py:1046] (1/4) Epoch 30, batch 4800, loss[loss=0.1857, simple_loss=0.2558, pruned_loss=0.05784, over 23642.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2437, pruned_loss=0.04358, over 4725547.56 frames. ], batch size: 232, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:34:52,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 23:35:00,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:00,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:05,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:35:06,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:35:08,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:08,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 23:35:09,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:35:09,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:35:10,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:35:12,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1059080.0, ans=0.2 2023-10-02 23:35:15,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:15,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1059080.0, ans=0.125 2023-10-02 23:35:15,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=1059080.0, ans=0.1 2023-10-02 23:35:16,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:16,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:35:19,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:35:19,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 23:35:19,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:20,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:23,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:26,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:27,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:27,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:35:29,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:35:29,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:31,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 23:35:31,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 23:35:32,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:35:32,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:35:34,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:35:34,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:35:34,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:35:36,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:35:36,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:35:40,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:35:43,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:44,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:35:47,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 23:35:48,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:48,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:35:48,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:35:50,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:54,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:35:54,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:35:54,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:35:56,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:35:56,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:36:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:00,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:36:01,344 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.838e+02 2.029e+02 2.262e+02 3.175e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 23:36:01,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:36:03,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 23:36:04,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 23:36:04,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:04,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:06,108 INFO [train.py:1046] (1/4) Epoch 30, batch 4850, loss[loss=0.1829, simple_loss=0.2475, pruned_loss=0.05918, over 23819.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2435, pruned_loss=0.04335, over 4732511.81 frames. ], batch size: 164, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:36:06,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:36:06,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:09,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:36:16,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. 
Duration: 0.51 2023-10-02 23:36:16,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1059346.6666666667, ans=0.1 2023-10-02 23:36:17,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:36:23,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:36:23,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:26,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:28,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:36:29,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:36:29,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 23:36:33,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:36:36,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:36:36,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-02 23:36:37,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:36:37,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 23:36:39,650 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.62 vs. limit=15.0 2023-10-02 23:36:40,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:36:40,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:36:46,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:36:46,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 23:36:46,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 23:36:48,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:36:54,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:36:56,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 23:36:56,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:36:56,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:36:57,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:36:59,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 23:36:59,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:02,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 23:37:02,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:04,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:05,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 23:37:12,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:17,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:37:19,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:19,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.41 vs. 
limit=15.0 2023-10-02 23:37:20,375 INFO [train.py:1046] (1/4) Epoch 30, batch 4900, loss[loss=0.1698, simple_loss=0.2524, pruned_loss=0.04363, over 23996.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2421, pruned_loss=0.04312, over 4712281.44 frames. ], batch size: 80, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:37:22,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1059680.0, ans=0.125 2023-10-02 23:37:23,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 23:37:23,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:37:28,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:28,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:37:32,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 23:37:38,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 23:37:42,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 23:37:42,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. 
Duration: 0.91 2023-10-02 23:37:42,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:37:42,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:43,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1059746.6666666667, ans=0.125 2023-10-02 23:37:44,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:37:44,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:44,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:37:44,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 23:37:47,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 23:37:48,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:37:48,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:37:50,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:37:51,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-02 23:37:53,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:55,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:55,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 23:37:57,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:37:58,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:58,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 23:37:58,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 23:38:00,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 23:38:03,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:38:04,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:38:04,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:38:04,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:38:06,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 23:38:07,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:38:08,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 23:38:11,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:12,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:38:13,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:38:18,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 23:38:19,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:38:20,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 23:38:20,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 23:38:27,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:38:27,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1059946.6666666667, ans=0.125 2023-10-02 23:38:28,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:38:30,098 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.877e+02 2.022e+02 2.304e+02 3.994e+02, threshold=4.045e+02, percent-clipped=0.0 2023-10-02 23:38:30,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 23:38:30,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:38:30,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:38:31,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:34,467 INFO [train.py:1046] (1/4) Epoch 30, batch 4950, loss[loss=0.1565, simple_loss=0.2337, pruned_loss=0.03969, over 21610.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2419, pruned_loss=0.04254, over 4727064.64 frames. ], batch size: 47, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:38:34,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:38:34,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:38:34,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:38:35,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 23:38:37,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:38:42,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:38:42,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:38:45,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 23:38:45,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 23:38:45,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:38:46,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 23:38:46,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:46,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:38:46,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:38:48,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:38:49,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:51,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:38:51,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:38:52,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:38:54,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:54,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:38:56,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:39:02,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:03,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1060146.6666666667, ans=6.0 2023-10-02 23:39:04,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:39:05,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:39:07,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:08,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:39:08,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 23:39:10,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 23:39:10,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1060146.6666666667, ans=0.0 2023-10-02 23:39:13,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:13,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1060146.6666666667, ans=0.0 2023-10-02 23:39:15,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:39:15,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:39:16,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:39:16,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:39:17,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. 
Duration: 0.7 2023-10-02 23:39:21,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:39:22,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:39:23,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:39:26,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:26,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:27,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 23:39:28,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:39:29,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:39:32,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:39:34,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:39:34,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:39:35,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:39:35,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:39:35,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:39:35,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1060280.0, ans=0.125 2023-10-02 23:39:38,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:39:40,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:39:40,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:39:40,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1060280.0, ans=0.0 2023-10-02 23:39:41,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 23:39:42,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1060280.0, ans=0.125 2023-10-02 23:39:44,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:39:48,821 INFO [train.py:1046] (1/4) Epoch 30, batch 5000, loss[loss=0.1513, simple_loss=0.2438, pruned_loss=0.0294, over 24318.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04226, over 4729419.30 frames. 
], batch size: 74, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:39:50,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 23:39:50,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:39:50,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1060346.6666666667, ans=0.125 2023-10-02 23:39:57,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:57,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:39:59,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 23:40:00,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 23:40:04,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:40:04,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 23:40:05,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:40:05,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:40:06,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. 
Duration: 0.94 2023-10-02 23:40:06,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:06,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:40:08,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 23:40:08,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:40:09,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:40:09,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 23:40:10,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 23:40:12,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:40:14,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 23:40:14,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:40:14,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:14,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-02 23:40:14,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 23:40:14,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 23:40:15,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1060413.3333333333, ans=0.125 2023-10-02 23:40:18,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 23:40:18,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:19,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 23:40:19,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:40:21,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:21,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:40:22,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 23:40:25,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. 
Duration: 0.79 2023-10-02 23:40:25,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:40:25,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1060480.0, ans=0.125 2023-10-02 23:40:28,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:40:31,148 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 23:40:34,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:40:34,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:34,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:40:36,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=22.5 2023-10-02 23:40:37,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 23:40:37,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:38,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:40:38,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:40:40,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 23:40:41,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:40:45,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:40:46,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:40:50,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 23:40:55,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:40:58,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.837e+02 2.079e+02 2.443e+02 4.073e+02, threshold=4.157e+02, percent-clipped=1.0 2023-10-02 23:41:01,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1060680.0, ans=0.125 2023-10-02 23:41:02,691 INFO [train.py:1046] (1/4) Epoch 30, batch 5050, loss[loss=0.1957, simple_loss=0.2543, pruned_loss=0.06856, over 19321.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2419, pruned_loss=0.04248, over 4722070.94 frames. ], batch size: 388, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:41:04,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:41:04,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:41:05,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:41:05,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:05,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:41:06,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:41:06,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:11,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:11,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 23:41:12,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:41:15,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:17,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:41:17,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 23:41:18,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:41:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:41:19,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1060746.6666666667, ans=0.125 2023-10-02 23:41:20,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:41:21,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:41:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:41:32,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 23:41:32,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:41:34,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:41:34,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 23:41:34,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:41:34,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:35,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:41:37,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:41:37,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 23:41:37,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 23:41:38,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:39,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:41:44,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:44,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 23:41:47,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:41:48,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 23:41:48,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:41:48,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-02 23:41:49,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1060880.0, ans=0.125 2023-10-02 23:41:50,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:41:51,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:41:51,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:41:55,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:41:55,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:56,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:56,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:41:56,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 23:41:57,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:41:59,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:42:05,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:42:05,263 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 23:42:05,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:42:06,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:42:06,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:07,992 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 23:42:09,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:42:09,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 23:42:09,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:13,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:42:13,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:13,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 23:42:15,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. 
Duration: 0.54 2023-10-02 23:42:16,885 INFO [train.py:1046] (1/4) Epoch 30, batch 5100, loss[loss=0.1713, simple_loss=0.2583, pruned_loss=0.04209, over 24625.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2425, pruned_loss=0.04241, over 4722061.83 frames. ], batch size: 73, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:42:18,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:18,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:18,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:42:19,871 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 23:42:21,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:42:25,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 23:42:25,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 23:42:27,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:27,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:42:30,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:42:31,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 23:42:31,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 23:42:36,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:42:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:42:40,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:42,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 23:42:44,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:47,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:42:49,588 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:42:50,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:52,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:42:52,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 23:42:53,633 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 23:42:54,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:56,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 23:42:56,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 23:42:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:43:06,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1061213.3333333333, ans=0.0 2023-10-02 23:43:07,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:07,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1061213.3333333333, ans=0.125 2023-10-02 23:43:08,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 23:43:08,806 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 23:43:08,815 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. 
Duration: 0.896 2023-10-02 23:43:11,026 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.67 vs. limit=15.0 2023-10-02 23:43:11,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 23:43:11,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:43:15,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 23:43:18,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 23:43:20,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:43:20,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:43:23,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 23:43:25,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:43:25,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 23:43:28,612 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.879e+02 2.162e+02 2.700e+02 3.768e+02, threshold=4.325e+02, percent-clipped=0.0 2023-10-02 23:43:31,375 INFO [train.py:1046] (1/4) Epoch 30, batch 5150, loss[loss=0.2164, simple_loss=0.2769, pruned_loss=0.07798, over 19390.00 frames. 
], tot_loss[loss=0.1642, simple_loss=0.2431, pruned_loss=0.04265, over 4725983.19 frames. ], batch size: 388, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:43:32,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:43:32,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:43:32,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:43:34,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:43:34,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:43:36,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:43:36,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 23:43:36,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 23:43:36,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 23:43:36,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:43:36,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. 
Duration: 0.97 2023-10-02 23:43:39,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:39,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 23:43:40,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:43:41,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:43:44,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:43:44,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 23:43:46,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:47,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:43:49,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:43:49,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:43:49,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:43:50,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 23:43:50,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:43:50,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 23:43:52,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:43:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:43:53,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:43:56,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 23:43:56,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:43:58,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1061413.3333333333, ans=0.125 2023-10-02 23:44:02,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:44:05,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 23:44:07,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:44:15,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:44:16,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:44:20,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:44:21,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:44:24,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 23:44:26,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:44:27,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:44:27,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:44:32,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:44:34,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:44:34,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 23:44:38,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:44:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-02 23:44:43,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:44:43,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:44:45,159 INFO [train.py:1046] (1/4) Epoch 30, batch 5200, loss[loss=0.1619, simple_loss=0.2501, pruned_loss=0.03683, over 24529.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2431, pruned_loss=0.04256, over 4736742.20 frames. ], batch size: 71, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:44:45,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:44:45,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:44:45,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:44:45,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:44:48,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1061680.0, ans=0.125 2023-10-02 23:44:49,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:44:50,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:44:52,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:44:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 23:44:58,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:45:00,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:01,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:03,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:45:03,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:03,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 23:45:05,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1061746.6666666667, ans=0.0 2023-10-02 23:45:06,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:45:06,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:09,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-10-02 23:45:11,043 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:45:12,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:45:13,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:45:13,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 23:45:13,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 23:45:13,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1061813.3333333333, ans=0.125 2023-10-02 23:45:16,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 23:45:17,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:17,500 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 23:45:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:18,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:18,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 23:45:20,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 23:45:20,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:45:23,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:26,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 23:45:26,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 23:45:26,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 23:45:32,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 23:45:33,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:45:38,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:45:38,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:45:40,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 23:45:41,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:45:41,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 23:45:41,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:41,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:45:45,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:45:45,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:45:49,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:51,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:45:51,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:55,888 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.943e+02 2.112e+02 2.505e+02 3.885e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-02 23:45:56,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:45:57,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-10-02 23:45:57,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:45:57,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:45:59,156 INFO [train.py:1046] (1/4) Epoch 30, batch 5250, loss[loss=0.1645, simple_loss=0.2401, pruned_loss=0.04447, over 23737.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2427, pruned_loss=0.0427, over 4727377.36 frames. ], batch size: 149, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:45:59,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:46:00,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:46:00,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:46:03,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:46:05,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:46:07,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:46:08,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:46:11,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:46:11,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1062013.3333333333, ans=0.125 2023-10-02 23:46:13,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1062080.0, ans=0.0 2023-10-02 23:46:14,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:46:16,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:46:19,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:46:19,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 23:46:19,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:46:21,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:46:53,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1062280.0, ans=0.035 2023-10-02 23:46:53,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1062280.0, ans=0.125 2023-10-02 23:47:05,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1062280.0, ans=0.2 2023-10-02 23:47:08,002 INFO [train.py:1046] (1/4) Epoch 30, batch 5300, loss[loss=0.1569, simple_loss=0.2161, pruned_loss=0.04884, over 22801.00 frames. 
], tot_loss[loss=0.1632, simple_loss=0.2413, pruned_loss=0.04259, over 4733722.84 frames. ], batch size: 322, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:47:13,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1062346.6666666667, ans=0.125 2023-10-02 23:47:21,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:47:21,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 23:47:21,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 23:47:21,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:22,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:22,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:22,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:22,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:22,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:22,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. 
Duration: 0.97275 2023-10-02 23:47:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:47:22,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:47:23,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 23:47:23,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 23:47:23,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 23:47:23,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:47:23,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 23:47:23,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 23:47:23,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:23,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:23,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:47:23,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:47:23,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:47:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:47:24,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:24,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:24,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:47:24,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:24,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:47:24,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:24,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:47:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 23:47:25,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:47:25,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. 
Duration: 0.87275 2023-10-02 23:47:25,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 23:47:25,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 23:47:25,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:47:25,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:47:25,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 23:47:25,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 23:47:25,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:47:26,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:47:26,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:47:26,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 23:47:26,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 23:47:27,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. 
Duration: 0.72225 2023-10-02 23:47:27,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:27,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 23:47:27,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 23:47:27,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 23:47:27,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:47:31,466 INFO [train.py:1046] (1/4) Epoch 31, batch 0, loss[loss=0.1475, simple_loss=0.229, pruned_loss=0.03301, over 24336.00 frames. ], tot_loss[loss=0.1475, simple_loss=0.229, pruned_loss=0.03301, over 24336.00 frames. ], batch size: 61, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:47:31,466 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-02 23:47:41,689 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.3045, 2.2158, 2.1321, 2.2316, 1.9404, 2.2660, 1.9472, 2.1362], device='cuda:1') 2023-10-02 23:47:42,264 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.7337, 4.3203, 4.7270, 4.5732], device='cuda:1') 2023-10-02 23:47:43,365 INFO [train.py:1078] (1/4) Epoch 31, validation: loss=0.3244, simple_loss=0.2676, pruned_loss=0.1906, over 1125622.00 frames. 2023-10-02 23:47:43,366 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-02 23:47:43,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-10-02 23:47:44,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:47:47,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:47:52,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:47:52,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:47:53,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:53,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 23:47:55,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 23:47:58,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:58,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:48:01,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:48:01,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:02,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-02 23:48:02,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:48:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 23:48:06,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:48:15,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:48:15,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:18,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 23:48:22,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-10-02 23:48:22,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:48:22,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:48:24,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:48:28,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:48:33,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 23:48:33,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1062626.6666666667, ans=10.0 2023-10-02 23:48:37,457 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.906e+02 2.085e+02 2.438e+02 4.411e+02, threshold=4.170e+02, percent-clipped=1.0 2023-10-02 23:48:37,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 23:48:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 23:48:40,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:48:40,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:41,303 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.02 vs. limit=15.0 2023-10-02 23:48:42,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:48:42,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:44,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 23:48:46,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:48:46,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1062693.3333333333, ans=0.125 2023-10-02 23:48:47,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:50,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:48:51,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1062693.3333333333, ans=0.2 2023-10-02 23:48:55,878 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 23:48:56,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1062760.0, ans=10.0 2023-10-02 23:48:57,123 INFO [train.py:1046] (1/4) Epoch 31, batch 50, loss[loss=0.1644, simple_loss=0.2385, pruned_loss=0.04515, over 23778.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2417, pruned_loss=0.04336, over 1057406.51 frames. ], batch size: 212, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:48:57,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:48:57,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1062760.0, ans=0.1 2023-10-02 23:49:00,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:49:03,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:49:03,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 23:49:04,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:49:04,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:49:06,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:07,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:10,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:49:12,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 23:49:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:19,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:49:20,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 23:49:21,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 23:49:23,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-02 23:49:25,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:49:25,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:27,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:49:27,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:49:28,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:49:28,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:33,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1062893.3333333333, ans=0.125 2023-10-02 23:49:36,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:49:37,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:49:37,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:49:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 23:49:40,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-02 23:49:41,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:49:41,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 23:49:41,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:44,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 23:49:50,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:49:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:49:51,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:54,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:49:54,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:49:56,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 23:49:56,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 23:49:58,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. 
Duration: 0.963625 2023-10-02 23:49:58,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:49:59,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:49:59,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:59,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 23:50:00,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 23:50:02,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 23:50:03,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:03,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:50:05,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 23:50:05,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 23:50:05,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:06,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 23:50:08,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:50:08,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:50:10,743 INFO [train.py:1046] (1/4) Epoch 31, batch 100, loss[loss=0.1719, simple_loss=0.2533, pruned_loss=0.04522, over 23891.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2428, pruned_loss=0.04253, over 1886121.78 frames. ], batch size: 86, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:50:10,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:50:15,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:50:18,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:50:19,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 23:50:19,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:50:23,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:50:25,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:50:25,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. 
Duration: 0.72725 2023-10-02 23:50:25,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:50:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:50:25,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 23:50:28,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:50:29,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:30,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:50:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:50:32,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1063160.0, ans=0.125 2023-10-02 23:50:33,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 23:50:33,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:34,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:50:36,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-02 23:50:39,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:50:42,223 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 23:50:42,236 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 23:50:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:50:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:50:48,060 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.33 vs. limit=12.0 2023-10-02 23:50:48,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:50:51,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:53,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:50:56,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:50:57,777 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 23:50:59,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-10-02 23:51:03,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1063293.3333333333, ans=0.125 2023-10-02 23:51:04,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:51:06,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:51:07,427 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.861e+02 2.149e+02 2.468e+02 3.325e+02, threshold=4.297e+02, percent-clipped=0.0 2023-10-02 23:51:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:10,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:11,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:51:13,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:51:15,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:17,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:18,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:51:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:51:20,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:21,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 23:51:21,485 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 23:51:21,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:21,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:51:22,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:22,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:23,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 23:51:23,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:51:23,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:51:24,292 INFO [train.py:1046] (1/4) Epoch 31, batch 150, loss[loss=0.1655, simple_loss=0.2377, pruned_loss=0.04667, over 23601.00 frames. 
], tot_loss[loss=0.1638, simple_loss=0.2432, pruned_loss=0.0422, over 2523321.07 frames. ], batch size: 134, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:51:24,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:24,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:26,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:26,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:51:26,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:51:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:32,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:51:32,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:51:32,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:35,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:35,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:51:38,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:51:39,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:39,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1063493.3333333333, ans=0.0 2023-10-02 23:51:39,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.73 vs. limit=15.0 2023-10-02 23:51:43,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 23:51:44,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 23:51:44,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 23:51:47,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:51:47,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:51:48,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:51:49,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:49,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:51:49,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:49,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:51,359 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 23:51:54,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:54,948 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.72 vs. limit=15.0 2023-10-02 23:52:00,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:52:03,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:52:05,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 23:52:06,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1063560.0, ans=0.125 2023-10-02 23:52:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:52:07,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:52:07,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. 
Duration: 0.97775 2023-10-02 23:52:11,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:52:11,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:52:12,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:52:13,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:14,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 23:52:15,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1063626.6666666667, ans=0.0 2023-10-02 23:52:18,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:18,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:19,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:52:19,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:52:20,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:22,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-02 23:52:23,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:52:23,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:52:25,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:52:28,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:52:28,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 23:52:28,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:52:28,493 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 23:52:28,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1063693.3333333333, ans=0.0 2023-10-02 23:52:33,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:34,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1063693.3333333333, ans=0.0 2023-10-02 23:52:37,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:52:37,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 23:52:39,132 INFO [train.py:1046] (1/4) Epoch 31, batch 200, loss[loss=0.1392, simple_loss=0.2218, pruned_loss=0.02827, over 24465.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2445, pruned_loss=0.04383, over 2998444.47 frames. ], batch size: 58, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:52:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 23:52:40,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1063760.0, ans=0.0 2023-10-02 23:52:41,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:52:41,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:44,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 23:52:46,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:52:47,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:47,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:51,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:52:51,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:52:51,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:11,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:53:12,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:53:14,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:53:14,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:53:14,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1063893.3333333333, ans=0.035 2023-10-02 23:53:15,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:53:15,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:53:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:17,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:53:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-02 23:53:18,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:53:20,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 23:53:20,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:53:21,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:25,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:53:29,033 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.87 vs. limit=15.0 2023-10-02 23:53:31,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:53:34,995 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.946e+02 2.118e+02 2.313e+02 3.373e+02, threshold=4.235e+02, percent-clipped=0.0 2023-10-02 23:53:38,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:38,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:53:44,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-02 23:53:45,107 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.23 vs. limit=6.0 2023-10-02 23:53:45,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 23:53:47,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:53:48,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:53:48,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:53:50,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 23:53:50,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1064026.6666666667, ans=0.04949747468305833 2023-10-02 23:53:51,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:53:51,369 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 23:53:52,681 INFO [train.py:1046] (1/4) Epoch 31, batch 250, loss[loss=0.1616, simple_loss=0.2381, pruned_loss=0.04259, over 23296.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2435, pruned_loss=0.04368, over 3382464.44 frames. 
], batch size: 119, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:53:52,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:54,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:53:55,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:55,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:56,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:53:57,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:54:00,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:54:03,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:54:03,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1064093.3333333333, ans=0.04949747468305833 2023-10-02 23:54:04,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1064093.3333333333, ans=0.1 2023-10-02 23:54:16,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:54:17,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:54:17,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:54:24,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:54:24,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:54:26,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:54:26,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:54:27,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:54:27,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:54:28,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:54:32,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:54:35,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 23:54:35,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-02 23:54:37,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:54:37,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:54:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:54:39,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:54:39,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:54:39,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:54:42,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:54:44,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:54:44,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:54:48,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:54:51,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:54:54,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-02 23:54:57,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1064360.0, ans=0.1 2023-10-02 23:54:58,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:55:00,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:55:05,077 INFO [train.py:1046] (1/4) Epoch 31, batch 300, loss[loss=0.1515, simple_loss=0.2335, pruned_loss=0.0347, over 24667.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2423, pruned_loss=0.04315, over 3686381.72 frames. ], batch size: 65, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:55:05,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 23:55:07,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:55:07,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:55:09,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 23:55:10,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:55:10,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:55:10,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-10-02 23:55:15,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:55:17,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:55:20,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:55:20,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 23:55:21,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:55:22,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:55:23,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 23:55:23,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:55:24,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1064493.3333333333, ans=0.1 2023-10-02 23:55:27,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:55:30,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:55:31,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. 
Duration: 0.88 2023-10-02 23:55:33,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 23:55:34,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:37,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:55:39,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:39,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 23:55:39,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:55:39,559 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:55:42,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:55:45,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:55:45,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:55:48,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:55:48,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-02 23:55:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:55:52,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:52,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 23:55:53,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:55:56,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:55:59,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:55:59,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 23:56:02,025 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.918e+02 2.245e+02 2.640e+02 3.587e+02, threshold=4.490e+02, percent-clipped=0.0 2023-10-02 23:56:03,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:03,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:56:05,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:08,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. 
Duration: 0.863625 2023-10-02 23:56:08,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 23:56:08,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:56:08,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:11,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 23:56:11,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1064693.3333333333, ans=0.2 2023-10-02 23:56:12,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:12,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:14,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:56:16,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:16,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:19,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1064760.0, ans=0.0 2023-10-02 23:56:20,428 INFO [train.py:1046] (1/4) Epoch 31, batch 350, loss[loss=0.1606, simple_loss=0.2496, pruned_loss=0.03579, over 24680.00 frames. 
], tot_loss[loss=0.1625, simple_loss=0.2402, pruned_loss=0.04245, over 3910535.73 frames. ], batch size: 73, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:56:20,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:56:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 23:56:24,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:29,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1064760.0, ans=0.035 2023-10-02 23:56:30,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:56:31,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:31,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:31,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1064760.0, ans=0.125 2023-10-02 23:56:34,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 23:56:36,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:56:37,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. 
Duration: 0.6 2023-10-02 23:56:40,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:40,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 23:56:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:45,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 23:56:47,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:56:47,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:48,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:56:50,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:56:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:56:50,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:56:50,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:51,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. 
Duration: 0.611125 2023-10-02 23:56:53,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:56:53,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:57,883 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.88 vs. limit=5.0 2023-10-02 23:56:59,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:56:59,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:57:01,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:57:01,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:03,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1064960.0, ans=0.125 2023-10-02 23:57:05,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 23:57:07,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:57:11,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:11,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-02 23:57:11,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:57:12,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1064960.0, ans=22.5 2023-10-02 23:57:13,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 23:57:14,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1064960.0, ans=0.0 2023-10-02 23:57:16,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:17,980 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 23:57:18,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 23:57:18,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:22,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:57:22,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 23:57:24,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:27,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-02 23:57:27,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:29,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:29,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:30,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:33,328 INFO [train.py:1046] (1/4) Epoch 31, batch 400, loss[loss=0.1728, simple_loss=0.2595, pruned_loss=0.04305, over 24349.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2409, pruned_loss=0.04233, over 4106023.24 frames. ], batch size: 74, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:57:33,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:57:34,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:57:36,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 23:57:36,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:37,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:57:40,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-02 23:57:41,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:42,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.49 vs. limit=15.0 2023-10-02 23:57:43,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:45,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:46,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 23:57:48,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 23:57:48,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:57:49,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 23:57:51,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:53,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:57:53,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:53,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. 
Duration: 0.82 2023-10-02 23:57:53,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:57:53,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:53,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:58,183 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 23:57:58,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 23:58:03,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:58:03,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:58:04,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1065226.6666666667, ans=0.0 2023-10-02 23:58:05,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 23:58:06,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 23:58:09,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-02 23:58:11,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:58:19,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 23:58:22,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:58:23,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 23:58:26,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:58:26,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:58:28,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 23:58:29,308 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.935e+02 2.133e+02 2.545e+02 3.728e+02, threshold=4.266e+02, percent-clipped=0.0 2023-10-02 23:58:29,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1065293.3333333333, ans=0.125 2023-10-02 23:58:30,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:58:33,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:58:34,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-02 23:58:37,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:58:39,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 23:58:40,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:58:42,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 23:58:44,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:58:44,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:58:44,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1065360.0, ans=0.125 2023-10-02 23:58:46,828 INFO [train.py:1046] (1/4) Epoch 31, batch 450, loss[loss=0.1694, simple_loss=0.2571, pruned_loss=0.04087, over 24641.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.242, pruned_loss=0.04271, over 4239967.79 frames. ], batch size: 71, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:58:46,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 23:58:48,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:58:48,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-02 23:58:50,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 23:58:50,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 23:58:51,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:58:53,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:58:53,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:58:53,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 23:58:54,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:58:55,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:58:57,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:59:00,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.57 vs. limit=6.0 2023-10-02 23:59:05,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-10-02 23:59:05,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:07,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 23:59:07,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 23:59:10,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:59:11,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1065493.3333333333, ans=0.0 2023-10-02 23:59:13,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:15,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:59:19,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:59:19,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:59:22,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 23:59:22,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 23:59:24,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. 
Duration: 0.96 2023-10-02 23:59:25,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:59:26,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:59:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:59:28,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 23:59:28,223 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 23:59:28,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:29,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:59:30,137 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=15.0 2023-10-02 23:59:32,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:59:32,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:59:34,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:59:34,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-10-02 23:59:34,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 23:59:36,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:39,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:59:40,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:59:40,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1065626.6666666667, ans=0.5 2023-10-02 23:59:42,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 23:59:43,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1065626.6666666667, ans=0.0 2023-10-02 23:59:46,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:59:46,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 23:59:49,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 23:59:49,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1065693.3333333333, ans=0.125 2023-10-02 23:59:50,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-02 23:59:55,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:59:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:59:56,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1065693.3333333333, ans=0.1 2023-10-02 23:59:58,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:59:58,286 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 00:00:00,935 INFO [train.py:1046] (1/4) Epoch 31, batch 500, loss[loss=0.1616, simple_loss=0.2451, pruned_loss=0.03907, over 24481.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2427, pruned_loss=0.0428, over 4348110.09 frames. ], batch size: 66, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:00:01,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:00:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:00:02,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:02,404 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 00:00:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-10-03 00:00:05,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:00:11,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:00:11,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:00:12,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1065760.0, ans=0.0 2023-10-03 00:00:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:00:14,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:00:14,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:27,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:27,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:00:27,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:00:27,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:00:28,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 00:00:28,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:00:32,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:00:32,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:00:32,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:00:32,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:34,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 00:00:36,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.43 vs. limit=10.0 2023-10-03 00:00:37,112 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 00:00:38,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:00:40,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:00:41,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:00:44,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 00:00:47,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:00:48,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1065960.0, ans=0.125 2023-10-03 00:00:49,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:00:49,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1065960.0, ans=0.125 2023-10-03 00:00:54,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:57,866 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.909e+02 2.161e+02 2.446e+02 3.681e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-03 00:00:58,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:01:02,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1066026.6666666667, ans=0.07 2023-10-03 00:01:03,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:01:06,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 00:01:06,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:06,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:01:09,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 00:01:10,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:01:11,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:13,786 INFO [train.py:1046] (1/4) Epoch 31, batch 550, loss[loss=0.167, simple_loss=0.2403, pruned_loss=0.04688, over 23596.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2431, pruned_loss=0.04303, over 4422480.91 frames. ], batch size: 149, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:01:16,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 00:01:17,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. 
Duration: 0.98 2023-10-03 00:01:18,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:01:19,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 00:01:20,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:01:20,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:01:20,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:21,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:22,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:01:24,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:01:25,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:25,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 00:01:25,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:01:29,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:01:30,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:31,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:01:33,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:37,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 00:01:37,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 00:01:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:01:42,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1066226.6666666667, ans=0.125 2023-10-03 00:01:44,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:01:45,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:01:45,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:01:45,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1066226.6666666667, ans=0.0 2023-10-03 00:01:48,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:01:48,699 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 00:01:50,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:51,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:01:55,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:01:55,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1066226.6666666667, ans=0.1 2023-10-03 00:01:56,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:01:56,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:01:57,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:59,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 00:02:01,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 00:02:02,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:02,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:02:02,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:02:02,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:02:07,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:02:07,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:02:07,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-10-03 00:02:09,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:02:11,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:11,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 00:02:13,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:02:14,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:14,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 00:02:16,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:18,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:02:18,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 00:02:22,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1066360.0, ans=0.125 2023-10-03 00:02:22,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1066360.0, ans=0.125 2023-10-03 00:02:25,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 00:02:26,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 00:02:28,024 INFO [train.py:1046] (1/4) Epoch 31, batch 600, loss[loss=0.1682, simple_loss=0.2578, pruned_loss=0.03931, over 24411.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2442, pruned_loss=0.04336, over 4489332.50 frames. ], batch size: 69, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:02:29,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:02:29,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:02:29,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:02:34,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1066426.6666666667, ans=0.125 2023-10-03 00:02:35,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:02:39,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:02:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 00:02:42,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:02:43,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:02:45,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:49,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 00:02:49,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:02:54,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.80 vs. limit=6.0 2023-10-03 00:02:55,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-10-03 00:02:57,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:02:57,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:59,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:03:04,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:03:04,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:03:04,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:10,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:03:15,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:15,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:03:15,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:03:20,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-10-03 00:03:27,629 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.826e+02 1.991e+02 2.219e+02 3.636e+02, threshold=3.983e+02, percent-clipped=0.0 2023-10-03 00:03:27,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:03:27,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:03:30,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 00:03:32,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:03:32,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=15.0 2023-10-03 00:03:33,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 00:03:34,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:03:34,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:03:42,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-03 00:03:42,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1066760.0, ans=0.125 2023-10-03 00:03:43,533 INFO [train.py:1046] (1/4) Epoch 31, batch 650, loss[loss=0.1713, simple_loss=0.2606, pruned_loss=0.04099, over 23975.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2426, pruned_loss=0.04314, over 4520778.82 frames. ], batch size: 80, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:03:43,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:03:45,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:03:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:03:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:03:50,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1066760.0, ans=0.0 2023-10-03 00:03:53,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 00:03:53,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:03:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 00:04:01,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:06,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 00:04:08,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:04:08,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:04:11,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1066826.6666666667, ans=0.0 2023-10-03 00:04:12,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:04:12,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:04:15,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:15,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:16,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:04:18,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:19,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 00:04:21,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:04:21,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 00:04:21,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1066893.3333333333, ans=0.125 2023-10-03 00:04:22,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:22,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:04:25,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:25,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:04:25,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:26,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:04:28,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 00:04:28,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:04:29,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 00:04:31,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:04:32,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:04:33,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:04:34,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 00:04:35,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 00:04:35,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:35,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:04:36,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:04:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:04:37,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:04:45,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:46,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:04:46,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:50,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:50,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:04:50,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:52,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1067026.6666666667, ans=0.2 2023-10-03 00:04:54,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1067026.6666666667, ans=0.125 2023-10-03 00:04:57,406 INFO [train.py:1046] (1/4) Epoch 31, batch 700, loss[loss=0.1581, simple_loss=0.23, pruned_loss=0.0431, over 23867.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2422, pruned_loss=0.04281, over 4570919.94 frames. ], batch size: 195, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:04:57,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:04:57,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:04:58,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:04:58,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:05:02,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 00:05:04,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 00:05:07,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 00:05:07,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:08,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:05:08,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 00:05:13,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:05:14,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:05:15,317 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.92 vs. limit=6.0 2023-10-03 00:05:16,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:16,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1067160.0, ans=0.125 2023-10-03 00:05:19,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 00:05:21,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:05:22,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:25,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 00:05:25,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:05:25,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1067226.6666666667, ans=0.2 2023-10-03 00:05:26,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 00:05:29,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 00:05:33,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:05:34,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:05:35,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:05:39,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:05:40,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-10-03 00:05:47,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:47,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:05:47,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 00:05:49,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:05:51,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:54,705 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.944e+02 2.173e+02 2.571e+02 3.380e+02, threshold=4.347e+02, percent-clipped=0.0 2023-10-03 00:05:54,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:05:59,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1067360.0, ans=0.1 2023-10-03 00:06:00,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:06:00,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 00:06:03,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 00:06:03,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. 
Duration: 0.9 2023-10-03 00:06:05,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:07,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:06:08,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:06:09,788 INFO [train.py:1046] (1/4) Epoch 31, batch 750, loss[loss=0.164, simple_loss=0.2547, pruned_loss=0.03668, over 24437.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2417, pruned_loss=0.04246, over 4612913.84 frames. ], batch size: 69, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:06:11,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:12,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 00:06:12,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1067426.6666666667, ans=0.0 2023-10-03 00:06:15,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 00:06:15,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 00:06:17,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 00:06:17,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. 
Duration: 0.91 2023-10-03 00:06:19,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 00:06:19,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:06:20,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 00:06:20,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:22,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:06:24,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:25,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:06:25,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:06:26,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:06:28,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:06:29,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:06:31,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 00:06:33,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:33,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:06:33,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 00:06:35,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:06:36,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:38,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:39,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:06:40,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 00:06:40,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:06:42,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 00:06:42,462 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 00:06:43,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-10-03 00:06:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:06:43,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:06:44,496 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.86 vs. limit=15.0 2023-10-03 00:06:47,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:06:47,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1067560.0, ans=0.1 2023-10-03 00:06:50,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1067560.0, ans=0.125 2023-10-03 00:06:50,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-10-03 00:06:54,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:06:55,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:06:55,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 00:06:55,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1067626.6666666667, ans=0.2 2023-10-03 00:06:57,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1067626.6666666667, ans=0.2 2023-10-03 00:06:58,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:58,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1067626.6666666667, ans=0.0 2023-10-03 00:07:00,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:00,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 00:07:00,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:07:02,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 00:07:02,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:07:02,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1067626.6666666667, ans=0.04949747468305833 2023-10-03 00:07:05,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 00:07:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 00:07:05,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:10,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:12,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:07:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:13,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:07:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 00:07:18,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:07:18,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:20,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:20,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:23,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:07:25,401 INFO [train.py:1046] (1/4) Epoch 31, batch 800, loss[loss=0.1978, simple_loss=0.2558, pruned_loss=0.06993, over 19192.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2424, pruned_loss=0.04232, over 4643483.25 frames. ], batch size: 388, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:07:25,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:07:31,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:31,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:32,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:07:32,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:35,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:35,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:36,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:40,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:07:40,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1067826.6666666667, ans=0.125 2023-10-03 00:07:41,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:07:44,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 00:07:45,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:46,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:46,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:07:48,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:07:48,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 00:07:48,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:48,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 00:07:50,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1067826.6666666667, ans=0.2 2023-10-03 00:07:52,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:07:53,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-10-03 00:07:56,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:57,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.whiten.whitening_limit, batch_count=1067893.3333333333, ans=12.0 2023-10-03 00:07:57,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:57,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:07:59,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:08:03,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:08:05,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:08:05,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. 
Duration: 0.411125 2023-10-03 00:08:05,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1067893.3333333333, ans=0.125 2023-10-03 00:08:06,452 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 00:08:07,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 00:08:07,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:08:07,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:08:09,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:09,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:08:13,910 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 00:08:15,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 00:08:15,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:08:16,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1067960.0, ans=0.1 2023-10-03 00:08:18,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 00:08:22,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 2.007e+02 2.299e+02 2.659e+02 4.036e+02, threshold=4.599e+02, percent-clipped=0.0 2023-10-03 00:08:23,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1068026.6666666667, ans=0.2 2023-10-03 00:08:24,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:08:27,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:08:28,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 00:08:29,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:08:32,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 00:08:37,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:08:39,172 INFO [train.py:1046] (1/4) Epoch 31, batch 850, loss[loss=0.1387, simple_loss=0.2188, pruned_loss=0.02933, over 24622.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2428, pruned_loss=0.04244, over 4661565.93 frames. ], batch size: 60, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:08:39,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:08:40,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-10-03 00:08:40,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:08:41,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:43,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 00:08:43,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:08:44,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:08:46,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:08:48,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:08:48,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:08:49,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 00:08:51,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 00:08:51,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 00:08:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 00:08:53,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:08:53,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1068160.0, ans=0.1 2023-10-03 00:08:53,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1068160.0, ans=0.0 2023-10-03 00:08:54,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:08:54,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:55,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:09:00,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:09:01,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:01,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 00:09:03,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 00:09:09,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:09:09,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-03 00:09:13,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 00:09:13,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 00:09:15,111 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 00:09:16,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:09:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:09:16,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:09:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:19,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:19,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 00:09:23,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:09:23,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:24,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 00:09:24,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:09:25,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:09:27,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:09:28,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 00:09:32,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:09:32,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:09:33,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:09:33,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:09:33,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:37,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:39,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:09:41,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 00:09:41,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:09:43,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:09:50,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:09:52,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:09:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 00:09:52,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:09:53,679 INFO [train.py:1046] (1/4) Epoch 31, batch 900, loss[loss=0.1925, simple_loss=0.2658, pruned_loss=0.05957, over 22761.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2445, pruned_loss=0.04353, over 4649820.35 frames. ], batch size: 322, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:09:53,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:09:54,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.29 vs. limit=15.0 2023-10-03 00:09:55,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 00:10:03,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:10:06,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:10:07,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 00:10:10,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:10:10,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 00:10:11,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 00:10:12,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:10:12,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:13,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:10:13,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:10:21,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:10:21,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:10:21,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 00:10:26,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:30,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 00:10:31,842 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.08 vs. limit=15.0 2023-10-03 00:10:32,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:10:36,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:10:37,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:10:37,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1068626.6666666667, ans=0.0 2023-10-03 00:10:38,445 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 00:10:38,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 00:10:42,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:10:42,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 00:10:42,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1068626.6666666667, ans=0.0 2023-10-03 00:10:44,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:10:50,572 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.883e+02 2.040e+02 2.401e+02 4.175e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 00:10:50,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:10:50,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:10:53,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 00:10:53,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:56,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 00:10:58,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:10:58,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:00,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:11:00,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:06,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 00:11:06,980 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 00:11:08,235 INFO [train.py:1046] (1/4) Epoch 31, batch 950, loss[loss=0.1627, simple_loss=0.2312, pruned_loss=0.04715, over 23406.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2446, pruned_loss=0.04368, over 4668831.90 frames. ], batch size: 285, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:11:08,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 00:11:08,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 00:11:09,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1068760.0, ans=0.0 2023-10-03 00:11:11,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:13,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 00:11:16,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:18,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:11:19,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:19,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:11:22,366 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 00:11:25,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:25,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:11:27,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:28,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:11:28,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 00:11:30,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:11:30,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:32,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 00:11:32,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 00:11:32,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1068826.6666666667, ans=0.1 2023-10-03 00:11:35,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:35,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:11:36,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:36,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 00:11:39,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 00:11:40,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:11:42,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:11:47,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:11:47,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:50,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 00:11:54,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-03 00:11:54,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:11:54,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:11:56,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:56,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:12:00,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 00:12:01,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:12:03,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:12:04,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 00:12:04,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:12:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:12:05,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-03 00:12:06,242 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:12:08,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:12:11,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:12:15,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:12:16,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 00:12:16,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 00:12:20,944 INFO [train.py:1046] (1/4) Epoch 31, batch 1000, loss[loss=0.144, simple_loss=0.2194, pruned_loss=0.03433, over 24296.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2432, pruned_loss=0.04295, over 4682683.00 frames. ], batch size: 56, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:12:20,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:12:24,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 00:12:24,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:12:28,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:12:30,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 00:12:30,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 00:12:30,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1069093.3333333333, ans=0.2 2023-10-03 00:12:33,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1069093.3333333333, ans=0.125 2023-10-03 00:12:35,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1069160.0, ans=0.125 2023-10-03 00:12:36,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:12:36,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:12:37,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:38,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.61 vs. limit=6.0 2023-10-03 00:12:39,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 00:12:42,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. 
Duration: 0.76 2023-10-03 00:12:42,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 00:12:43,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:12:45,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 00:12:46,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 00:12:46,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 00:12:48,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:12:48,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:12:53,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1069226.6666666667, ans=0.125 2023-10-03 00:12:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:13:00,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:13:01,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:01,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:13:02,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 00:13:02,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:13:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:13:04,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:13:05,673 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.01 vs. limit=15.0 2023-10-03 00:13:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 00:13:07,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 00:13:09,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 00:13:10,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1069293.3333333333, ans=0.035 2023-10-03 00:13:11,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 00:13:12,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1069293.3333333333, ans=0.125 2023-10-03 00:13:13,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 00:13:18,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:20,107 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.823e+02 1.971e+02 2.145e+02 3.229e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 00:13:20,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:13:20,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:21,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:13:24,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 00:13:26,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:13:26,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 00:13:28,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 00:13:28,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:13:28,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:13:31,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 00:13:32,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:13:35,567 INFO [train.py:1046] (1/4) Epoch 31, batch 1050, loss[loss=0.1803, simple_loss=0.2474, pruned_loss=0.05662, over 23786.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2419, pruned_loss=0.04271, over 4684596.04 frames. ], batch size: 164, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:13:35,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:13:37,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:13:38,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:13:41,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 00:13:42,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:43,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1069426.6666666667, ans=0.0 2023-10-03 00:13:44,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:13:47,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:13:48,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 00:13:49,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:13:51,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:13:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:13:52,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:13:53,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 00:13:55,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:13:55,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 00:13:56,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:13:56,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 00:13:58,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:14:04,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:14:04,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 00:14:04,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:14:08,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 00:14:08,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 00:14:08,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:14:10,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 00:14:13,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 00:14:13,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:15,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1069560.0, ans=0.2 2023-10-03 00:14:17,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.30 vs. limit=15.0 2023-10-03 00:14:17,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:14:19,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:14:20,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:14:20,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:14:26,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:14:28,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1069626.6666666667, ans=0.125 2023-10-03 00:14:30,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 00:14:30,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 00:14:31,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 00:14:31,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:14:31,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:14:34,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 00:14:37,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:14:39,817 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:14:40,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 00:14:40,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:14:40,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:14:42,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:45,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:45,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 00:14:46,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:14:46,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 00:14:46,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 00:14:47,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:14:49,218 INFO [train.py:1046] (1/4) Epoch 31, batch 1100, loss[loss=0.1621, simple_loss=0.2403, pruned_loss=0.04196, over 23461.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2409, pruned_loss=0.04221, over 4700626.84 frames. ], batch size: 134, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:14:50,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:14:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:14:56,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1069760.0, ans=0.0 2023-10-03 00:14:59,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:15:00,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:15:01,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:01,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 00:15:02,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:04,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:15:06,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:15:07,642 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.58 vs. limit=15.0 2023-10-03 00:15:10,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 00:15:10,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 00:15:11,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:15:12,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:12,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:15:14,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1069826.6666666667, ans=0.125 2023-10-03 00:15:15,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:15:16,134 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:15:17,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:15:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:15:21,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1069893.3333333333, ans=0.07 2023-10-03 00:15:21,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-10-03 00:15:24,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. 
Duration: 0.98 2023-10-03 00:15:26,058 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 00:15:26,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:29,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:29,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:15:30,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:15:32,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 00:15:32,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:15:32,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:15:32,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:15:33,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:33,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 00:15:41,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 00:15:41,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 00:15:44,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:15:48,045 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.833e+02 1.997e+02 2.295e+02 4.959e+02, threshold=3.994e+02, percent-clipped=1.0 2023-10-03 00:15:48,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:15:52,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 00:15:52,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 00:15:53,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:55,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:55,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:56,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 00:15:57,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:15:57,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:15:59,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 00:15:59,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:15:59,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 00:16:01,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:01,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:16:01,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:16:02,548 INFO [train.py:1046] (1/4) Epoch 31, batch 1150, loss[loss=0.1994, simple_loss=0.2596, pruned_loss=0.06958, over 19057.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2412, pruned_loss=0.04243, over 4698947.02 frames. ], batch size: 388, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:16:06,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:08,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:16:09,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1070093.3333333333, ans=0.07 2023-10-03 00:16:10,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:16:10,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:16:10,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 00:16:12,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:16:14,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 00:16:16,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:17,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:16:23,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 00:16:25,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:16:27,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:29,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:30,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 00:16:30,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 00:16:30,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:16:33,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 00:16:35,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:16:36,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:16:46,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:51,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:52,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 00:16:52,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:54,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:54,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1070293.3333333333, ans=0.125 2023-10-03 00:16:58,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1070293.3333333333, ans=0.125 2023-10-03 00:17:00,305 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. 
Duration: 0.768 2023-10-03 00:17:03,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:10,753 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 00:17:13,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:14,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:17:14,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:17:16,190 INFO [train.py:1046] (1/4) Epoch 31, batch 1200, loss[loss=0.1748, simple_loss=0.2425, pruned_loss=0.05353, over 23817.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2421, pruned_loss=0.04279, over 4708402.42 frames. ], batch size: 195, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:17:16,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:17:18,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:17:23,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:17:23,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:17:26,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:17:26,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:26,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:17:27,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:17:30,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:17:32,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:17:32,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:34,026 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 00:17:36,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 00:17:38,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:17:42,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:17:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:17:45,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:17:45,012 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 00:17:47,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:53,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:17:53,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:17:53,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 00:17:55,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:17:58,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 00:18:04,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 00:18:04,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:18:05,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:18:07,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:07,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 00:18:08,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:18:08,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:18:08,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:18:08,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 00:18:10,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:18:10,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:18:10,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:18:13,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:18:13,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:16,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:18:17,598 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.961e+02 2.154e+02 2.388e+02 3.166e+02, threshold=4.308e+02, percent-clipped=0.0 2023-10-03 00:18:17,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 00:18:21,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 00:18:24,007 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 00:18:24,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:18:24,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1070693.3333333333, ans=0.0 2023-10-03 00:18:26,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:18:28,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:18:29,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:18:31,674 INFO [train.py:1046] (1/4) Epoch 31, batch 1250, loss[loss=0.1571, simple_loss=0.237, pruned_loss=0.03861, over 24631.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2429, pruned_loss=0.04276, over 4715444.72 frames. ], batch size: 65, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:18:31,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 00:18:35,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:18:37,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 00:18:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 00:18:38,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:18:38,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:18:41,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:18:42,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:18:43,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2.whitening_limit, batch_count=1070760.0, ans=15.0 2023-10-03 00:18:43,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:18:43,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:18:46,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:18:49,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 00:18:49,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:18:49,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:18:49,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:18:51,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:18:52,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:52,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:18:57,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 00:18:57,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:19:01,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:19:03,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 00:19:03,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:19:03,292 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 00:19:03,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:04,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:19:05,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1070893.3333333333, ans=0.1 2023-10-03 00:19:08,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:19:10,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:19:10,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:19:12,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 00:19:12,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 00:19:12,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 00:19:15,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:19:16,496 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.83 vs. limit=15.0 2023-10-03 00:19:17,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 00:19:17,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:19:19,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 00:19:19,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:19:22,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 00:19:22,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:19:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:19:24,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:19:24,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:19:27,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 00:19:28,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:19:28,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:19:30,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:19:33,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-10-03 00:19:36,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:19:37,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 00:19:37,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1071026.6666666667, ans=0.0 2023-10-03 00:19:41,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:19:42,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:19:44,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:19:46,262 INFO [train.py:1046] (1/4) Epoch 31, batch 1300, loss[loss=0.1621, simple_loss=0.2441, pruned_loss=0.04006, over 23517.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2437, pruned_loss=0.04334, over 4698853.18 frames. ], batch size: 121, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:19:46,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:19:47,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 00:19:51,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 00:19:53,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:19:55,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 00:19:58,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:19:59,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1071160.0, ans=0.125 2023-10-03 00:20:03,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:03,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1071160.0, ans=0.125 2023-10-03 00:20:04,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:20:05,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:20:06,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:08,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:20:08,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 00:20:08,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 00:20:14,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:20:15,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:20:17,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 00:20:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:20:20,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:20:21,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:20:21,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1071226.6666666667, ans=0.125 2023-10-03 00:20:22,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 00:20:22,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:20:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 00:20:25,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:20:29,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:20:29,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:20:32,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1071293.3333333333, ans=0.2 2023-10-03 00:20:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 00:20:33,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1071293.3333333333, ans=0.1 2023-10-03 00:20:34,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 00:20:34,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 00:20:39,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:20:40,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1071293.3333333333, ans=0.2 2023-10-03 00:20:42,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 00:20:43,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:20:46,132 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.122e+02 2.414e+02 4.284e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-03 00:20:51,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 00:20:51,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1071360.0, ans=0.0 2023-10-03 00:20:54,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:20:56,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:20:58,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1071426.6666666667, ans=0.1 2023-10-03 00:21:00,450 INFO [train.py:1046] (1/4) Epoch 31, batch 1350, loss[loss=0.1709, simple_loss=0.253, pruned_loss=0.04443, over 24491.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.243, pruned_loss=0.04272, over 4708136.89 frames. ], batch size: 66, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:21:01,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:21:01,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:21:02,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:21:03,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 00:21:05,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1071426.6666666667, ans=0.1 2023-10-03 00:21:06,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:21:06,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 00:21:09,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:21:09,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:21:12,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 00:21:12,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:21:15,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:21:15,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 00:21:16,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 00:21:18,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 00:21:19,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:21:19,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 00:21:29,138 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:21:31,585 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.96 vs. limit=12.0 2023-10-03 00:21:32,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:40,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:40,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:21:40,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 00:21:45,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:21:46,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 00:21:46,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:21:46,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:21:49,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:21:52,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 00:21:53,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:21:58,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 00:22:01,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 00:22:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 00:22:07,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:22:10,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:22:11,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:22:14,814 INFO [train.py:1046] (1/4) Epoch 31, batch 1400, loss[loss=0.154, simple_loss=0.2185, pruned_loss=0.04472, over 23500.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2412, pruned_loss=0.04293, over 4708228.13 frames. ], batch size: 285, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:22:16,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-10-03 00:22:16,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1071760.0, ans=0.2 2023-10-03 00:22:17,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 00:22:17,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1071760.0, ans=0.125 2023-10-03 00:22:25,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:22:28,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:22:32,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:22:32,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:22:36,752 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.01 vs. limit=15.0 2023-10-03 00:22:37,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:22:38,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 00:22:47,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:22:48,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:22:51,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 00:22:52,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:22:53,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:22:54,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:22:54,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:22:56,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:22:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:22:57,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:22:58,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 00:22:58,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:23:02,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:05,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 00:23:11,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 00:23:12,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 00:23:13,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:23:15,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1072026.6666666667, ans=10.0 2023-10-03 00:23:16,200 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.848e+02 2.033e+02 2.325e+02 3.935e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 00:23:16,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 00:23:16,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:19,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:23:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:23:23,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:23:23,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:23:23,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 00:23:29,046 INFO [train.py:1046] (1/4) Epoch 31, batch 1450, loss[loss=0.1702, simple_loss=0.2605, pruned_loss=0.03992, over 24620.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2413, pruned_loss=0.04269, over 4714081.94 frames. ], batch size: 68, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:23:29,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:29,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:23:30,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:23:30,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 00:23:32,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:23:34,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 00:23:34,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:35,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:35,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. 
Duration: 0.83 2023-10-03 00:23:37,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:23:38,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:23:38,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 00:23:38,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:41,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:23:41,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1072093.3333333333, ans=0.0 2023-10-03 00:23:42,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:44,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1072160.0, ans=0.125 2023-10-03 00:23:46,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:49,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:23:49,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:23:50,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:50,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:51,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.16 vs. limit=15.0 2023-10-03 00:23:53,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:53,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:23:53,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:54,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:23:59,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 00:24:02,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:24:05,180 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 00:24:05,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1072226.6666666667, ans=0.2 2023-10-03 00:24:08,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:24:10,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:24:10,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:11,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 00:24:12,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1072226.6666666667, ans=0.2 2023-10-03 00:24:14,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:16,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 00:24:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 00:24:19,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:21,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1072293.3333333333, ans=0.125 2023-10-03 00:24:22,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:24:22,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:24:23,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. 
Duration: 0.9 2023-10-03 00:24:25,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 00:24:26,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 00:24:26,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1072293.3333333333, ans=0.125 2023-10-03 00:24:27,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:27,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:24:28,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.37 vs. limit=15.0 2023-10-03 00:24:41,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 00:24:41,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:24:41,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:24:43,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:43,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 00:24:44,685 INFO [train.py:1046] (1/4) Epoch 31, batch 1500, loss[loss=0.1532, simple_loss=0.247, pruned_loss=0.02967, over 24338.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2415, pruned_loss=0.04236, over 4716892.28 frames. ], batch size: 74, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:24:44,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:24:44,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 00:24:47,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:24:47,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:24:47,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:24:48,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:24:51,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:24:52,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:24:53,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1072426.6666666667, ans=0.0 2023-10-03 00:24:56,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:24:56,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 00:24:56,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:24:56,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:24:57,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:25:02,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 00:25:06,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 00:25:08,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:25:09,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 00:25:11,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:25:13,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1072560.0, ans=0.125 2023-10-03 00:25:14,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:25:14,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:25:16,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:25:17,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 00:25:17,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:25:17,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:25:17,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 00:25:18,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:25:24,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:25:24,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 00:25:29,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:25:30,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:25:35,337 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 00:25:35,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:25:35,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 00:25:38,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:25:39,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:25:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 00:25:41,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:25:43,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1072693.3333333333, ans=10.0 2023-10-03 00:25:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 00:25:46,100 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.898e+02 2.100e+02 2.437e+02 3.214e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-03 00:25:46,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:49,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:25:49,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:49,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:25:50,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:50,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:25:53,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 00:25:55,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 00:25:55,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:25:55,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 00:25:56,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 00:25:59,109 INFO [train.py:1046] (1/4) Epoch 31, batch 1550, loss[loss=0.1777, simple_loss=0.2619, pruned_loss=0.0468, over 23958.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2433, pruned_loss=0.04318, over 4715115.46 frames. ], batch size: 86, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:26:00,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:26:00,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:01,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:26:01,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:26:03,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:03,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:08,033 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 00:26:08,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:08,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:26:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:26:10,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:26:12,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 00:26:12,877 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.47 vs. limit=15.0 2023-10-03 00:26:14,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:26:14,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-03 00:26:15,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 00:26:15,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 00:26:16,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:17,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1072826.6666666667, ans=0.0 2023-10-03 00:26:18,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:22,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:26:24,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 00:26:24,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 00:26:27,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1072893.3333333333, ans=0.125 2023-10-03 00:26:28,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1072893.3333333333, ans=0.0 2023-10-03 00:26:33,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:37,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:26:37,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:26:37,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:26:37,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 00:26:43,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:26:44,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:46,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:26:50,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:26:52,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:52,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 00:26:52,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:26:54,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:26:54,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:26:54,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 00:26:54,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 00:26:57,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:01,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 00:27:05,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:27:07,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:07,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 00:27:09,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:27:11,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:27:11,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:27:11,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:27:11,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 00:27:13,103 INFO [train.py:1046] (1/4) Epoch 31, batch 1600, loss[loss=0.1837, simple_loss=0.2555, pruned_loss=0.05599, over 23413.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2438, pruned_loss=0.04316, over 4713763.15 frames. ], batch size: 285, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:27:14,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:16,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 00:27:16,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 00:27:19,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 00:27:22,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:27:22,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1073093.3333333333, ans=0.125 2023-10-03 00:27:23,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 00:27:23,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:27:26,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:27:31,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 00:27:35,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 00:27:37,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:27:37,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 00:27:37,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:38,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 00:27:40,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1073226.6666666667, ans=0.125 2023-10-03 00:27:45,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 00:27:51,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:53,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 00:27:53,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:53,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:27:53,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 00:27:56,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 00:27:59,096 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:28:00,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 00:28:03,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:28:03,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1073293.3333333333, ans=0.2 2023-10-03 00:28:04,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:04,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:04,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:28:06,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:28:07,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:28:09,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 00:28:09,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1073293.3333333333, ans=0.125 2023-10-03 00:28:10,908 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:28:13,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.827e+02 1.985e+02 2.199e+02 3.882e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 00:28:14,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:16,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:28:18,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 00:28:18,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:28:18,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 00:28:18,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1073360.0, ans=0.125 2023-10-03 00:28:19,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1073360.0, ans=0.2 2023-10-03 00:28:22,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1073360.0, ans=0.125 2023-10-03 00:28:25,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:28:27,289 INFO [train.py:1046] (1/4) Epoch 31, batch 1650, loss[loss=0.1578, simple_loss=0.2412, pruned_loss=0.03722, over 24327.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2439, pruned_loss=0.04356, over 4710675.25 frames. ], batch size: 61, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:28:27,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:28:27,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:28:27,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 00:28:27,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 00:28:27,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 00:28:28,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 00:28:33,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:34,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:28:34,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:28:35,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1073426.6666666667, ans=10.0 2023-10-03 00:28:36,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:28:38,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:28:40,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 00:28:41,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:28:43,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:28:43,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:28:43,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:28:43,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1073493.3333333333, ans=0.1 2023-10-03 00:28:44,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 00:28:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-10-03 00:28:49,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:28:51,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:29:00,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 00:29:00,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:02,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 00:29:04,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:07,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:29:07,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:29:07,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:07,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1073560.0, ans=0.125 2023-10-03 00:29:10,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:29:10,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:29:12,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:14,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:14,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:29:14,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:29:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:29:16,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:29:19,629 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.21 vs. limit=15.0 2023-10-03 00:29:21,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:29:21,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 00:29:23,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:29:23,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 00:29:23,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. 
Duration: 0.69 2023-10-03 00:29:25,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 00:29:25,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:29:26,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:29:26,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:26,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 00:29:31,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:31,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1073693.3333333333, ans=0.1 2023-10-03 00:29:32,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:29:34,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:35,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 00:29:40,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:29:40,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:29:41,380 INFO [train.py:1046] (1/4) Epoch 31, batch 1700, loss[loss=0.1631, simple_loss=0.245, pruned_loss=0.04063, over 24667.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2431, pruned_loss=0.04291, over 4710496.47 frames. ], batch size: 65, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:29:41,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 00:29:41,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:29:41,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:29:41,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:42,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:29:43,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:29:44,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 00:29:46,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:29:54,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:29:58,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:29:58,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1073826.6666666667, ans=0.0 2023-10-03 00:30:02,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:30:02,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:30:03,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:30:05,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:30:06,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1073826.6666666667, ans=0.125 2023-10-03 00:30:09,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 00:30:09,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1073893.3333333333, ans=0.125 2023-10-03 00:30:09,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1073893.3333333333, ans=0.2 2023-10-03 00:30:11,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 00:30:11,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:30:14,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:30:15,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 00:30:16,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 00:30:18,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:18,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 00:30:20,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:30:29,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:29,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:31,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:30:32,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 00:30:33,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 00:30:33,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:30:35,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 00:30:36,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:30:36,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:30:36,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:36,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:30:39,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:30:39,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:30:39,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:41,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 00:30:41,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:42,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.835e+02 2.052e+02 2.305e+02 3.998e+02, threshold=4.103e+02, percent-clipped=1.0 2023-10-03 00:30:45,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:30:47,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 00:30:50,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:50,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1074026.6666666667, ans=0.0 2023-10-03 00:30:51,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:30:51,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1074026.6666666667, ans=0.125 2023-10-03 00:30:53,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1074026.6666666667, ans=0.125 2023-10-03 00:30:54,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 00:30:55,839 INFO [train.py:1046] (1/4) Epoch 31, batch 1750, loss[loss=0.1435, simple_loss=0.2221, pruned_loss=0.03244, over 24596.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2429, pruned_loss=0.04257, over 4725085.06 frames. 
], batch size: 60, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:30:57,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:01,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:01,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:31:02,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 00:31:02,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:31:02,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1074093.3333333333, ans=0.0 2023-10-03 00:31:03,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=15.0 2023-10-03 00:31:05,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:31:05,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:08,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.52 vs. limit=15.0 2023-10-03 00:31:09,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. 
Duration: 0.43 2023-10-03 00:31:11,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1074160.0, ans=0.125 2023-10-03 00:31:12,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:13,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 00:31:13,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:31:15,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:31:17,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1074160.0, ans=0.125 2023-10-03 00:31:19,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:31:20,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 00:31:20,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1074160.0, ans=0.2 2023-10-03 00:31:21,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:31:23,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 00:31:31,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 00:31:34,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1074226.6666666667, ans=0.125 2023-10-03 00:31:35,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:31:35,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:31:38,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:38,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:31:41,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:31:42,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:45,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:31:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:31:46,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 00:31:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 00:31:50,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 00:31:50,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:31:53,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:53,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:31:57,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:31:59,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 00:32:00,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:32:00,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:32:06,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:32:07,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:32:09,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:32:09,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1074426.6666666667, ans=0.125 2023-10-03 00:32:10,693 INFO [train.py:1046] (1/4) Epoch 31, batch 1800, loss[loss=0.1492, simple_loss=0.2405, pruned_loss=0.02892, over 24515.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2422, pruned_loss=0.04211, over 4732766.30 frames. ], batch size: 71, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:32:10,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 00:32:10,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:32:12,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:32:12,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:12,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:32:12,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:32:12,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:32:12,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1074426.6666666667, ans=0.07 2023-10-03 00:32:16,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 00:32:17,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:32:19,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:32:22,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1074426.6666666667, ans=0.1 2023-10-03 00:32:23,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:32:26,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:32:26,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:32:26,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1074493.3333333333, ans=0.09899494936611666 2023-10-03 00:32:28,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1074493.3333333333, ans=0.2 2023-10-03 00:32:28,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1074493.3333333333, ans=0.125 2023-10-03 00:32:29,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:32:31,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:32:32,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:34,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:32:37,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:32:37,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 00:32:37,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:39,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:43,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 00:32:45,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 00:32:45,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 00:32:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:32:46,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:46,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:32:49,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:32:50,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1074560.0, ans=0.0 2023-10-03 00:32:52,988 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 00:32:54,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:32:56,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:59,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 00:32:59,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 00:33:00,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:33:00,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:33:01,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:33:06,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. 
Duration: 0.67 2023-10-03 00:33:10,966 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.922e+02 2.138e+02 2.507e+02 4.896e+02, threshold=4.277e+02, percent-clipped=2.0 2023-10-03 00:33:12,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:33:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 00:33:13,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:33:13,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:33:13,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:33:13,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 00:33:16,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1074693.3333333333, ans=0.125 2023-10-03 00:33:18,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:33:18,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:33:20,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 00:33:20,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:33:20,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1074693.3333333333, ans=0.125 2023-10-03 00:33:20,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1074693.3333333333, ans=0.0 2023-10-03 00:33:22,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:33:22,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:33:24,466 INFO [train.py:1046] (1/4) Epoch 31, batch 1850, loss[loss=0.1415, simple_loss=0.2241, pruned_loss=0.0294, over 24293.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2424, pruned_loss=0.04217, over 4731151.38 frames. ], batch size: 61, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:33:24,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:33:24,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:33:25,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:33:27,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:33:27,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:33:30,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 00:33:30,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:33:36,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:33:36,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 00:33:39,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 00:33:42,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 00:33:46,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:33:46,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 00:33:46,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 00:33:58,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:33:59,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 00:33:59,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1074893.3333333333, ans=0.0 2023-10-03 00:34:02,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:34:03,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1074893.3333333333, ans=0.0 2023-10-03 00:34:04,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:34:07,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 00:34:07,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:07,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:34:09,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:34:11,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:34:14,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:34:15,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:34:17,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:17,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-03 00:34:17,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:18,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:34:21,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:34:23,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 00:34:23,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:34:28,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:34:29,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:34:29,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 00:34:29,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 00:34:30,841 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 00:34:32,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 00:34:33,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 00:34:33,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:34:33,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:34:33,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:35,396 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 00:34:35,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:34:35,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:37,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:34:37,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:34:38,579 INFO [train.py:1046] (1/4) Epoch 31, batch 1900, loss[loss=0.1531, simple_loss=0.2397, pruned_loss=0.03326, over 24491.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2425, pruned_loss=0.0421, over 4731987.46 frames. ], batch size: 63, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:34:38,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:34:38,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. 
Duration: 0.99 2023-10-03 00:34:41,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:41,452 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 00:34:41,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:34:42,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:48,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:49,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:34:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 00:34:51,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 00:34:53,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:34:53,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:34:54,560 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 00:34:54,593 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 00:34:57,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-10-03 00:34:59,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:35:02,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.61 vs. limit=15.0 2023-10-03 00:35:03,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 00:35:05,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 00:35:06,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1075226.6666666667, ans=0.0 2023-10-03 00:35:14,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 00:35:17,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 00:35:17,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:35:17,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1075226.6666666667, ans=0.05 2023-10-03 00:35:18,271 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 00:35:18,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 00:35:18,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. 
Duration: 0.98 2023-10-03 00:35:19,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 00:35:19,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:35:24,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 00:35:26,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.59 vs. limit=15.0 2023-10-03 00:35:27,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:35:30,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:35:30,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 00:35:32,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:35:37,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 00:35:39,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 2.002e+02 2.260e+02 2.885e+02 4.012e+02, threshold=4.521e+02, percent-clipped=0.0 2023-10-03 00:35:39,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 00:35:42,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-10-03 00:35:42,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:35:42,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:35:42,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:35:44,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:35:45,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:35:45,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:35:45,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:35:48,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:35:48,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:35:51,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 00:35:51,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:35:52,438 INFO [train.py:1046] (1/4) Epoch 31, batch 1950, loss[loss=0.148, simple_loss=0.2283, pruned_loss=0.03387, over 24483.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2432, pruned_loss=0.04243, over 4735944.89 frames. ], batch size: 66, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:35:52,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:35:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:35:55,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:35:56,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1075426.6666666667, ans=0.0 2023-10-03 00:35:58,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:35:58,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:35:58,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:36:01,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 00:36:01,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-03 00:36:01,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:03,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:05,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:36:06,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:06,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:08,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:36:10,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:36:10,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:36:10,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:36:10,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:13,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:17,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 00:36:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:17,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:36:17,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 00:36:19,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:36:19,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:36:20,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:25,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:28,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:36:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:36:35,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:36:35,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:36:37,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-10-03 00:36:37,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:36:42,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:36:42,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1075626.6666666667, ans=0.1 2023-10-03 00:36:43,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:36:43,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:36:52,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:52,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:54,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:56,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:59,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:36:59,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:36:59,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 00:36:59,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:36:59,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:37:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 00:37:01,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1075693.3333333333, ans=0.125 2023-10-03 00:37:02,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:37:02,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1075693.3333333333, ans=0.1 2023-10-03 00:37:05,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:37:07,733 INFO [train.py:1046] (1/4) Epoch 31, batch 2000, loss[loss=0.1696, simple_loss=0.2537, pruned_loss=0.04278, over 23436.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2435, pruned_loss=0.04269, over 4728391.69 frames. ], batch size: 93, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:37:07,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:37:07,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:37:10,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:37:12,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:15,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 00:37:16,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:37:20,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:37:21,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 00:37:23,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:37:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:37:24,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:37:26,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 00:37:28,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:37:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 00:37:31,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:37:32,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 00:37:32,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:37:35,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:37:37,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:37:37,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:37,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:37:39,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:37:40,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 00:37:43,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. 
Duration: 0.97 2023-10-03 00:37:43,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:37:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:37:44,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.whiten.whitening_limit, batch_count=1075893.3333333333, ans=12.0 2023-10-03 00:37:48,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:49,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:37:49,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:37:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:37:51,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:37:53,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:53,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:37:54,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:37:54,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:57,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:37:59,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 00:38:03,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:38:03,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1075960.0, ans=0.0 2023-10-03 00:38:04,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:08,528 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.852e+02 2.094e+02 2.367e+02 3.575e+02, threshold=4.187e+02, percent-clipped=0.0 2023-10-03 00:38:08,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:08,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:38:11,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:13,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:38:13,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:14,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:38:14,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:38:17,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:17,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:21,445 INFO [train.py:1046] (1/4) Epoch 31, batch 2050, loss[loss=0.1753, simple_loss=0.2632, pruned_loss=0.04365, over 23986.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2428, pruned_loss=0.04297, over 4724851.34 frames. ], batch size: 80, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:38:21,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:38:22,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:24,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1076093.3333333333, ans=0.125 2023-10-03 00:38:29,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:38:30,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 00:38:31,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:33,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:38:34,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 00:38:34,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:38:34,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1076160.0, ans=0.5 2023-10-03 00:38:36,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:38:37,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:38:37,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1076160.0, ans=0.0 2023-10-03 00:38:46,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:38:46,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:46,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1076160.0, ans=0.125 2023-10-03 00:38:48,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. 
Duration: 0.72 2023-10-03 00:38:50,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:51,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 00:38:51,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:38:54,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:38:57,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:38:59,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:38:59,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:39:00,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:39:02,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:39:02,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:39:03,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:05,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 00:39:06,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:39:09,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:39:13,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:39:15,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:39:16,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 00:39:22,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:39:22,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:39:25,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:39:26,355 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.84 vs. limit=12.0 2023-10-03 00:39:27,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 00:39:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 00:39:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:39:31,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:31,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:39:32,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:39:33,732 INFO [train.py:1046] (1/4) Epoch 31, batch 2100, loss[loss=0.1652, simple_loss=0.2299, pruned_loss=0.05021, over 23642.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2412, pruned_loss=0.04249, over 4708462.89 frames. ], batch size: 256, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:39:33,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 00:39:33,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 00:39:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:39:39,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:39:40,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:39:41,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1076426.6666666667, ans=0.125 2023-10-03 00:39:42,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:39:43,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:39:43,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 00:39:43,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:39:43,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 00:39:43,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 00:39:45,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1076426.6666666667, ans=0.125 2023-10-03 00:39:46,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:39:46,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:39:46,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 00:39:46,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 00:39:48,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1076493.3333333333, ans=0.125 2023-10-03 00:39:49,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.84 vs. limit=15.0 2023-10-03 00:39:51,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 00:39:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:39:54,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:39:56,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:57,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1076493.3333333333, ans=0.0 2023-10-03 00:39:58,775 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.90 vs. limit=22.5 2023-10-03 00:39:59,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:39:59,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 00:40:00,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:40:00,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:40:02,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 00:40:02,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:02,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 00:40:03,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 00:40:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 00:40:07,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:40:09,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:40:12,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:40:13,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:40:14,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:15,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:40:15,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 00:40:16,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:16,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:16,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:16,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 00:40:18,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 00:40:19,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 00:40:25,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:40:28,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:40:28,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 00:40:35,497 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.012e+02 2.254e+02 2.840e+02 4.737e+02, threshold=4.507e+02, percent-clipped=3.0 2023-10-03 00:40:35,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:40:37,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:40:37,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:40:37,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:40:38,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 00:40:38,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:40:41,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:41,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:40:41,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:40:41,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:45,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 00:40:46,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 00:40:46,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:40:47,879 INFO [train.py:1046] (1/4) Epoch 31, batch 2150, loss[loss=0.1503, simple_loss=0.2352, pruned_loss=0.03267, over 24473.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2414, pruned_loss=0.04234, over 4724778.86 frames. ], batch size: 63, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:40:48,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1076760.0, ans=0.2 2023-10-03 00:40:49,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:40:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:40:50,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:40:50,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1076760.0, ans=0.2 2023-10-03 00:40:54,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:40:56,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:40:57,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:58,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 00:40:58,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:40:58,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:41:02,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:03,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:41:03,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:41:03,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1076826.6666666667, ans=0.0 2023-10-03 00:41:06,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:06,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 00:41:07,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=1076826.6666666667, ans=22.5 2023-10-03 00:41:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:11,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:41:12,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:41:12,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:12,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:13,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:41:14,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:41:14,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:41:16,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:41:19,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 00:41:20,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:41:21,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:21,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 00:41:22,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1076893.3333333333, ans=0.125 2023-10-03 00:41:23,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:41:23,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:41:24,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1076893.3333333333, ans=0.125 2023-10-03 00:41:26,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:26,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:41:28,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:28,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 00:41:28,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:41:31,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1076960.0, ans=0.0 2023-10-03 00:41:32,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:41:32,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:33,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:34,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:41:36,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:37,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:37,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 00:41:37,833 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:41:40,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 00:41:40,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:41:40,425 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 00:41:40,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:40,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 00:41:42,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 00:41:42,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:41:42,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 00:41:42,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 00:41:42,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 00:41:42,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 00:41:44,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:45,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:41:45,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:41:45,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:47,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:41:49,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:41:49,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:49,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1077026.6666666667, ans=0.125 2023-10-03 00:41:54,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:41:55,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 00:41:56,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1077026.6666666667, ans=0.05 2023-10-03 00:42:00,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:42:01,959 INFO [train.py:1046] (1/4) Epoch 31, batch 2200, loss[loss=0.1683, simple_loss=0.2406, pruned_loss=0.04794, over 23391.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2419, pruned_loss=0.04277, over 4721991.11 frames. ], batch size: 285, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:42:02,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1077093.3333333333, ans=0.125 2023-10-03 00:42:04,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1077093.3333333333, ans=0.035 2023-10-03 00:42:06,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:06,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 00:42:06,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:07,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:42:07,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1077093.3333333333, ans=0.125 2023-10-03 00:42:10,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:42:10,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:42:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 00:42:16,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 00:42:18,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:42:20,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.77 vs. limit=15.0 2023-10-03 00:42:24,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 00:42:27,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:27,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 00:42:28,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:42:31,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.21 vs. limit=15.0 2023-10-03 00:42:31,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:42:32,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 00:42:35,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:42:37,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:37,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 00:42:40,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:42:41,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:42:43,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1077226.6666666667, ans=0.0 2023-10-03 00:42:45,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:42:47,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:49,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 00:42:49,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:50,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 00:42:52,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:52,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:42:52,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:55,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:42:55,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:42:55,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:55,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:58,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 00:42:58,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:42:59,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:43:03,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:43:04,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:43:05,899 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.847e+02 1.989e+02 2.183e+02 3.187e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-03 00:43:08,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:43:08,617 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 00:43:10,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1077360.0, ans=0.2 2023-10-03 00:43:11,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:43:11,910 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 00:43:13,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 00:43:13,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1077360.0, ans=0.5 2023-10-03 00:43:13,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1077360.0, ans=0.125 2023-10-03 00:43:14,636 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 00:43:15,960 INFO [train.py:1046] (1/4) Epoch 31, batch 2250, loss[loss=0.1591, simple_loss=0.2389, pruned_loss=0.03963, over 23741.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.242, pruned_loss=0.04268, over 4732865.00 frames. ], batch size: 135, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:43:16,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:43:17,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:43:19,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:43:20,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 00:43:22,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:43:24,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:43:30,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 00:43:31,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:43:35,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:35,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:43:36,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:43:39,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 00:43:39,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:43:40,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:43:42,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 00:43:42,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:43:42,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:43,293 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.06 vs. 
limit=15.0 2023-10-03 00:43:43,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:43:48,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:43:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 00:43:50,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:43:52,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 00:43:53,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:54,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:43:59,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:44:01,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:44:02,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:02,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:44:03,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:44:04,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1077626.6666666667, ans=0.125 2023-10-03 00:44:05,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:44:09,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:44:13,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:44:17,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:44:17,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:44:17,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:44:23,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:44:26,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:44:26,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 00:44:26,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:26,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 00:44:28,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1077693.3333333333, ans=10.0 2023-10-03 00:44:28,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1077693.3333333333, ans=0.09899494936611666 2023-10-03 00:44:29,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 00:44:30,743 INFO [train.py:1046] (1/4) Epoch 31, batch 2300, loss[loss=0.1758, simple_loss=0.2564, pruned_loss=0.04755, over 23430.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2422, pruned_loss=0.04268, over 4734014.33 frames. ], batch size: 285, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:44:32,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:44:32,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:35,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.57 vs. limit=15.0 2023-10-03 00:44:39,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:41,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:44:42,487 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 00:44:43,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:44:51,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:44:51,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:44:51,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:44:51,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:51,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 00:44:52,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:44:55,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:44:57,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:44:59,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1077893.3333333333, ans=0.125 2023-10-03 00:45:00,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:45:03,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 00:45:06,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:45:09,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:45:10,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:45:12,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:45:14,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:45:19,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:45:20,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:45:21,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:45:21,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 00:45:25,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:45:25,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1077960.0, ans=0.0 2023-10-03 00:45:26,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:45:26,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:45:26,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:45:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:45:28,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 00:45:28,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:45:30,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 00:45:30,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:45:30,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:45:30,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 00:45:34,149 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.897e+02 2.094e+02 2.362e+02 4.130e+02, threshold=4.187e+02, percent-clipped=1.0 2023-10-03 00:45:35,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:45:38,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:45:40,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1078026.6666666667, ans=0.125 2023-10-03 00:45:41,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:45:43,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:45:43,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:45:44,638 INFO [train.py:1046] (1/4) Epoch 31, batch 2350, loss[loss=0.1711, simple_loss=0.2458, pruned_loss=0.04817, over 23381.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2425, pruned_loss=0.04268, over 4735511.34 frames. ], batch size: 106, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:45:44,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:45:44,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:45:44,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:45:46,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 00:45:46,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1078093.3333333333, ans=0.125 2023-10-03 00:45:53,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 00:45:53,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 00:45:57,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-10-03 00:45:58,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 00:46:02,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:46:05,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:05,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:05,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:46:05,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:46:06,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.55 vs. limit=6.0 2023-10-03 00:46:06,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 00:46:09,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 00:46:13,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 00:46:14,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1078226.6666666667, ans=0.0 2023-10-03 00:46:15,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:46:19,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:46:20,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:46:21,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:46:24,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 00:46:25,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:46:26,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:46:26,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:46:26,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:46:31,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 00:46:32,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 00:46:32,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:46:32,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1078293.3333333333, ans=0.125 2023-10-03 00:46:35,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:35,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:46:37,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 00:46:38,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:46:39,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 00:46:39,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:46:44,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 00:46:47,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 00:46:47,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:46:47,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:46:47,426 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 00:46:47,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1078360.0, ans=0.0 2023-10-03 00:46:47,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1078360.0, ans=0.125 2023-10-03 00:46:48,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 00:46:52,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 00:46:52,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1078360.0, ans=0.125 2023-10-03 00:46:56,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:46:59,198 INFO [train.py:1046] (1/4) Epoch 31, batch 2400, loss[loss=0.1493, simple_loss=0.2308, pruned_loss=0.03388, over 24566.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2428, pruned_loss=0.04302, over 4729153.91 frames. ], batch size: 60, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:47:01,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:47:04,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 00:47:05,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:47:06,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 00:47:06,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 00:47:08,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1078426.6666666667, ans=0.1 2023-10-03 00:47:13,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:47:15,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:47:16,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 00:47:16,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:47:18,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:19,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 00:47:22,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:27,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-03 00:47:30,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:47:34,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 00:47:37,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:47:39,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:42,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:47:43,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 00:47:43,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:47:47,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.19 vs. limit=22.5 2023-10-03 00:47:50,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:47:53,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:47:56,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:47:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 00:47:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:47:57,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:47:57,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:47:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:47:57,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:48:00,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1078693.3333333333, ans=0.125 2023-10-03 00:48:00,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1078693.3333333333, ans=0.125 2023-10-03 00:48:02,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:48:03,809 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.077e+02 2.404e+02 3.282e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-03 00:48:03,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:48:03,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. 
Duration: 0.74 2023-10-03 00:48:04,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 00:48:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:48:06,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:48:08,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 00:48:08,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 00:48:09,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 00:48:09,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 00:48:10,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 00:48:10,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:48:12,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:12,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:48:13,468 INFO [train.py:1046] (1/4) Epoch 31, batch 2450, loss[loss=0.1799, simple_loss=0.2665, pruned_loss=0.04666, over 24004.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2414, pruned_loss=0.04253, over 4738230.96 frames. 
], batch size: 80, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:48:13,597 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 00:48:14,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:15,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:48:18,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:48:18,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:48:22,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:22,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:24,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 00:48:27,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:48:27,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:27,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1078826.6666666667, ans=0.0 2023-10-03 00:48:31,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 00:48:31,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:48:31,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:48:31,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 00:48:34,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:35,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1078826.6666666667, ans=0.2 2023-10-03 00:48:37,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:48:37,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:48:40,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:48:40,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:48:42,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:48:42,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:45,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. 
Duration: 0.98 2023-10-03 00:48:45,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1078893.3333333333, ans=0.125 2023-10-03 00:48:48,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:48:56,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:58,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:58,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:48:58,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:48:59,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:49:01,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 00:49:02,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:49:04,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 00:49:08,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:49:08,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:49:12,987 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:49:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:49:14,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 00:49:15,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:49:15,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:49:15,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 00:49:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:49:17,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:49:21,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:49:23,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:49:24,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:49:27,730 INFO [train.py:1046] (1/4) Epoch 31, batch 2500, loss[loss=0.1738, simple_loss=0.2499, pruned_loss=0.04884, over 23260.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2415, pruned_loss=0.04227, over 4737278.92 frames. ], batch size: 93, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:49:27,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 00:49:29,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:49:32,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:49:38,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-03 00:49:41,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:49:41,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:49:43,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:49:43,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 00:49:49,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 00:49:50,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:49:50,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:49:50,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 00:49:52,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 00:49:52,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1079160.0, ans=0.1 2023-10-03 00:49:54,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:49:55,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:49:56,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 00:49:56,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:49:56,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 00:49:56,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:01,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:50:02,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:50:05,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:50:05,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 00:50:07,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:50:08,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:50:12,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:15,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:20,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:50:24,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:50:27,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 00:50:29,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:50:29,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. 
Duration: 0.42725 2023-10-03 00:50:29,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1079360.0, ans=0.1 2023-10-03 00:50:31,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:50:31,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:50:31,129 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 00:50:31,129 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 00:50:31,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 00:50:32,335 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.760e+02 1.910e+02 2.096e+02 3.347e+02, threshold=3.821e+02, percent-clipped=0.0 2023-10-03 00:50:35,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:50:36,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 00:50:36,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 00:50:36,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 00:50:36,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1079360.0, ans=0.07 2023-10-03 00:50:38,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 00:50:40,006 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:50:42,254 INFO [train.py:1046] (1/4) Epoch 31, batch 2550, loss[loss=0.1601, simple_loss=0.25, pruned_loss=0.03507, over 24656.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2418, pruned_loss=0.04197, over 4724282.38 frames. ], batch size: 68, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:50:42,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 00:50:42,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1079426.6666666667, ans=0.125 2023-10-03 00:50:45,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:50:46,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:50:46,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:50:48,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:50:48,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. 
Duration: 0.51 2023-10-03 00:50:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:50:52,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 00:50:54,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:50:54,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1079426.6666666667, ans=0.05 2023-10-03 00:50:55,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:58,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:50:58,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 00:50:58,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:50:58,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:51:00,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:51:03,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:51:03,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. 
Duration: 0.94 2023-10-03 00:51:04,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:51:04,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:04,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 00:51:14,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:51:19,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:51:19,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:51:21,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:51:25,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:51:29,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:51:29,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:51:30,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 00:51:31,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:51:31,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:51:34,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:51:35,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:39,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:51:39,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 00:51:39,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:51:39,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:40,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:51:41,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:51:42,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:51:50,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:51:51,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:51:54,507 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 00:51:57,159 INFO [train.py:1046] (1/4) Epoch 31, batch 2600, loss[loss=0.1573, simple_loss=0.2432, pruned_loss=0.03571, over 24433.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2426, pruned_loss=0.04283, over 4712351.42 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:51:57,288 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 00:51:57,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:51:58,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 00:51:59,642 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.97 vs. limit=10.0 2023-10-03 00:52:00,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 00:52:00,179 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 00:52:02,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:52:02,278 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 00:52:05,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. 
Duration: 0.97 2023-10-03 00:52:06,441 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 00:52:07,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:52:09,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 00:52:09,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 00:52:10,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:52:12,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 00:52:13,868 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 00:52:13,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 00:52:20,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1079826.6666666667, ans=0.0 2023-10-03 00:52:21,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:52:21,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:21,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:52:21,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 00:52:23,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:52:28,754 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 00:52:29,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1079893.3333333333, ans=0.125 2023-10-03 00:52:33,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1079893.3333333333, ans=0.1 2023-10-03 00:52:34,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:34,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:52:36,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 00:52:37,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:52:37,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:52:37,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 00:52:41,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 00:52:41,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:52:43,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:52:46,526 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 00:52:46,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:52:46,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:52:52,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:52:53,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:52:54,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 00:52:54,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:56,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:52:56,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 00:52:56,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1080026.6666666667, ans=0.125 2023-10-03 00:53:02,169 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.914e+02 2.104e+02 2.405e+02 3.485e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-03 00:53:02,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 00:53:02,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1080026.6666666667, ans=0.125 2023-10-03 00:53:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:05,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:53:06,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1080026.6666666667, ans=0.0 2023-10-03 00:53:06,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1080026.6666666667, ans=0.125 2023-10-03 00:53:08,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 00:53:08,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 00:53:09,633 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 00:53:09,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:12,212 INFO [train.py:1046] (1/4) Epoch 31, batch 2650, loss[loss=0.1681, simple_loss=0.2512, pruned_loss=0.04253, over 24661.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2427, pruned_loss=0.0432, over 4714406.42 frames. ], batch size: 68, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:53:12,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:15,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:53:17,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:53:18,391 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.84 vs. limit=15.0 2023-10-03 00:53:19,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:53:21,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 00:53:21,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:53:21,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 00:53:21,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1080093.3333333333, ans=0.1 2023-10-03 00:53:22,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1080093.3333333333, ans=0.2 2023-10-03 00:53:25,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 00:53:25,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 00:53:27,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:53:29,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 00:53:31,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:53:31,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 00:53:31,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1080160.0, ans=0.0 2023-10-03 00:53:34,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:34,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 00:53:34,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 00:53:34,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:53:39,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 00:53:39,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 00:53:41,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:53:44,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 00:53:44,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:46,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:53:46,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:53:46,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:46,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:53:48,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:51,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:53:52,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:53,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:53:54,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:53:56,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:53:57,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:53:59,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:00,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:54:00,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:54:04,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:05,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:54:05,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:06,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-10-03 00:54:12,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:54:14,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:16,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:16,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:16,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:54:17,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:17,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1080360.0, ans=0.125 2023-10-03 00:54:19,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:54:19,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 00:54:19,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1080360.0, ans=0.1 2023-10-03 00:54:22,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:54:24,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-03 00:54:25,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:54:25,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:27,008 INFO [train.py:1046] (1/4) Epoch 31, batch 2700, loss[loss=0.1606, simple_loss=0.2467, pruned_loss=0.0373, over 24677.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2439, pruned_loss=0.04371, over 4701669.99 frames. ], batch size: 73, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:54:27,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:28,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:54:28,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:28,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:54:29,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:54:29,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 00:54:29,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:54:32,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 00:54:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:54:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:37,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:54:38,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 00:54:39,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:54:44,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:54:45,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:54:49,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:54:49,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:54:49,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:54:49,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:54:53,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 00:54:53,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1080493.3333333333, ans=0.0 2023-10-03 00:54:54,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:54:56,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:54:56,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:55:00,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:00,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:55:02,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.84 vs. limit=22.5 2023-10-03 00:55:08,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:55:08,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:55:10,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:55:10,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:55:13,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:13,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:15,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:55:16,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:19,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:21,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:55:24,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:55:25,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:55:25,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:55:28,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 00:55:30,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:55:31,497 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.858e+02 1.974e+02 2.152e+02 3.142e+02, threshold=3.947e+02, percent-clipped=0.0 2023-10-03 00:55:31,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:55:31,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 00:55:32,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 00:55:33,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:36,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1080693.3333333333, ans=0.1 2023-10-03 00:55:37,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:55:39,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:39,346 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:55:40,421 INFO [train.py:1046] (1/4) Epoch 31, batch 2750, loss[loss=0.1498, simple_loss=0.231, pruned_loss=0.03435, over 20348.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2432, pruned_loss=0.04355, over 4696853.15 frames. ], batch size: 44, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 00:55:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:55:41,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:55:41,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:44,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:55:46,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:55:46,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:55:46,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:46,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 00:55:46,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:55:46,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:53,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 00:55:55,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:55:55,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 00:55:56,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:55:56,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 00:55:57,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:59,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:55:59,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:01,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:05,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:56:05,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:56:06,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:56:08,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:56:08,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 00:56:15,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:18,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:56:18,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:56:23,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:56:24,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:56:30,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:56:30,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:56:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 00:56:36,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:38,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. 
Duration: 0.67 2023-10-03 00:56:38,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1081026.6666666667, ans=0.1 2023-10-03 00:56:39,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1081026.6666666667, ans=0.2 2023-10-03 00:56:41,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:56:44,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:56:44,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 00:56:44,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:56:44,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1081026.6666666667, ans=0.125 2023-10-03 00:56:47,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:56:48,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 00:56:48,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:56:48,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1081026.6666666667, ans=0.125 2023-10-03 00:56:51,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 00:56:51,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:56:51,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:56:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 00:56:53,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:56:53,604 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.93 vs. limit=6.0 2023-10-03 00:56:54,346 INFO [train.py:1046] (1/4) Epoch 31, batch 2800, loss[loss=0.1535, simple_loss=0.2172, pruned_loss=0.04493, over 23515.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2417, pruned_loss=0.04321, over 4685772.25 frames. ], batch size: 285, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:56:54,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:57,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:56:57,075 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 00:56:57,076 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 00:57:00,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 00:57:01,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:57:03,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:57:03,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1081093.3333333333, ans=0.125 2023-10-03 00:57:06,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:57:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 00:57:10,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 00:57:10,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 00:57:12,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:12,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:57:12,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:16,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:57:16,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 00:57:16,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:57:18,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:57:26,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:57:27,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1081226.6666666667, ans=0.07 2023-10-03 00:57:28,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:57:31,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:31,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:57:31,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:37,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:57:37,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 00:57:37,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:57:39,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 00:57:39,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:57:45,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:57:45,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:49,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:57:50,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:57:50,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:50,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:57:50,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:57:52,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:57:53,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:53,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 00:57:54,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 00:57:55,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:57:56,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:57:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 00:57:59,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.999e+02 2.250e+02 2.864e+02 4.547e+02, threshold=4.500e+02, percent-clipped=1.0 2023-10-03 00:57:59,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:59,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:57:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:58:01,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 00:58:02,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.57 vs. limit=22.5 2023-10-03 00:58:07,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:58:07,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 00:58:08,901 INFO [train.py:1046] (1/4) Epoch 31, batch 2850, loss[loss=0.165, simple_loss=0.2414, pruned_loss=0.04429, over 15517.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2417, pruned_loss=0.04285, over 4688909.58 frames. ], batch size: 33, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:58:09,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:58:10,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:58:14,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:58:14,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:58:14,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:58:17,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:18,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.73 vs. limit=15.0 2023-10-03 00:58:19,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:58:20,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:58:20,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. 
Duration: 0.5 2023-10-03 00:58:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 00:58:24,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:58:28,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 00:58:29,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:32,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 00:58:34,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 00:58:34,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:46,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:47,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:58:47,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:58:49,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:58:49,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 00:58:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:58:52,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:58:52,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 00:58:54,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:58:54,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:58:54,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:54,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1081626.6666666667, ans=0.0 2023-10-03 00:58:56,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:59,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:58:59,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:59:00,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:02,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 00:59:03,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1081626.6666666667, ans=0.2 2023-10-03 00:59:05,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:59:05,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:06,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:09,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:59:14,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:59:14,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 00:59:14,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 00:59:15,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:59:17,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:17,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 00:59:19,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 00:59:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:20,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:59:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:59:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 00:59:20,657 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 00:59:20,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:59:21,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=15.0 2023-10-03 00:59:21,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:22,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1081760.0, ans=0.125 2023-10-03 00:59:23,273 INFO [train.py:1046] (1/4) Epoch 31, batch 2900, loss[loss=0.1678, simple_loss=0.2422, pruned_loss=0.04671, over 23756.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2408, pruned_loss=0.04258, over 4697054.82 frames. ], batch size: 164, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:59:26,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 00:59:26,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:59:26,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:59:27,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 00:59:32,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:32,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 00:59:33,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 00:59:34,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:59:34,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:59:35,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1081760.0, ans=0.125 2023-10-03 00:59:36,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:59:37,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:59:41,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 00:59:42,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:45,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:59:45,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 00:59:45,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:59:46,288 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.95 vs. limit=15.0 2023-10-03 00:59:48,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:51,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 00:59:52,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 00:59:54,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:54,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 00:59:54,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:59:57,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 00:59:57,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 01:00:00,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:00:01,524 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=12.0 2023-10-03 01:00:02,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:00:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:00:07,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:07,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 01:00:09,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 01:00:09,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:00:12,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:00:13,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=22.5 2023-10-03 01:00:15,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. 
Duration: 0.79 2023-10-03 01:00:18,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:00:18,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1081960.0, ans=0.125 2023-10-03 01:00:19,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1081960.0, ans=0.0 2023-10-03 01:00:22,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:00:28,269 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.814e+02 1.971e+02 2.189e+02 2.896e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 01:00:30,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:00:30,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:00:30,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1082026.6666666667, ans=0.0 2023-10-03 01:00:33,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 01:00:35,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:35,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 01:00:35,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:00:37,299 INFO [train.py:1046] (1/4) Epoch 31, batch 2950, loss[loss=0.1482, simple_loss=0.2254, pruned_loss=0.03553, over 24512.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2426, pruned_loss=0.04332, over 4682677.38 frames. ], batch size: 58, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:00:37,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:00:41,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:00:43,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 01:00:44,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:00:44,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:46,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:00:48,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:00:48,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 01:00:49,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 01:00:49,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 01:00:49,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:00:53,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1082160.0, ans=0.125 2023-10-03 01:00:55,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:00:57,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:00:59,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:00:59,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:01:03,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:01:03,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:01:04,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:01:05,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:01:05,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 01:01:08,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 01:01:13,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 01:01:13,407 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 01:01:14,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:01:16,274 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 01:01:17,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 01:01:17,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:01:19,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:01:19,488 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 01:01:19,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:01:22,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 01:01:23,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 01:01:23,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:01:27,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1082293.3333333333, ans=0.125 2023-10-03 01:01:28,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:01:28,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:01:29,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:29,942 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 01:01:29,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:01:29,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 01:01:37,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:37,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:01:38,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 01:01:38,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:01:39,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.37 vs. limit=15.0 2023-10-03 01:01:40,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 01:01:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:01:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:01:45,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:01:46,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:47,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:01:48,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:01:48,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:01:48,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:01:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 01:01:50,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:01:52,123 INFO [train.py:1046] (1/4) Epoch 31, batch 3000, loss[loss=0.1512, simple_loss=0.2402, pruned_loss=0.03112, over 24664.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2427, pruned_loss=0.04308, over 4699041.44 frames. ], batch size: 73, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:01:52,123 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 01:02:01,145 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.7596, 2.3973, 2.8612, 3.1360, 2.6280, 3.2097, 3.0797, 3.3001], device='cuda:1') 2023-10-03 01:02:05,001 INFO [train.py:1078] (1/4) Epoch 31, validation: loss=0.3333, simple_loss=0.2731, pruned_loss=0.1967, over 1125622.00 frames. 2023-10-03 01:02:05,001 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 01:02:05,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:02:06,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:02:06,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 01:02:06,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:02:09,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:02:09,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 01:02:13,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 01:02:14,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 01:02:16,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:02:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:02:16,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 01:02:18,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:02:22,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1082493.3333333333, ans=0.1 2023-10-03 01:02:25,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:02:29,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.79 vs. limit=15.0 2023-10-03 01:02:33,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 01:02:36,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1082560.0, ans=0.125 2023-10-03 01:02:38,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=15.0 2023-10-03 01:02:39,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 01:02:39,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:02:42,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:02:42,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:02:42,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:02:42,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-10-03 01:02:44,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:02:44,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 01:02:46,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-10-03 01:02:47,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1082560.0, ans=0.0 2023-10-03 01:02:48,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:02:50,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:02:50,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:02:50,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:02:51,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:02:51,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:02:54,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:02:56,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:02:56,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:02:57,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:02:59,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. 
Duration: 0.71 2023-10-03 01:03:00,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:03:02,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:02,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:03:07,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:08,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:09,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 01:03:09,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 01:03:09,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:03:10,614 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.904e+02 2.048e+02 2.313e+02 4.264e+02, threshold=4.096e+02, percent-clipped=1.0 2023-10-03 01:03:10,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 01:03:10,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:03:12,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. 
Duration: 0.67 2023-10-03 01:03:13,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:03:16,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:03:16,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 01:03:18,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 01:03:18,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:03:19,491 INFO [train.py:1046] (1/4) Epoch 31, batch 3050, loss[loss=0.1538, simple_loss=0.2421, pruned_loss=0.03274, over 24668.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2436, pruned_loss=0.04342, over 4704068.11 frames. ], batch size: 73, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:03:19,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:03:19,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:19,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:03:19,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:20,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:03:25,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 01:03:25,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1082760.0, ans=0.125 2023-10-03 01:03:27,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:03:29,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:30,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:03:31,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:34,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 01:03:42,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 01:03:42,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 01:03:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:03:46,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:03:50,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:03:50,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:51,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:03:51,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1082893.3333333333, ans=0.125 2023-10-03 01:03:53,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:03:54,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:03:54,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:03:54,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:54,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:03:56,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:59,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:02,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:04:02,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. 
Duration: 0.96 2023-10-03 01:04:02,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:04:02,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:04:06,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:04:07,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:04:07,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:04:07,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:13,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:04:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:18,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:18,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:04:18,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 01:04:21,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:04:21,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:04:21,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:04:23,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 01:04:24,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:04:24,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:26,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 01:04:27,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:27,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1083026.6666666667, ans=0.125 2023-10-03 01:04:32,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:33,362 INFO [train.py:1046] (1/4) Epoch 31, batch 3100, loss[loss=0.1722, simple_loss=0.2325, pruned_loss=0.05594, over 19517.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2429, pruned_loss=0.04306, over 4703706.52 frames. 
], batch size: 388, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:04:33,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:04:36,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:04:37,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 01:04:40,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 01:04:40,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 01:04:40,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:04:42,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1083093.3333333333, ans=0.125 2023-10-03 01:04:43,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-10-03 01:04:45,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:04:45,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:48,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 01:04:52,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:57,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 01:05:01,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:05:01,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:01,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:05:01,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:05:03,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 01:05:05,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:05:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 01:05:05,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:05:07,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:05:08,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-10-03 01:05:10,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:05:14,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:05:14,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 01:05:16,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 01:05:16,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:18,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:05:19,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:20,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:20,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:05:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:05:21,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:05:24,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 01:05:25,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:05:25,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:25,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:05:28,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1083293.3333333333, ans=0.2 2023-10-03 01:05:30,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:05:31,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 01:05:34,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:05:34,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 01:05:35,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:35,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:35,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-10-03 01:05:37,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1083360.0, ans=0.0 2023-10-03 01:05:38,197 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.903e+02 2.126e+02 2.520e+02 4.741e+02, threshold=4.252e+02, percent-clipped=3.0 2023-10-03 01:05:39,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1083360.0, ans=0.125 2023-10-03 01:05:41,400 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:05:45,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 01:05:46,961 INFO [train.py:1046] (1/4) Epoch 31, batch 3150, loss[loss=0.1732, simple_loss=0.2459, pruned_loss=0.05023, over 23292.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2421, pruned_loss=0.04248, over 4712972.77 frames. ], batch size: 105, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:05:48,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:05:49,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:51,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:05:51,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:05:52,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. 
Duration: 0.96 2023-10-03 01:05:53,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:05:53,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:05:54,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 01:05:55,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:59,225 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 01:06:00,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 01:06:00,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:06:02,069 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 01:06:02,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 01:06:02,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 01:06:03,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 01:06:03,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. 
Duration: 0.81 2023-10-03 01:06:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:06:03,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:06:03,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1083493.3333333333, ans=0.125 2023-10-03 01:06:04,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:06:06,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 01:06:07,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:06:07,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:06:09,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:06:10,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:06:15,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 01:06:15,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:06:19,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 01:06:20,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:06:22,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 01:06:24,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 01:06:25,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:06:25,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:06:25,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:06:25,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:06:25,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1083560.0, ans=0.0 2023-10-03 01:06:27,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:06:28,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:06:28,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:06:28,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-10-03 01:06:30,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:06:30,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:33,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:06:33,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:06:33,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 01:06:34,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:06:35,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 01:06:35,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:38,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 01:06:38,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 01:06:40,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:06:40,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:06:41,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 01:06:42,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 01:06:44,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:06:45,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=1083693.3333333333, ans=0.5 2023-10-03 01:06:45,725 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=12.0 2023-10-03 01:06:47,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:06:48,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:50,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:06:50,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1083693.3333333333, ans=0.2 2023-10-03 01:06:55,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:06:55,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:06:56,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 01:06:56,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1083693.3333333333, ans=0.125 2023-10-03 01:06:59,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=1083760.0, ans=0.025 2023-10-03 01:07:00,745 INFO [train.py:1046] (1/4) Epoch 31, batch 3200, loss[loss=0.1755, simple_loss=0.2576, pruned_loss=0.04669, over 24380.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2411, pruned_loss=0.04209, over 4709230.06 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:07:02,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:07:02,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 01:07:05,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:07:06,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:07:06,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 01:07:09,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:07:14,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 01:07:17,684 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.81 vs. limit=15.0 2023-10-03 01:07:18,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:07:26,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:07:28,195 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:07:31,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1083893.3333333333, ans=0.95 2023-10-03 01:07:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 01:07:38,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:07:39,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 01:07:39,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1083893.3333333333, ans=0.125 2023-10-03 01:07:40,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:07:44,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:07:44,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 01:07:45,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:07:47,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1083960.0, ans=0.1 2023-10-03 01:07:47,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1083960.0, ans=0.0 2023-10-03 01:07:48,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 01:07:48,915 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:07:50,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 01:07:52,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 01:07:53,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.24 vs. limit=12.0 2023-10-03 01:07:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 01:07:56,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1083960.0, ans=0.0 2023-10-03 01:07:57,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:08:02,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:08:04,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:08:04,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:04,172 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 01:08:04,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:08:04,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1084026.6666666667, ans=0.1 2023-10-03 01:08:05,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1084026.6666666667, ans=0.125 2023-10-03 01:08:06,914 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.876e+02 2.218e+02 2.823e+02 4.927e+02, threshold=4.435e+02, percent-clipped=2.0 2023-10-03 01:08:08,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 01:08:09,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 01:08:11,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 01:08:11,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. 
Duration: 0.9 2023-10-03 01:08:12,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1084026.6666666667, ans=0.125 2023-10-03 01:08:13,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:08:15,757 INFO [train.py:1046] (1/4) Epoch 31, batch 3250, loss[loss=0.1623, simple_loss=0.2505, pruned_loss=0.03708, over 24648.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2404, pruned_loss=0.04218, over 4696729.31 frames. ], batch size: 73, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:08:17,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:08:17,684 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 01:08:17,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:08:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:17,801 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 01:08:22,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:08:26,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:08:32,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:08:32,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 01:08:33,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:35,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:08:36,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:08:36,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:08:38,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:08:39,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:39,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 01:08:41,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1084160.0, ans=0.2 2023-10-03 01:08:44,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:08:44,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:08:47,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:47,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:49,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:50,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:08:50,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:08:53,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1084226.6666666667, ans=0.125 2023-10-03 01:08:54,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 01:08:54,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:08:54,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 01:08:55,465 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.83 vs. limit=22.5 2023-10-03 01:08:57,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:58,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:09:03,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:09:10,016 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.62 vs. limit=15.0 2023-10-03 01:09:12,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:09:12,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:12,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 01:09:12,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:09:12,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:09:12,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:09:15,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 01:09:15,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 01:09:17,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:09:17,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:09:19,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:09:19,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:09:19,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:09:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:09:24,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:09:25,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 01:09:25,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:28,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 01:09:28,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 01:09:30,113 INFO [train.py:1046] (1/4) Epoch 31, batch 3300, loss[loss=0.1889, simple_loss=0.2674, pruned_loss=0.05515, over 23452.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2417, pruned_loss=0.04214, over 4720424.33 frames. ], batch size: 93, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:09:31,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:09:31,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 01:09:34,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 01:09:36,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 01:09:36,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:09:40,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:09:42,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:09:42,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:43,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-03 01:09:43,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:09:45,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:46,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:09:50,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 01:09:51,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:09:53,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:54,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:54,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 01:09:54,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1084493.3333333333, ans=0.05 2023-10-03 01:09:55,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:09:57,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:09:57,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 01:09:57,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:09:57,231 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 01:10:00,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:10:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:10:02,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:02,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 01:10:02,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 01:10:03,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1084560.0, ans=0.125 2023-10-03 01:10:04,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:06,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:10:07,518 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 01:10:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. 
Duration: 0.84 2023-10-03 01:10:10,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:10:12,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1084560.0, ans=0.0 2023-10-03 01:10:13,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 01:10:14,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:10:16,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:10:16,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:10:16,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-10-03 01:10:20,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:10:21,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:10:21,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:10:21,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 01:10:23,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:10:23,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:24,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:10:25,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.63 vs. limit=12.0 2023-10-03 01:10:26,152 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 01:10:27,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 01:10:29,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1084693.3333333333, ans=0.0 2023-10-03 01:10:30,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:10:30,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:10:30,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:30,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1084693.3333333333, ans=0.0 2023-10-03 01:10:31,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:10:31,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:34,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:10:34,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:34,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:10:35,807 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.832e+02 2.063e+02 2.275e+02 2.937e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-03 01:10:35,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:37,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:10:41,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 01:10:41,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:42,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:10:42,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1084760.0, ans=0.125 2023-10-03 01:10:43,767 INFO [train.py:1046] (1/4) Epoch 31, batch 3350, loss[loss=0.1646, simple_loss=0.2521, pruned_loss=0.03853, over 24414.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2421, pruned_loss=0.0423, over 4726373.27 frames. ], batch size: 69, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:10:45,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:10:45,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:10:46,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:10:47,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:47,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:49,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1084760.0, ans=0.0 2023-10-03 01:10:51,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:10:53,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:53,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 01:10:56,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:56,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1084760.0, ans=0.125 2023-10-03 01:10:57,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:10:59,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.15 vs. limit=22.5 2023-10-03 01:11:00,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:11:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:11:01,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 01:11:01,731 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 01:11:01,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:11:05,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 01:11:06,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 01:11:07,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 01:11:07,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:11:09,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:10,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 01:11:10,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:10,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:11:13,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:15,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:16,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:16,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:11:20,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:21,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:21,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:11:22,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1084893.3333333333, ans=0.125 2023-10-03 01:11:25,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:11:25,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:27,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:27,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:30,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:30,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1084960.0, ans=0.2 2023-10-03 01:11:32,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 01:11:32,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:11:33,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1084960.0, ans=0.125 2023-10-03 01:11:34,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 01:11:34,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 01:11:35,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 01:11:36,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:38,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:45,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:46,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 01:11:46,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:11:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:11:49,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:11:50,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1085026.6666666667, ans=10.0 2023-10-03 01:11:52,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:11:54,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1085026.6666666667, ans=0.0 2023-10-03 01:11:55,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 01:11:56,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:11:56,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:11:58,004 INFO [train.py:1046] (1/4) Epoch 31, batch 3400, loss[loss=0.1643, simple_loss=0.2483, pruned_loss=0.04019, over 24389.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2426, pruned_loss=0.04274, over 4724023.73 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:11:58,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:58,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 01:11:59,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:12:00,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 01:12:01,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:12:02,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:12:03,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:12:05,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:12:05,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 01:12:05,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1085093.3333333333, ans=0.1 2023-10-03 01:12:09,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 01:12:09,280 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 01:12:09,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:13,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:12:13,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:12:13,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:15,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:12:21,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 01:12:22,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 01:12:27,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:12:29,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:30,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:12:32,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:12:37,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:12:40,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 01:12:42,852 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:12:45,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:46,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:46,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-03 01:12:46,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1085293.3333333333, ans=0.125 2023-10-03 01:12:48,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:12:48,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:12:48,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:12:49,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:12:51,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:54,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:12:54,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 01:12:54,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1085293.3333333333, ans=0.0 2023-10-03 01:12:56,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1085360.0, ans=0.1 2023-10-03 01:12:59,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1085360.0, ans=0.125 2023-10-03 01:13:01,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:13:03,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 01:13:04,323 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.916e+02 2.114e+02 2.433e+02 5.346e+02, threshold=4.228e+02, percent-clipped=1.0 2023-10-03 01:13:07,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:13:11,337 INFO [train.py:1046] (1/4) Epoch 31, batch 3450, loss[loss=0.174, simple_loss=0.2593, pruned_loss=0.04437, over 24370.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2438, pruned_loss=0.04305, over 4712699.75 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:13:11,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 01:13:14,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 01:13:16,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 01:13:18,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:13:18,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 01:13:18,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.13 vs. limit=5.0 2023-10-03 01:13:19,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:13:22,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:13:26,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:13:27,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:13:27,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:13:27,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:13:30,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:13:38,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 01:13:42,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. 
Duration: 0.78 2023-10-03 01:13:42,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:13:42,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:13:44,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:13:48,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 01:13:49,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:13:50,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1085560.0, ans=0.0 2023-10-03 01:13:55,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:13:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:13:56,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:13:56,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:13:58,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 01:13:58,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:13:59,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:14:01,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:14:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 01:14:06,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.43 vs. limit=15.0 2023-10-03 01:14:08,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:14:10,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1085693.3333333333, ans=0.1 2023-10-03 01:14:11,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:14:12,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:16,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:18,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1085693.3333333333, ans=0.125 2023-10-03 01:14:20,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:14:21,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:14:21,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:14:21,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:14:25,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:26,725 INFO [train.py:1046] (1/4) Epoch 31, batch 3500, loss[loss=0.1636, simple_loss=0.2427, pruned_loss=0.04224, over 24485.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2428, pruned_loss=0.04279, over 4718449.85 frames. ], batch size: 63, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 01:14:28,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:14:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 01:14:31,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:14:35,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:14:35,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:35,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-10-03 01:14:42,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:14:43,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:14:43,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:14:43,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:14:45,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:14:45,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:45,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:14:47,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 01:14:48,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:48,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:14:50,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:14:55,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:14:56,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 01:14:56,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:14:59,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:15:00,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:15:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:04,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:15:04,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:15:06,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 01:15:07,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 01:15:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 01:15:07,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:15:08,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:15:10,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:15:10,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:15:13,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:15:14,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:15:18,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:15:20,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 01:15:20,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 01:15:20,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:15:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:15:26,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:15:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:30,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. 
Duration: 0.96 2023-10-03 01:15:30,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:15:31,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:15:33,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 01:15:33,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1086026.6666666667, ans=0.125 2023-10-03 01:15:34,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.859e+02 2.081e+02 2.339e+02 4.872e+02, threshold=4.163e+02, percent-clipped=1.0 2023-10-03 01:15:34,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 01:15:35,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:36,580 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.78 vs. limit=15.0 2023-10-03 01:15:37,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:15:37,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:15:38,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:15:39,951 INFO [train.py:1046] (1/4) Epoch 31, batch 3550, loss[loss=0.1726, simple_loss=0.2638, pruned_loss=0.04067, over 24436.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2415, pruned_loss=0.04201, over 4726173.52 frames. ], batch size: 69, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 01:15:41,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:15:47,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1086093.3333333333, ans=0.125 2023-10-03 01:15:50,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:15:52,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 01:15:56,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:15:57,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:15:59,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:00,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:16:00,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:16:03,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 01:16:05,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:16:05,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:16:05,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:16:05,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:16:10,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:16:10,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:16:12,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:16:12,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:16:13,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:16:13,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 01:16:13,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:15,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:16:17,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:16:20,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1086226.6666666667, ans=0.0 2023-10-03 01:16:21,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:16:21,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:16:23,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:16:24,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 01:16:27,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:16:28,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 01:16:28,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:16:29,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:16:29,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:16:32,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. 
Duration: 0.56 2023-10-03 01:16:33,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:16:39,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:16:39,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 01:16:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:45,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 01:16:51,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1086360.0, ans=0.1 2023-10-03 01:16:52,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 01:16:52,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:16:52,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:16:53,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1086426.6666666667, ans=0.125 2023-10-03 01:16:54,686 INFO [train.py:1046] (1/4) Epoch 31, batch 3600, loss[loss=0.187, simple_loss=0.2457, pruned_loss=0.06415, over 19315.00 frames. 
], tot_loss[loss=0.163, simple_loss=0.2417, pruned_loss=0.04219, over 4713443.14 frames. ], batch size: 388, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:16:56,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:56,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:56,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:17:00,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:17:01,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:01,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:17:03,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:17:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:03,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 01:17:06,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:17:06,877 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.71 vs. 
limit=15.0 2023-10-03 01:17:07,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:10,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:17:11,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1086493.3333333333, ans=0.0 2023-10-03 01:17:13,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:17:13,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:17:15,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:17:15,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 01:17:16,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:17:19,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:17:22,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:24,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 01:17:25,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:17:27,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 01:17:32,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:17:34,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1086560.0, ans=0.2 2023-10-03 01:17:35,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:17:36,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 01:17:41,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:17:46,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:48,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1086626.6666666667, ans=0.125 2023-10-03 01:17:49,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:55,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:17:55,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 01:17:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 01:17:59,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 01:18:00,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 01:18:02,191 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.943e+02 2.241e+02 2.571e+02 3.664e+02, threshold=4.481e+02, percent-clipped=0.0 2023-10-03 01:18:03,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:18:03,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:18:05,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 01:18:05,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:18:05,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:18:05,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:18:05,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 01:18:06,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. 
Duration: 0.45 2023-10-03 01:18:08,144 INFO [train.py:1046] (1/4) Epoch 31, batch 3650, loss[loss=0.1613, simple_loss=0.2351, pruned_loss=0.04372, over 23440.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2416, pruned_loss=0.04199, over 4720032.11 frames. ], batch size: 134, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:18:09,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:18:10,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 01:18:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 01:18:16,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:18:21,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 01:18:22,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 01:18:26,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:18:26,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:18:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:18:32,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. 
Duration: 0.77775 2023-10-03 01:18:32,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:18:32,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 01:18:33,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:18:33,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:18:33,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1086826.6666666667, ans=0.0 2023-10-03 01:18:34,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 01:18:35,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:18:36,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:18:36,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:18:37,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:18:40,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 01:18:41,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. 
Duration: 0.87 2023-10-03 01:18:43,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:18:44,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 01:18:45,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:18:46,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:18:48,047 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-10-03 01:18:49,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1086893.3333333333, ans=0.0 2023-10-03 01:18:50,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:18:52,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:18:52,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:18:53,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:18:54,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:18:57,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 01:19:01,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:19:03,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:03,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:19:05,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:19:05,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:19:07,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:11,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 01:19:13,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:19:14,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:14,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:19:15,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:17,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 01:19:18,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:19,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1087026.6666666667, ans=0.0 2023-10-03 01:19:20,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 01:19:20,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:21,968 INFO [train.py:1046] (1/4) Epoch 31, batch 3700, loss[loss=0.169, simple_loss=0.245, pruned_loss=0.04655, over 23734.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.242, pruned_loss=0.04205, over 4729662.48 frames. ], batch size: 232, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:19:23,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:19:26,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:19:26,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:19:28,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:28,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 01:19:28,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 01:19:32,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:19:32,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:19:36,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:19:38,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:19:39,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:19:39,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:19:39,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:41,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:19:43,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:19:45,206 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 01:19:49,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:19:50,253 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.68 vs. 
limit=12.0 2023-10-03 01:19:50,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:19:52,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:19:52,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 01:19:52,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:19:55,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:55,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 01:19:57,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:58,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:20:02,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:20:03,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:20:06,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:20:09,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 01:20:09,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 01:20:09,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:20:10,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 01:20:15,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:20:15,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:20:17,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:19,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 01:20:20,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:20:20,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:20:20,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:20:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:23,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.08 vs. 
limit=15.0 2023-10-03 01:20:25,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:20:26,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 01:20:26,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 01:20:28,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:20:28,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:29,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:20:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:20:32,845 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.869e+02 2.113e+02 2.369e+02 2.929e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 01:20:34,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:20:34,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:20:35,942 INFO [train.py:1046] (1/4) Epoch 31, batch 3750, loss[loss=0.1579, simple_loss=0.2359, pruned_loss=0.04001, over 23446.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2431, pruned_loss=0.04275, over 4710610.07 frames. 
], batch size: 119, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:20:36,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:20:38,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 01:20:40,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 01:20:41,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:20:42,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 01:20:43,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:20:43,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1087426.6666666667, ans=0.125 2023-10-03 01:20:44,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:45,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:47,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 01:20:48,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1087493.3333333333, ans=0.1 2023-10-03 01:20:50,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:20:51,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1087493.3333333333, ans=0.125 2023-10-03 01:20:52,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:20:54,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:20:56,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:58,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:00,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 01:21:00,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1087493.3333333333, ans=0.125 2023-10-03 01:21:01,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:21:03,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:21:03,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:21:07,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 01:21:10,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 01:21:11,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:21:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:21:15,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:21,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:21:21,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 01:21:24,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 01:21:27,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:21:28,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1087626.6666666667, ans=0.0 2023-10-03 01:21:30,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 01:21:32,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:21:34,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:21:39,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:21:40,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:21:43,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:21:43,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:21:45,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1087693.3333333333, ans=0.2 2023-10-03 01:21:45,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1087693.3333333333, ans=0.0 2023-10-03 01:21:47,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:21:49,176 INFO [train.py:1046] (1/4) Epoch 31, batch 3800, loss[loss=0.1418, simple_loss=0.2256, pruned_loss=0.02901, over 24325.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2423, pruned_loss=0.04241, over 4722886.91 frames. 
], batch size: 61, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:21:49,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1087760.0, ans=0.125 2023-10-03 01:21:53,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:21:56,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1087760.0, ans=0.05 2023-10-03 01:21:58,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:59,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:21:59,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 01:22:02,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:22:03,149 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.19 vs. limit=15.0 2023-10-03 01:22:04,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:05,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:22:07,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-03 01:22:07,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:07,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:22:10,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:22:10,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:22:11,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:11,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 01:22:15,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 01:22:16,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:22:18,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:19,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:22:21,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:22:22,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 01:22:22,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:25,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:25,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:30,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:22:30,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 01:22:31,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:22:36,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1087960.0, ans=0.05 2023-10-03 01:22:40,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:22:44,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:22:47,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 01:22:47,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1087960.0, ans=6.0 2023-10-03 01:22:50,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. 
Duration: 0.84 2023-10-03 01:22:50,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:52,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:22:52,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:54,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 01:22:56,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1088026.6666666667, ans=0.125 2023-10-03 01:22:57,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 01:22:57,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 01:22:57,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1088026.6666666667, ans=0.0 2023-10-03 01:22:58,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:59,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:23:01,710 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.890e+02 2.060e+02 2.285e+02 4.176e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-03 01:23:03,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 01:23:03,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:23:04,034 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.74 vs. limit=10.0 2023-10-03 01:23:04,717 INFO [train.py:1046] (1/4) Epoch 31, batch 3850, loss[loss=0.1684, simple_loss=0.2457, pruned_loss=0.04558, over 24313.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2417, pruned_loss=0.04229, over 4709837.86 frames. ], batch size: 61, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:23:06,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1088093.3333333333, ans=0.2 2023-10-03 01:23:08,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1088093.3333333333, ans=0.125 2023-10-03 01:23:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:23:10,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 01:23:13,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:23:13,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:23:17,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:23:18,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 01:23:20,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:23:21,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 01:23:27,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:27,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:23:27,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1088160.0, ans=0.5 2023-10-03 01:23:29,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:23:30,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:23:33,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:34,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:23:35,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:23:35,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:23:36,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 01:23:37,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:23:39,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:40,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:23:40,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 01:23:40,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 01:23:43,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:23:43,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:45,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:23:47,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:47,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 01:23:49,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.64 vs. limit=10.0 2023-10-03 01:23:50,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. 
Duration: 0.67 2023-10-03 01:23:51,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:23:51,782 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:23:51,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1088293.3333333333, ans=0.1 2023-10-03 01:23:53,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 01:23:54,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:23:59,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:01,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:24:03,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:03,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 01:24:05,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1088360.0, ans=0.1 2023-10-03 01:24:05,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1088360.0, ans=0.0 2023-10-03 01:24:06,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-10-03 01:24:09,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:09,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:12,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:24:12,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:24:14,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:14,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:14,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:24:14,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 01:24:15,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:24:16,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 01:24:16,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:16,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 01:24:17,133 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:24:18,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:24:18,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1088426.6666666667, ans=0.1 2023-10-03 01:24:19,612 INFO [train.py:1046] (1/4) Epoch 31, batch 3900, loss[loss=0.1596, simple_loss=0.2421, pruned_loss=0.03852, over 24453.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2403, pruned_loss=0.04188, over 4711765.73 frames. ], batch size: 63, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:24:19,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:21,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1088426.6666666667, ans=0.0 2023-10-03 01:24:22,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:24:22,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:22,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:24,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:24:24,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-10-03 01:24:24,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:27,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1088426.6666666667, ans=0.1 2023-10-03 01:24:28,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:24:28,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:24:28,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:24:29,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:24:31,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:24:33,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:33,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:24:34,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 01:24:34,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:24:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-10-03 01:24:37,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:38,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 01:24:40,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 01:24:46,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:24:48,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:24:48,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:24:48,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1088560.0, ans=0.0 2023-10-03 01:24:49,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:24:53,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:24:53,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1088560.0, ans=0.125 2023-10-03 01:24:55,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:24:56,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 01:24:58,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:24:58,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:25:03,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:25:03,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:25:09,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:25:11,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:25:15,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.09 vs. limit=15.0 2023-10-03 01:25:19,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1088693.3333333333, ans=0.125 2023-10-03 01:25:22,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:25:23,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:25:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. 
Duration: 0.72 2023-10-03 01:25:24,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 01:25:24,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:25:26,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 01:25:28,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:25:28,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 01:25:30,790 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.803e+02 2.022e+02 2.316e+02 4.391e+02, threshold=4.043e+02, percent-clipped=1.0 2023-10-03 01:25:33,537 INFO [train.py:1046] (1/4) Epoch 31, batch 3950, loss[loss=0.1697, simple_loss=0.2464, pruned_loss=0.04653, over 23770.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2404, pruned_loss=0.04186, over 4718198.58 frames. ], batch size: 212, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:25:35,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:25:36,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 01:25:38,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 01:25:38,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1088760.0, ans=0.1 2023-10-03 01:25:38,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1088760.0, ans=0.125 2023-10-03 01:25:39,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:25:40,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1088760.0, ans=0.125 2023-10-03 01:25:42,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:25:47,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1088826.6666666667, ans=0.0 2023-10-03 01:25:49,193 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 01:25:50,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:25:50,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 01:25:51,784 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 01:25:51,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:25:54,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:25:54,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:25:54,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:25:54,801 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:25:56,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 01:25:59,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:25:59,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:26:00,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:26:00,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:26:02,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:26:12,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:26:12,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:26:17,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-10-03 01:26:23,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 01:26:23,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 01:26:24,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:26:26,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:26:31,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:26:31,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:26:32,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:26:32,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:26:32,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 01:26:35,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:26:38,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 01:26:38,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1089026.6666666667, ans=0.125 2023-10-03 01:26:40,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1089026.6666666667, ans=0.125 2023-10-03 01:26:41,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 01:26:45,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-10-03 01:26:46,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1089026.6666666667, ans=0.125 2023-10-03 01:26:49,255 INFO [train.py:1046] (1/4) Epoch 31, batch 4000, loss[loss=0.1535, simple_loss=0.2282, pruned_loss=0.03944, over 24344.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2411, pruned_loss=0.04226, over 4717466.51 frames. ], batch size: 56, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:26:50,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:26:56,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:27:01,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:02,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:27:03,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 01:27:03,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 01:27:04,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:27:05,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 01:27:05,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:27:05,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 01:27:06,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:09,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:27:09,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:27:09,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:27:09,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:27:09,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:27:11,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 01:27:13,150 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 01:27:14,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:27:14,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:18,880 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 01:27:20,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:27:20,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:27:24,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1089226.6666666667, ans=0.95 2023-10-03 01:27:25,335 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=15.52 vs. limit=15.0 2023-10-03 01:27:27,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 01:27:27,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:27:28,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:27:29,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. 
Duration: 0.768 2023-10-03 01:27:30,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1089226.6666666667, ans=0.2 2023-10-03 01:27:31,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:27:31,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 01:27:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:27:32,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1089293.3333333333, ans=0.1 2023-10-03 01:27:33,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:33,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:27:34,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:27:36,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:27:36,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:27:37,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 01:27:37,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 01:27:40,041 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 01:27:44,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:27:47,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 01:27:49,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:27:51,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:51,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:27:52,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:27:58,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:59,514 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.852e+02 2.002e+02 2.226e+02 3.079e+02, threshold=4.003e+02, percent-clipped=0.0 2023-10-03 01:27:59,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:28:01,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 01:28:02,497 INFO [train.py:1046] (1/4) Epoch 31, batch 4050, loss[loss=0.147, simple_loss=0.2294, pruned_loss=0.03236, over 24443.00 frames. 
], tot_loss[loss=0.1637, simple_loss=0.2419, pruned_loss=0.0428, over 4708536.04 frames. ], batch size: 63, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:28:04,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:28:04,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:04,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1089426.6666666667, ans=0.125 2023-10-03 01:28:05,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:28:06,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:28:07,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:28:11,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:28:14,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:28:14,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:28:14,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1089426.6666666667, ans=0.0 2023-10-03 01:28:18,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 01:28:18,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:28:22,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:28:24,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:28:26,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 01:28:28,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 01:28:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 01:28:31,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:28:36,962 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:28:38,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 01:28:39,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:28:40,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1089560.0, ans=0.1 2023-10-03 01:28:42,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 01:28:45,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:28:45,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:28:46,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:49,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:28:50,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 01:28:50,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:28:52,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:28:55,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 01:28:59,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:29:06,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 01:29:06,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:29:06,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 01:29:09,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 01:29:09,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 01:29:09,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:11,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1089693.3333333333, ans=0.0 2023-10-03 01:29:12,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:29:13,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:15,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:29:15,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1089760.0, ans=0.95 2023-10-03 01:29:16,975 INFO [train.py:1046] (1/4) Epoch 31, batch 4100, loss[loss=0.1715, simple_loss=0.2437, pruned_loss=0.04969, over 23325.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2427, pruned_loss=0.04315, over 4698257.99 frames. ], batch size: 119, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:29:17,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.81 vs. limit=15.0 2023-10-03 01:29:21,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. 
Duration: 0.5 2023-10-03 01:29:23,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 01:29:25,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 01:29:26,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 01:29:26,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:26,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:26,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:29:26,670 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 01:29:29,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:29:32,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:29:32,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:32,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 01:29:33,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1089826.6666666667, ans=0.0 2023-10-03 01:29:33,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1089826.6666666667, ans=0.125 2023-10-03 01:29:36,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:29:37,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:29:37,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:29:37,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 01:29:40,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:40,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:29:40,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:29:40,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:29:41,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-10-03 01:29:42,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.29 vs. limit=15.0 2023-10-03 01:29:43,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:29:44,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 01:29:46,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:29:49,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:29:49,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 01:29:51,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:29:52,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:29:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:29:55,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 01:29:57,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 01:29:57,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1089893.3333333333, ans=0.05 2023-10-03 01:29:58,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:30:00,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1089960.0, ans=0.125 2023-10-03 01:30:01,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 01:30:01,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:30:01,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:30:04,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:30:08,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:13,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:30:13,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:30:16,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1090026.6666666667, ans=0.125 2023-10-03 01:30:18,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:19,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:30:22,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:30:24,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.88 vs. limit=15.0 2023-10-03 01:30:25,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:30:28,710 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.780e+02 1.987e+02 2.300e+02 3.252e+02, threshold=3.974e+02, percent-clipped=0.0 2023-10-03 01:30:28,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:30:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:30:31,473 INFO [train.py:1046] (1/4) Epoch 31, batch 4150, loss[loss=0.1475, simple_loss=0.2269, pruned_loss=0.03401, over 24393.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2431, pruned_loss=0.04301, over 4703302.38 frames. 
], batch size: 56, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:30:31,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:30:31,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:30:34,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 01:30:34,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:35,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 01:30:36,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 01:30:36,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 01:30:38,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:42,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:30:42,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:44,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1090160.0, ans=0.0 2023-10-03 01:30:45,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:30:45,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:30:47,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:30:48,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.21 vs. limit=15.0 2023-10-03 01:30:50,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:30:51,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:30:51,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:30:57,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:59,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1090226.6666666667, ans=0.2 2023-10-03 01:30:59,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1090226.6666666667, ans=0.125 2023-10-03 01:31:01,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1090226.6666666667, ans=0.125 2023-10-03 01:31:02,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 01:31:03,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 01:31:05,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 01:31:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:31:06,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 01:31:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:31:06,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:31:06,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1090226.6666666667, ans=0.0 2023-10-03 01:31:08,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1090226.6666666667, ans=0.125 2023-10-03 01:31:09,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:11,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:31:13,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 01:31:16,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 01:31:19,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:31:19,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 01:31:19,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:31:20,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 01:31:22,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:31:24,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:31:25,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:27,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 01:31:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:31:27,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:31:29,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:31:33,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. 
Duration: 0.98 2023-10-03 01:31:33,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:33,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:31:33,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:31:34,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 01:31:34,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:31:35,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:31:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:31:36,413 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.06 vs. limit=15.0 2023-10-03 01:31:37,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:37,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 01:31:38,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:31:43,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 01:31:44,719 INFO [train.py:1046] (1/4) Epoch 31, batch 4200, loss[loss=0.1431, simple_loss=0.2293, pruned_loss=0.02851, over 24668.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2423, pruned_loss=0.04309, over 4700842.34 frames. ], batch size: 65, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:31:44,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 01:31:45,127 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:31:47,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:31:49,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:31:50,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:31:50,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:31:52,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:31:54,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 01:31:55,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1090426.6666666667, ans=0.0 2023-10-03 01:31:57,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. 
Duration: 0.84 2023-10-03 01:31:58,128 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.08 vs. limit=8.0 2023-10-03 01:31:58,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:31:59,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:32:03,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:32:05,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1090493.3333333333, ans=0.09899494936611666 2023-10-03 01:32:07,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:32:07,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1090493.3333333333, ans=0.125 2023-10-03 01:32:08,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:32:08,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:09,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 01:32:09,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 01:32:10,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:10,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:32:11,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:32:13,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:32:14,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 01:32:14,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:15,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1090560.0, ans=0.125 2023-10-03 01:32:20,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:32:20,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:32:23,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:32:24,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:32:27,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 01:32:27,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 01:32:28,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1090626.6666666667, ans=0.1 2023-10-03 01:32:29,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:32:29,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:32:31,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1090626.6666666667, ans=0.125 2023-10-03 01:32:33,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:32:34,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:32:41,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:32:43,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 01:32:45,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:32:49,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:32:51,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:32:53,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 01:32:56,585 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.884e+02 2.053e+02 2.259e+02 3.350e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-03 01:32:58,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1090760.0, ans=0.125 2023-10-03 01:32:59,319 INFO [train.py:1046] (1/4) Epoch 31, batch 4250, loss[loss=0.1655, simple_loss=0.2318, pruned_loss=0.04964, over 23616.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2408, pruned_loss=0.04296, over 4707649.42 frames. ], batch size: 256, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:32:59,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:33:03,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:33:03,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:33:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:10,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1090760.0, ans=0.125 2023-10-03 01:33:11,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:33:11,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. 
Duration: 0.96 2023-10-03 01:33:11,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:33:15,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:18,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:33:21,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:21,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:23,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:33:23,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:33:24,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:26,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:26,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:28,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:33:30,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:33:31,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 01:33:32,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=12.11 vs. limit=10.0 2023-10-03 01:33:34,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 01:33:34,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:35,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:33:35,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:37,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:33:37,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:37,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:38,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.04 vs. limit=6.0 2023-10-03 01:33:39,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 01:33:41,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:33:46,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:33:49,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:33:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 01:33:49,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:33:51,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 01:33:52,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:33:53,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:33:55,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:55,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:33:57,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 01:33:59,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 01:33:59,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:34:02,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1091026.6666666667, ans=0.125 2023-10-03 01:34:03,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:34:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:34:07,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:34:08,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:34:11,559 INFO [train.py:1046] (1/4) Epoch 31, batch 4300, loss[loss=0.1662, simple_loss=0.2423, pruned_loss=0.04501, over 23209.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.24, pruned_loss=0.0427, over 4699796.70 frames. ], batch size: 105, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:34:11,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:34:11,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:34:13,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:34:13,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-10-03 01:34:15,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:34:21,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:34:21,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:34:24,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:34:31,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:34:31,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 01:34:33,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:34:34,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:34:34,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:34:34,689 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 01:34:37,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:34:40,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 01:34:42,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 01:34:42,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:34:44,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 01:34:47,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:34:47,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:34:50,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:34:50,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:34:52,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:34:53,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:34:55,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:34:55,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 01:34:56,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. 
Duration: 0.88 2023-10-03 01:34:58,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:35:00,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:00,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:35:00,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:00,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:35:00,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 01:35:00,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 01:35:02,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 01:35:03,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:35:03,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 01:35:03,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 01:35:07,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:35:08,991 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 01:35:10,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:35:11,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:11,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:35:15,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 01:35:15,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:35:15,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:15,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:35:16,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:35:16,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:35:17,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1091360.0, ans=0.5 2023-10-03 01:35:19,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:35:22,330 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.858e+02 1.987e+02 2.158e+02 3.474e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 01:35:22,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:24,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:35:25,700 INFO [train.py:1046] (1/4) Epoch 31, batch 4350, loss[loss=0.1606, simple_loss=0.2555, pruned_loss=0.03291, over 24463.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.24, pruned_loss=0.04227, over 4715731.64 frames. ], batch size: 69, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:35:29,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 01:35:29,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:35:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:35:34,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:36,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:35:36,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:35:40,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:35:40,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1091493.3333333333, ans=0.95 2023-10-03 01:35:43,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:46,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:35:46,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:35:46,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.47 vs. limit=22.5 2023-10-03 01:35:49,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:35:51,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:35:52,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1091493.3333333333, ans=0.125 2023-10-03 01:35:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:36:00,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-10-03 01:36:01,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:01,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:07,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:08,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 01:36:08,802 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:36:11,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:13,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:36:17,596 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 01:36:18,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:36:19,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:36:19,104 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 01:36:19,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1091626.6666666667, ans=0.125 2023-10-03 01:36:20,445 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-10-03 01:36:20,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:36:20,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:21,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:36:23,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:36:23,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:36:25,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:36:28,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 01:36:28,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:28,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:28,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:29,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 01:36:29,810 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-10-03 01:36:29,814 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 01:36:31,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 01:36:33,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:36:34,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:36:34,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:36:35,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:36:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 01:36:38,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 01:36:38,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:39,826 INFO [train.py:1046] (1/4) Epoch 31, batch 4400, loss[loss=0.1733, simple_loss=0.2481, pruned_loss=0.0492, over 24048.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2409, pruned_loss=0.04244, over 4711474.83 frames. ], batch size: 86, lr: 3.28e-03, grad_scale: 32.0 2023-10-03 01:36:42,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:36:42,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:45,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:46,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 01:36:48,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 01:36:48,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 01:36:48,309 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 01:36:49,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:36:49,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:36:51,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 01:36:51,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.36 vs. limit=15.0 2023-10-03 01:36:52,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:54,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 01:36:54,347 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 01:36:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:36:58,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 01:36:58,252 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 01:37:02,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 01:37:03,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 01:37:04,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 01:37:05,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:05,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:37:06,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:37:06,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:37:07,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. 
Duration: 0.71 2023-10-03 01:37:07,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 01:37:09,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:37:12,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:37:12,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:37:13,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:13,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:37:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 01:37:15,186 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 01:37:19,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:25,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:37:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 01:37:30,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 01:37:33,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:37:36,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:37:36,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 01:37:36,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:37:37,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:37:37,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:37:39,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:37:44,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 01:37:45,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 01:37:46,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 01:37:46,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:37:46,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-10-03 01:37:48,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:37:50,997 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.843e+02 2.055e+02 2.239e+02 3.074e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 01:37:51,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:37:52,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 01:37:52,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1092093.3333333333, ans=0.125 2023-10-03 01:37:53,807 INFO [train.py:1046] (1/4) Epoch 31, batch 4450, loss[loss=0.1712, simple_loss=0.2468, pruned_loss=0.04779, over 23505.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2415, pruned_loss=0.04209, over 4723375.41 frames. ], batch size: 134, lr: 3.28e-03, grad_scale: 32.0 2023-10-03 01:37:57,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:38:00,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:00,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:38:07,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:07,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:38:10,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:13,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:38:15,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:38:15,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:38:18,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 01:38:18,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:38:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:19,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:38:19,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:38:22,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:38:23,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1092226.6666666667, ans=0.0 2023-10-03 01:38:26,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:38:26,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:28,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:38:28,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:38:29,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:38:32,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 01:38:34,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 01:38:34,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 01:38:34,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:38:37,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:38,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 01:38:42,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:38:45,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:38:45,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 01:38:47,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:47,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:38:47,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:38:47,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:48,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:51,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:38:52,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 01:38:54,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:38:56,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:38:57,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:00,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 01:39:00,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:39:01,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:39:04,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 01:39:06,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:39:07,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.11 vs. limit=12.0 2023-10-03 01:39:07,821 INFO [train.py:1046] (1/4) Epoch 31, batch 4500, loss[loss=0.15, simple_loss=0.2123, pruned_loss=0.04388, over 22723.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2416, pruned_loss=0.04211, over 4731854.66 frames. ], batch size: 322, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:39:12,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:39:13,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 01:39:13,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 01:39:14,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:39:18,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1092426.6666666667, ans=0.5 2023-10-03 01:39:18,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.whiten.whitening_limit, batch_count=1092426.6666666667, ans=15.0 2023-10-03 01:39:19,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:39:20,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:39:20,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:39:22,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:39:22,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:39:22,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:39:33,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:39:35,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1092560.0, ans=0.0 2023-10-03 01:39:36,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:39:37,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:39:38,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:39:40,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:39:44,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:39:47,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:39:52,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:39:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:39:56,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 01:39:56,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:39:57,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:59,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:40:00,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 01:40:00,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:40:00,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 01:40:00,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:40:02,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:08,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:40:08,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:40:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:14,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:40:14,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:40:15,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 01:40:17,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 01:40:17,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. 
Duration: 0.8 2023-10-03 01:40:20,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 01:40:21,550 INFO [train.py:1046] (1/4) Epoch 31, batch 4550, loss[loss=0.164, simple_loss=0.2504, pruned_loss=0.03882, over 24578.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2415, pruned_loss=0.0422, over 4727120.88 frames. ], batch size: 71, lr: 3.28e-03, grad_scale: 4.0 2023-10-03 01:40:22,876 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.951e+02 2.109e+02 2.572e+02 4.097e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-03 01:40:23,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 01:40:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:40:27,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:40:27,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:40:30,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:40:33,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1092760.0, ans=0.0 2023-10-03 01:40:34,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:40:37,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 01:40:37,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:40:37,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:40:37,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:40,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:40:41,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:40:44,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:40:46,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 01:40:48,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 01:40:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:40:49,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 01:40:50,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 01:40:52,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:40:52,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.10 vs. limit=15.0 2023-10-03 01:40:56,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 01:40:58,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:41:00,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:00,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:00,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:41:03,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 01:41:05,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:41:07,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:08,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:41:09,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:41:12,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. 
Duration: 0.74 2023-10-03 01:41:12,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 01:41:12,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:41:13,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 01:41:16,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 01:41:16,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:41:17,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:17,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:41:18,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:18,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:41:19,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1093026.6666666667, ans=0.125 2023-10-03 01:41:20,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 01:41:20,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1093026.6666666667, ans=0.125 2023-10-03 01:41:21,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 01:41:22,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:41:22,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 01:41:23,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 01:41:24,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:41:24,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 01:41:24,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1093026.6666666667, ans=0.0 2023-10-03 01:41:25,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1093026.6666666667, ans=0.125 2023-10-03 01:41:27,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:41:27,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:41:30,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:41:30,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:30,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:41:33,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:41:34,910 INFO [train.py:1046] (1/4) Epoch 31, batch 4600, loss[loss=0.1811, simple_loss=0.2527, pruned_loss=0.05473, over 23790.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2407, pruned_loss=0.04205, over 4725916.71 frames. ], batch size: 212, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:41:34,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:41:36,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:37,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:41:40,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:41:40,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:41:42,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:41:44,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. 
Duration: 0.67 2023-10-03 01:41:45,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:41:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:41:51,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:41:52,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 01:41:58,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:02,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:04,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:42:04,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:42:05,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1093226.6666666667, ans=0.0 2023-10-03 01:42:12,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 01:42:12,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 01:42:12,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:42:15,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1093226.6666666667, ans=0.0 2023-10-03 01:42:18,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:18,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:42:20,589 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.71 vs. limit=6.0 2023-10-03 01:42:21,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:42:23,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 01:42:24,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:42:26,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:30,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:42:32,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:42:32,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 01:42:33,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:34,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 01:42:34,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:38,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:38,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:42:40,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:41,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 01:42:41,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 01:42:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 01:42:43,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:42:43,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1093360.0, ans=0.125 2023-10-03 01:42:44,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:42:45,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:47,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:51,990 INFO [train.py:1046] (1/4) Epoch 31, batch 4650, loss[loss=0.1538, simple_loss=0.239, pruned_loss=0.03425, over 24469.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2401, pruned_loss=0.04172, over 4731593.78 frames. ], batch size: 66, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:42:53,310 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.809e+02 1.983e+02 2.209e+02 6.032e+02, threshold=3.967e+02, percent-clipped=1.0 2023-10-03 01:42:56,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:42:57,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:42:57,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:59,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:42:59,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:42:59,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:42:59,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:43:03,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 01:43:06,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:43:09,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 01:43:09,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:43:11,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 01:43:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:43:11,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1093493.3333333333, ans=0.125 2023-10-03 01:43:12,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 01:43:12,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 01:43:12,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:43:12,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:43:16,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:43:16,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:18,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 01:43:18,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1093493.3333333333, ans=0.2 2023-10-03 01:43:21,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:22,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 01:43:24,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:24,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:43:25,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 01:43:26,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:43:29,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 01:43:29,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1093560.0, ans=0.0 2023-10-03 01:43:33,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:43:39,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:41,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:41,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:41,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:43:43,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1093626.6666666667, ans=0.125 2023-10-03 01:43:45,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 01:43:46,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 01:43:46,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 01:43:46,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 01:43:48,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:43:53,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:43:55,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:43:55,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 01:43:55,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:43:56,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:43:56,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:43:56,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:43:58,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1093693.3333333333, ans=0.125 2023-10-03 01:43:59,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:44:00,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:44:00,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:44:04,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:44:04,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:44:04,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:44:06,272 INFO [train.py:1046] (1/4) Epoch 31, batch 4700, loss[loss=0.1717, simple_loss=0.2446, pruned_loss=0.04941, over 23776.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2413, pruned_loss=0.04208, over 4732375.84 frames. ], batch size: 212, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:44:06,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 01:44:06,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:44:08,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 01:44:17,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:17,947 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.03 vs. limit=15.0 2023-10-03 01:44:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:44:19,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:44:22,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:44:23,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:44:27,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 01:44:27,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 01:44:28,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:30,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:44:31,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:44:34,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:34,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1093893.3333333333, ans=0.0 2023-10-03 01:44:38,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:44:39,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:44:41,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:44:41,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1093893.3333333333, ans=0.95 2023-10-03 01:44:49,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 01:44:50,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:44:52,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:44:55,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 01:44:56,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:00,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:45:01,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 01:45:02,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:02,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:02,745 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.40 vs. limit=12.0 2023-10-03 01:45:04,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 01:45:04,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:45:06,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 01:45:06,246 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 01:45:07,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:09,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1094026.6666666667, ans=0.125 2023-10-03 01:45:10,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:10,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:10,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 01:45:12,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:15,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 01:45:18,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:45:18,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:45:20,105 INFO [train.py:1046] (1/4) Epoch 31, batch 4750, loss[loss=0.1477, simple_loss=0.2221, pruned_loss=0.03668, over 22144.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2418, pruned_loss=0.04225, over 4718815.70 frames. ], batch size: 48, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:45:21,362 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.883e+02 2.081e+02 2.313e+02 2.638e+02, threshold=4.163e+02, percent-clipped=0.0 2023-10-03 01:45:21,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1094093.3333333333, ans=0.2 2023-10-03 01:45:22,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:22,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:45:24,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 01:45:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:45:28,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 01:45:28,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1094093.3333333333, ans=0.125 2023-10-03 01:45:31,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:45:31,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:45:31,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:45:36,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 01:45:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:45:42,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 01:45:42,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:45:48,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:48,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:48,133 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 01:45:48,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 01:45:54,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 01:45:56,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:45:59,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:01,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:46:01,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 01:46:01,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:05,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:46:06,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:46:09,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 01:46:09,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 01:46:10,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:46:10,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:46:11,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1094293.3333333333, ans=0.0 2023-10-03 01:46:11,550 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.98 vs. 
limit=15.0 2023-10-03 01:46:12,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:46:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:46:12,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 01:46:14,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 01:46:18,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:21,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.07 vs. limit=15.0 2023-10-03 01:46:21,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:46:21,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 01:46:23,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:46:24,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1094360.0, ans=10.0 2023-10-03 01:46:25,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:46:25,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1094360.0, ans=0.0 2023-10-03 01:46:26,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:46:26,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:27,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:46:29,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:46:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 01:46:32,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 01:46:33,418 INFO [train.py:1046] (1/4) Epoch 31, batch 4800, loss[loss=0.1869, simple_loss=0.2697, pruned_loss=0.05204, over 23253.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2427, pruned_loss=0.04266, over 4716449.91 frames. ], batch size: 93, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:46:33,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 01:46:33,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:46:34,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 01:46:34,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 01:46:39,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:39,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:42,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1094426.6666666667, ans=0.125 2023-10-03 01:46:45,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:46:46,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:46:46,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:46,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 01:46:50,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:46:50,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 01:46:50,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1094493.3333333333, ans=0.125 2023-10-03 01:46:52,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:46:54,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:46:57,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:57,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:46:58,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:58,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 01:46:58,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:59,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:00,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:03,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 01:47:04,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:47:06,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:47:07,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:47:10,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:10,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 01:47:10,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 01:47:11,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:12,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:47:13,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:47:13,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:47:13,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:47:16,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 01:47:16,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:47:21,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:47:24,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:24,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:47:28,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1094626.6666666667, ans=0.125 2023-10-03 01:47:29,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 01:47:29,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:30,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:30,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:47:30,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:34,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 01:47:37,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:47:37,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:37,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:47:37,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:47:39,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:47:43,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:47:43,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:43,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 01:47:44,716 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:47:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 01:47:45,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:47:45,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:47,766 INFO [train.py:1046] (1/4) Epoch 31, batch 4850, loss[loss=0.1553, simple_loss=0.2278, pruned_loss=0.04139, over 24438.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2437, pruned_loss=0.04295, over 4716660.83 frames. ], batch size: 58, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:47:47,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:47:47,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:49,140 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.923e+02 2.074e+02 2.370e+02 4.081e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-03 01:47:51,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:58,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 01:47:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:48:04,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:48:06,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:48:06,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:48:09,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:48:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:48:11,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:48:11,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 01:48:14,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:48:17,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:48:17,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:48:17,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:48:17,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 01:48:19,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1094893.3333333333, ans=0.0 2023-10-03 01:48:22,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:48:22,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:48:26,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:26,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 01:48:26,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=12.0 2023-10-03 01:48:27,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 01:48:27,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:48:33,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:48:34,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 01:48:34,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:48:36,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:48:37,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:48:38,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 01:48:38,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:48:40,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 01:48:40,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:48:43,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:48:43,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 01:48:43,771 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.22 vs. limit=15.0 2023-10-03 01:48:51,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1095026.6666666667, ans=0.2 2023-10-03 01:48:52,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:52,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1095026.6666666667, ans=0.125 2023-10-03 01:49:01,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:49:01,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:03,077 INFO [train.py:1046] (1/4) Epoch 31, batch 4900, loss[loss=0.1742, simple_loss=0.2581, pruned_loss=0.04513, over 24311.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2424, pruned_loss=0.04277, over 4727174.15 frames. 
], batch size: 77, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:49:06,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 01:49:06,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:49:11,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:13,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:49:13,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:49:14,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 01:49:17,861 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.21 vs. limit=15.0 2023-10-03 01:49:20,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 01:49:22,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 01:49:24,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 01:49:24,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:49:24,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:49:24,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:49:24,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:24,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:49:26,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 01:49:27,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1095160.0, ans=0.125 2023-10-03 01:49:29,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 01:49:30,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:49:30,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:49:31,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:49:33,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:49:34,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:35,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:49:35,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 01:49:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:49:38,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:38,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 01:49:38,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 01:49:43,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 01:49:44,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:49:45,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.17 vs. limit=10.0 2023-10-03 01:49:45,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:49:45,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:49:46,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1095293.3333333333, ans=0.07 2023-10-03 01:49:46,725 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.85 vs. 
limit=15.0 2023-10-03 01:49:47,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:47,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 01:49:47,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:49:47,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1095293.3333333333, ans=0.0 2023-10-03 01:49:48,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 01:49:49,731 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.66 vs. limit=22.5 2023-10-03 01:49:50,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:49:51,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:49:52,208 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.27 vs. limit=15.0 2023-10-03 01:49:55,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:49:56,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. 
Duration: 0.74 2023-10-03 01:49:57,538 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.14 vs. limit=10.0 2023-10-03 01:49:58,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:49:58,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 01:49:58,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 01:50:04,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:50:06,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:50:08,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 01:50:08,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:50:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:50:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:13,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:50:13,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 01:50:13,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:50:15,116 INFO [train.py:1046] (1/4) Epoch 31, batch 4950, loss[loss=0.1568, simple_loss=0.2187, pruned_loss=0.04741, over 22826.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.24, pruned_loss=0.04227, over 4713094.49 frames. ], batch size: 322, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:50:15,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 01:50:16,401 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 1.910e+02 2.098e+02 2.377e+02 3.455e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 01:50:16,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:50:18,799 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-10-03 01:50:19,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:50:19,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:50:21,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 01:50:22,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 01:50:22,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 01:50:22,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 01:50:22,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:22,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:50:24,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:50:24,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:26,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1095426.6666666667, ans=0.0 2023-10-03 01:50:27,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:27,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:50:28,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.77 vs. limit=10.0 2023-10-03 01:50:28,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:50:30,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:50:32,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:32,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:50:35,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:50:36,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1095493.3333333333, ans=0.1 2023-10-03 01:50:39,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:41,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:50:41,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:43,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:44,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:50:44,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 01:50:46,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. 
Duration: 0.85 2023-10-03 01:50:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:51,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:50:51,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:50:52,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:50:52,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:50:54,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:50:55,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:58,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:51:00,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:51:02,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:02,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:03,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-10-03 01:51:03,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:51:06,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:51:10,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:51:10,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1095626.6666666667, ans=0.125 2023-10-03 01:51:11,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:51:11,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:51:11,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:51:13,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:51:15,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:51:15,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 01:51:15,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:51:17,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 01:51:20,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1095693.3333333333, ans=0.035 2023-10-03 01:51:21,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:26,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 01:51:26,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:51:28,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1095760.0, ans=0.125 2023-10-03 01:51:29,248 INFO [train.py:1046] (1/4) Epoch 31, batch 5000, loss[loss=0.1781, simple_loss=0.2568, pruned_loss=0.04972, over 23198.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2397, pruned_loss=0.04198, over 4713226.71 frames. ], batch size: 93, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:51:32,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:32,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:51:34,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. 
Duration: 0.61 2023-10-03 01:51:35,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 01:51:36,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:51:39,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 01:51:39,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:51:39,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:51:39,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1095760.0, ans=0.125 2023-10-03 01:51:40,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 01:51:40,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:41,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:51:42,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 01:51:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:42,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 01:51:43,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 01:51:43,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 01:51:45,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:51:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 01:51:46,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:51:46,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:48,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:51:48,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 01:51:48,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 01:51:50,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 01:51:50,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:51,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 01:51:51,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 01:51:53,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:51:54,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:56,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:57,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 01:52:00,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 01:52:00,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:52:01,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:52:04,412 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 01:52:06,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:52:06,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:52:06,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:52:10,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 01:52:10,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:52:10,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:52:11,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:52:13,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 01:52:13,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:52:16,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:52:18,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:52:24,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 01:52:28,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:38,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:52:38,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:52:38,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:52:38,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:52:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:52:40,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:52:40,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:41,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1096093.3333333333, ans=0.07 2023-10-03 01:52:42,983 INFO [train.py:1046] (1/4) Epoch 31, batch 5050, loss[loss=0.1498, simple_loss=0.2382, pruned_loss=0.03076, over 24627.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2404, pruned_loss=0.04235, over 4697955.52 frames. ], batch size: 68, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:52:43,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:43,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 01:52:44,301 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.838e+02 2.026e+02 2.267e+02 4.820e+02, threshold=4.051e+02, percent-clipped=1.0 2023-10-03 01:52:44,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 01:52:45,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:52:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:52:49,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 01:52:49,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:52:50,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:52:53,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:52:55,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:52:55,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:53:03,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 01:53:03,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:53:05,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:53:05,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-03 01:53:05,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:53:06,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:06,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:53:06,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:53:06,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 01:53:08,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 01:53:09,039 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.44 vs. limit=12.0 2023-10-03 01:53:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:11,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:13,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:15,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 01:53:15,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:53:16,604 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.09 vs. limit=15.0 2023-10-03 01:53:19,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 01:53:20,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:53:20,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:53:21,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:53:21,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:53:24,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:53:26,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:53:27,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:53:27,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:53:29,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-03 01:53:29,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:53:30,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:53:35,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:53:35,313 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 01:53:35,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:53:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:53:38,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:38,073 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 01:53:40,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:40,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 01:53:40,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 01:53:41,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1096360.0, ans=0.125 2023-10-03 01:53:43,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:53:44,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:44,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 01:53:46,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 01:53:48,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:53:50,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:53:50,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:53:54,191 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 01:53:55,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:56,878 INFO [train.py:1046] (1/4) Epoch 31, batch 5100, loss[loss=0.1649, simple_loss=0.2387, pruned_loss=0.04552, over 23821.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2412, pruned_loss=0.04269, over 4688756.86 frames. 
], batch size: 195, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:53:58,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 01:53:58,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 01:54:00,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:54:01,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:54:04,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:54:04,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 01:54:04,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 01:54:10,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:54:11,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:54:14,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:54:16,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 01:54:17,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:54:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:54:20,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:54:22,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:23,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:23,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 01:54:27,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 01:54:27,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:27,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 01:54:27,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 01:54:27,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1096560.0, ans=0.1 2023-10-03 01:54:32,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:54:39,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:54:43,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 01:54:43,712 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 01:54:43,719 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 01:54:46,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 01:54:46,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:49,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 01:54:52,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 01:54:54,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:54:57,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:54:58,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 01:55:00,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:55:01,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. 
Duration: 0.86 2023-10-03 01:55:06,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:55:06,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:55:06,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:55:06,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:55:07,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:55:08,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:55:09,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 01:55:09,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 01:55:11,130 INFO [train.py:1046] (1/4) Epoch 31, batch 5150, loss[loss=0.1875, simple_loss=0.2596, pruned_loss=0.05775, over 23434.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.242, pruned_loss=0.04263, over 4708298.67 frames. ], batch size: 285, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:55:11,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 01:55:11,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 01:55:11,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 01:55:11,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:12,447 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.869e+02 2.053e+02 2.257e+02 3.083e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-03 01:55:12,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 01:55:13,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:55:16,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:55:22,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:55:23,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 01:55:24,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:24,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:55:28,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:55:28,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 01:55:28,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:55:28,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:55:30,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:55:30,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 01:55:31,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:55:31,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:55:34,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:55:37,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 01:55:39,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:55:42,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:55:44,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-10-03 01:55:46,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1096893.3333333333, ans=0.035 2023-10-03 01:55:46,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1096893.3333333333, ans=0.125 2023-10-03 01:55:48,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:55:52,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:55:52,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1096893.3333333333, ans=0.07 2023-10-03 01:55:53,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:55,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1096960.0, ans=0.125 2023-10-03 01:55:56,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:55:57,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:56:00,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 01:56:00,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1096960.0, ans=0.0 2023-10-03 01:56:04,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 01:56:05,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:56:05,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:56:07,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:09,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:56:10,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 01:56:13,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:56:15,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:56:16,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:56:18,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:56:18,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:56:18,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:56:18,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 01:56:19,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:56:22,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:56:23,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:56:25,001 INFO [train.py:1046] (1/4) Epoch 31, batch 5200, loss[loss=0.1596, simple_loss=0.2212, pruned_loss=0.04903, over 22712.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2425, pruned_loss=0.04278, over 4722192.68 frames. ], batch size: 322, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:56:26,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:56:31,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 01:56:33,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:56:33,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:36,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:56:37,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:56:39,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 01:56:39,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 01:56:40,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1097160.0, ans=0.125 2023-10-03 01:56:41,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:56:43,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:45,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 01:56:48,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:56:49,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:56:49,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 01:56:49,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 01:56:52,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 01:56:52,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:52,488 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. 
Duration: 0.997 2023-10-03 01:56:52,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:53,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:56:53,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:56:55,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 01:56:56,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:57:01,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:57:03,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 01:57:03,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 01:57:03,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 01:57:07,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 01:57:07,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:57:14,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 01:57:14,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:16,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 01:57:17,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:57:17,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:57:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:18,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:57:21,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:57:23,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:57:24,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:57:26,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:57:26,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 01:57:26,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.14 vs. limit=15.0 2023-10-03 01:57:33,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:35,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 01:57:35,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:57:35,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:57:36,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:57:39,488 INFO [train.py:1046] (1/4) Epoch 31, batch 5250, loss[loss=0.1642, simple_loss=0.2519, pruned_loss=0.03822, over 24529.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2422, pruned_loss=0.04272, over 4732817.01 frames. ], batch size: 63, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:57:39,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:57:40,854 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.931e+02 2.126e+02 2.506e+02 4.070e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 01:57:42,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 01:57:43,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:57:45,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:57:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:57:52,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:53,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:57:56,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:57:57,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:57:59,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 01:57:59,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:58:02,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:58:47,930 INFO [train.py:1046] (1/4) Epoch 31, batch 5300, loss[loss=0.1773, simple_loss=0.2501, pruned_loss=0.05225, over 23321.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2414, pruned_loss=0.04261, over 4723779.49 frames. 
], batch size: 93, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:59:02,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:59:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 01:59:02,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 01:59:02,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:02,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:02,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:02,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:02,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:02,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:03,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:03,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 01:59:03,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:59:03,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 01:59:03,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 01:59:03,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 01:59:03,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:59:03,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 01:59:03,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 01:59:03,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:04,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:04,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:59:04,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:59:04,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 01:59:04,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:59:04,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:04,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:05,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:59:05,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:05,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:59:05,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:59:05,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 01:59:05,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:59:06,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:06,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. 
Duration: 0.86 2023-10-03 01:59:06,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 01:59:06,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:59:06,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:06,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 01:59:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 01:59:06,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:59:06,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:59:06,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:59:07,024 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 01:59:07,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 01:59:07,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:59:07,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 01:59:07,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 01:59:07,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 01:59:07,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 01:59:08,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:59:14,804 INFO [train.py:1046] (1/4) Epoch 32, batch 0, loss[loss=0.1745, simple_loss=0.2627, pruned_loss=0.04314, over 24067.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2627, pruned_loss=0.04314, over 24067.00 frames. ], batch size: 80, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 01:59:14,805 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 01:59:26,548 INFO [train.py:1078] (1/4) Epoch 32, validation: loss=0.3377, simple_loss=0.28, pruned_loss=0.1977, over 1125622.00 frames. 2023-10-03 01:59:26,548 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 01:59:26,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 01:59:26,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:59:28,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1097840.0, ans=0.1 2023-10-03 01:59:29,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 01:59:29,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1097840.0, ans=0.0 2023-10-03 01:59:33,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:33,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:59:34,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:35,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 01:59:36,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 01:59:38,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:39,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:41,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1097906.6666666667, ans=0.125 2023-10-03 01:59:42,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:43,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 01:59:43,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:59:44,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.14 vs. limit=15.0 2023-10-03 01:59:46,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 01:59:48,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:59:56,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:59:56,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:58,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 02:00:02,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1097973.3333333333, ans=15.0 2023-10-03 02:00:02,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:00:02,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:00:05,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. 
limit=10.0 2023-10-03 02:00:06,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:00:06,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1097973.3333333333, ans=0.125 2023-10-03 02:00:09,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:00:13,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:00:13,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1098040.0, ans=0.1 2023-10-03 02:00:18,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.68 vs. limit=15.0 2023-10-03 02:00:19,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 02:00:21,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 02:00:21,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:00:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 02:00:23,173 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.938e+02 2.224e+02 2.523e+02 4.024e+02, threshold=4.448e+02, percent-clipped=0.0 2023-10-03 02:00:23,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:00:23,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:00:24,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 02:00:27,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.08 vs. limit=12.0 2023-10-03 02:00:28,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:29,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1098106.6666666667, ans=0.125 2023-10-03 02:00:31,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:32,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1098106.6666666667, ans=0.1 2023-10-03 02:00:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:00:38,076 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. 
Duration: 0.768 2023-10-03 02:00:39,944 INFO [train.py:1046] (1/4) Epoch 32, batch 50, loss[loss=0.1657, simple_loss=0.2461, pruned_loss=0.04266, over 23446.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2428, pruned_loss=0.04243, over 1071809.74 frames. ], batch size: 134, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 02:00:40,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:00:44,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:00:45,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:00:45,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 02:00:46,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:00:46,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:00:48,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:00:50,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:00:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:00:56,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. 
Duration: 0.68 2023-10-03 02:00:56,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:01,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:01:02,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 02:01:05,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 02:01:07,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:01:08,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:01:08,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:10,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:01:12,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:01:13,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 02:01:13,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:01:19,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1098306.6666666667, ans=0.2 2023-10-03 02:01:20,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:01:20,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:01:21,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:01:21,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 02:01:24,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:01:24,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:01:24,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 02:01:26,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:01:26,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1098373.3333333333, ans=0.125 2023-10-03 02:01:27,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 02:01:33,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 02:01:33,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:01:35,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:01:38,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:01:38,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:01:40,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 02:01:40,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 02:01:41,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:01:42,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:01:45,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:01:45,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:01:47,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 02:01:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-10-03 02:01:48,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 02:01:49,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:01:50,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:01:51,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 02:01:51,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 02:01:52,589 INFO [train.py:1046] (1/4) Epoch 32, batch 100, loss[loss=0.1469, simple_loss=0.2267, pruned_loss=0.03353, over 24595.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2439, pruned_loss=0.04305, over 1886413.58 frames. ], batch size: 60, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 02:01:52,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:01:52,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:01:55,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:01:55,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:01:58,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:02:01,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:02:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:02:07,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 02:02:07,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:02:12,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:02:12,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:02:12,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:02:12,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:02:13,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:02:14,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 02:02:17,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:02:17,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:02:18,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:02:18,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:02:20,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1098640.0, ans=0.0 2023-10-03 02:02:22,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 02:02:22,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:23,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:02:23,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:02:26,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:02:29,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 02:02:29,519 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 02:02:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:02:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 02:02:31,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1098640.0, ans=0.2 2023-10-03 02:02:35,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:02:38,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:39,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:43,032 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.14 vs. limit=15.0 2023-10-03 02:02:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:45,396 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 02:02:47,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:02:47,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1098706.6666666667, ans=0.1 2023-10-03 02:02:49,851 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.824e+02 1.996e+02 2.210e+02 3.286e+02, threshold=3.993e+02, percent-clipped=0.0 2023-10-03 02:02:51,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:02:52,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 02:02:54,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:57,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:02:58,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:03:00,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:03:03,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:04,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:04,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:04,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:03:04,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:03:04,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1098840.0, ans=0.125 2023-10-03 02:03:04,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1098840.0, ans=0.1 2023-10-03 02:03:05,702 INFO [train.py:1046] (1/4) Epoch 32, batch 150, loss[loss=0.1632, simple_loss=0.253, pruned_loss=0.03666, over 24398.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2443, pruned_loss=0.04331, over 2512657.05 frames. ], batch size: 69, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:03:05,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 02:03:05,819 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 02:03:07,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:07,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:03:08,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:08,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:08,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:03:08,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 02:03:09,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:03:10,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:10,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:11,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:11,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:03:13,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:03:15,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:18,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:03:18,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:18,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:21,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:21,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:03:24,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:03:24,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:29,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 02:03:29,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 02:03:29,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 02:03:31,243 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:03:32,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:03:32,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:03:33,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:03:34,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.34 vs. limit=12.0 2023-10-03 02:03:35,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:35,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 02:03:35,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:36,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:39,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 02:03:41,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:45,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:46,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1098973.3333333333, ans=0.0 2023-10-03 02:03:50,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:03:51,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 02:03:55,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:03:55,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:55,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:03:57,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 02:04:00,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:04:00,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:04:03,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:03,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 02:04:07,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:09,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:04:09,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:04:10,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:12,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 02:04:13,149 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.98 vs. 
limit=22.5 2023-10-03 02:04:15,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:04:15,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:04:17,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:04:18,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:04:20,039 INFO [train.py:1046] (1/4) Epoch 32, batch 200, loss[loss=0.1781, simple_loss=0.247, pruned_loss=0.05466, over 23605.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2432, pruned_loss=0.04271, over 3004797.83 frames. ], batch size: 232, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:04:20,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 02:04:20,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:04:20,143 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 02:04:22,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1099173.3333333333, ans=0.1 2023-10-03 02:04:24,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:04:26,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:04:28,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:04:30,262 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.22 vs. limit=15.0 2023-10-03 02:04:30,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 02:04:32,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:04:32,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:34,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 02:04:36,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:04:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:39,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:41,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1099240.0, ans=0.1 2023-10-03 02:04:42,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 02:04:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:04:43,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:47,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1099240.0, ans=0.125 2023-10-03 02:05:00,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:05:00,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:05:01,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:05:03,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:05:03,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:05:03,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:05:03,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1099373.3333333333, ans=0.07 2023-10-03 02:05:04,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:05,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 02:05:07,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:05:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:05:07,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 02:05:09,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:05:09,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:12,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:05:12,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=1099373.3333333333, ans=0.02 2023-10-03 02:05:13,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1099373.3333333333, ans=0.04949747468305833 2023-10-03 02:05:17,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1099440.0, ans=0.125 2023-10-03 02:05:18,855 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.373e+02 1.798e+02 1.948e+02 2.252e+02 2.874e+02, threshold=3.895e+02, percent-clipped=0.0 2023-10-03 02:05:18,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 02:05:25,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:25,876 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:05:27,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:05:31,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.28 vs. limit=15.0 2023-10-03 02:05:33,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:34,322 INFO [train.py:1046] (1/4) Epoch 32, batch 250, loss[loss=0.1531, simple_loss=0.2422, pruned_loss=0.03198, over 24595.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2435, pruned_loss=0.04234, over 3385539.25 frames. ], batch size: 71, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:05:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 02:05:34,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1099506.6666666667, ans=0.0 2023-10-03 02:05:35,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:35,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:05:35,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:05:37,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:05:37,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 02:05:39,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:05:39,153 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 02:05:40,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:43,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:05:44,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:44,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:47,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:05:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:48,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:05:51,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 02:06:01,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-10-03 02:06:01,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:06:02,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:06:04,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:06:09,277 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:06:10,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:06:11,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:06:12,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1099640.0, ans=0.125 2023-10-03 02:06:13,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:06:14,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:06:15,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:06:15,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 02:06:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:06:16,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:06:18,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 02:06:20,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:06:21,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:06:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:06:21,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:06:21,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:06:23,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:06:23,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:06:25,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:26,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 02:06:28,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:06:30,312 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.00 vs. limit=15.0 2023-10-03 02:06:30,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:06:35,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:36,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:06:42,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:06:43,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:06:45,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 02:06:46,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:06:46,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:06:48,128 INFO [train.py:1046] (1/4) Epoch 32, batch 300, loss[loss=0.1697, simple_loss=0.2607, pruned_loss=0.03936, over 24399.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2423, pruned_loss=0.04206, over 3679095.32 frames. 
], batch size: 77, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:06:48,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 02:06:49,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:06:51,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:06:51,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 02:06:55,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:56,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:06:59,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:06:59,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 02:07:00,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:07:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:07:00,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 02:07:00,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:07:05,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:07:10,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:07:10,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 02:07:13,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 02:07:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:14,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:17,024 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.94 vs. limit=6.0 2023-10-03 02:07:17,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:17,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 02:07:17,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:07:19,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:07:21,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:07:21,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:07:27,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:07:27,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 02:07:27,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:07:30,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:31,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 02:07:31,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:07:33,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1100040.0, ans=0.0 2023-10-03 02:07:34,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:07:37,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:07:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 02:07:40,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:07:40,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:07:43,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:45,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:07:45,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 02:07:45,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:07:46,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:07:47,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 02:07:47,533 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.68 vs. limit=15.0 2023-10-03 02:07:50,054 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.863e+02 2.088e+02 2.387e+02 3.568e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 02:07:50,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:50,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:07:51,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:53,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:07:54,072 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.98 vs. limit=15.0 2023-10-03 02:07:54,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:00,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:00,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 02:08:03,153 INFO [train.py:1046] (1/4) Epoch 32, batch 350, loss[loss=0.1564, simple_loss=0.2484, pruned_loss=0.03217, over 24656.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2408, pruned_loss=0.04175, over 3897612.06 frames. ], batch size: 68, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:08:03,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:10,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:08:12,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:13,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:08:14,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 02:08:16,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:16,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 02:08:20,407 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.79 vs. limit=10.0 2023-10-03 02:08:20,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:21,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 02:08:21,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:08:24,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 02:08:25,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:08:28,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:08:28,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:08:29,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:08:29,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:08:30,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:08:30,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:30,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:08:31,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:08:31,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:38,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:08:39,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:08:39,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:08:40,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:44,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 02:08:44,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:08:51,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:51,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:08:51,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:52,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 02:08:54,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:08:55,870 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 02:08:57,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 02:08:57,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:01,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:09:01,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 02:09:04,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 02:09:08,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=1100440.0, ans=0.02 2023-10-03 02:09:09,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:10,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:10,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:09:12,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:09:12,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1100440.0, ans=0.5 2023-10-03 02:09:14,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:09:17,451 INFO [train.py:1046] (1/4) Epoch 32, batch 400, loss[loss=0.1723, simple_loss=0.2468, pruned_loss=0.04883, over 23817.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2404, pruned_loss=0.04167, over 4088356.21 frames. ], batch size: 212, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:09:17,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:09:18,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 02:09:18,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:09:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:22,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1100506.6666666667, ans=0.0 2023-10-03 02:09:23,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:09:23,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:26,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:27,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:28,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 02:09:28,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1100506.6666666667, ans=0.125 2023-10-03 02:09:29,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 02:09:29,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:30,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 02:09:31,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:09:34,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:09:34,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:09:34,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 02:09:35,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:09:35,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:35,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:09:35,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:40,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 02:09:40,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 02:09:45,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:48,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:49,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. 
Duration: 0.81 2023-10-03 02:09:49,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 02:09:54,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:09:54,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:01,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 02:10:03,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1100706.6666666667, ans=0.125 2023-10-03 02:10:05,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:10:05,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 02:10:05,762 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.18 vs. limit=22.5 2023-10-03 02:10:06,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:10:09,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:10:09,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 02:10:13,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 02:10:15,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:10:16,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:10:16,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1100773.3333333333, ans=0.1 2023-10-03 02:10:19,454 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.948e+02 2.204e+02 2.746e+02 3.868e+02, threshold=4.409e+02, percent-clipped=0.0 2023-10-03 02:10:19,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:19,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 02:10:20,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 02:10:21,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1100773.3333333333, ans=0.0 2023-10-03 02:10:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 02:10:24,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:10:24,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 02:10:28,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 02:10:30,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:10:31,426 INFO [train.py:1046] (1/4) Epoch 32, batch 450, loss[loss=0.1471, simple_loss=0.2321, pruned_loss=0.03107, over 24663.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.241, pruned_loss=0.0417, over 4234942.27 frames. ], batch size: 65, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:10:31,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:10:31,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:10:32,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 02:10:32,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:10:34,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:10:34,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:10:34,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 02:10:36,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 02:10:36,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:10:37,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:10:47,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:47,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:10:49,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 02:10:50,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 02:10:53,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:10:56,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:58,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:02,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:11:04,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:11:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. 
Duration: 0.84 2023-10-03 02:11:07,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 02:11:08,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 02:11:09,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:10,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:11,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:11:13,371 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 02:11:13,381 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 02:11:13,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:11:14,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:11:14,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 02:11:18,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:11:18,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 02:11:19,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:11:19,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 02:11:21,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:11:23,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:11:25,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:11:26,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 02:11:28,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1101040.0, ans=0.1 2023-10-03 02:11:29,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:11:30,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 02:11:32,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 02:11:34,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:11:40,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 02:11:41,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:11:42,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:11:42,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 02:11:45,610 INFO [train.py:1046] (1/4) Epoch 32, batch 500, loss[loss=0.1536, simple_loss=0.241, pruned_loss=0.03313, over 24654.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04227, over 4345758.78 frames. ], batch size: 65, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:11:45,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:46,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1101173.3333333333, ans=0.1 2023-10-03 02:11:47,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:11:47,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:47,553 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 02:11:47,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1101173.3333333333, ans=0.125 2023-10-03 02:11:49,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-10-03 02:11:49,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:53,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:11:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 02:11:57,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:12:00,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:12:00,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:12:00,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1101240.0, ans=0.125 2023-10-03 02:12:01,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:11,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1101240.0, ans=0.5 2023-10-03 02:12:12,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:12,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:12:12,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 02:12:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:13,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 02:12:13,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:12:17,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:12:17,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:12:18,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.80 vs. limit=15.0 2023-10-03 02:12:19,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:12:19,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:21,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 02:12:24,352 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 02:12:25,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:27,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:12:27,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:29,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:29,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:12:30,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 02:12:32,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:12:33,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:12:38,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:12:41,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:42,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1101373.3333333333, ans=0.04949747468305833 2023-10-03 02:12:46,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:12:46,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1101440.0, ans=0.0 2023-10-03 02:12:48,035 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.888e+02 2.126e+02 2.426e+02 3.415e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 02:12:51,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 02:12:51,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:12:51,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:55,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 02:12:56,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:12:58,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:13:00,077 INFO [train.py:1046] (1/4) Epoch 32, batch 550, loss[loss=0.1374, simple_loss=0.2169, pruned_loss=0.02892, over 24591.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2424, pruned_loss=0.04236, over 4438291.47 frames. ], batch size: 60, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:13:02,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 02:13:05,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. 
Duration: 0.98 2023-10-03 02:13:05,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:05,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 02:13:05,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:13:05,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:07,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:08,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:13:10,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:13:10,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1101506.6666666667, ans=0.125 2023-10-03 02:13:11,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:13:13,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 02:13:13,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 02:13:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:19,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:22,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:13:22,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:25,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 02:13:27,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 02:13:29,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:13:34,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:13:34,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:13:37,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:13:39,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:39,996 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. 
Duration: 0.958 2023-10-03 02:13:41,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:43,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:13:45,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:13:46,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:13:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:13:46,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:48,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 02:13:48,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1101706.6666666667, ans=0.2 2023-10-03 02:13:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 02:13:49,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:13:50,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:13:50,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:13:50,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:55,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:13:55,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:13:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:13:58,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.15 vs. limit=12.0 2023-10-03 02:13:59,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:59,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 02:13:59,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:14:00,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:02,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:14:02,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:04,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 02:14:04,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:14:08,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1101773.3333333333, ans=0.0 2023-10-03 02:14:11,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 02:14:13,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 02:14:14,387 INFO [train.py:1046] (1/4) Epoch 32, batch 600, loss[loss=0.1373, simple_loss=0.2174, pruned_loss=0.02858, over 24326.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2421, pruned_loss=0.04196, over 4517395.96 frames. ], batch size: 61, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:14:16,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:14:16,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:14:16,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:20,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1101840.0, ans=0.0 2023-10-03 02:14:22,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:14:25,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 02:14:26,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 02:14:27,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:14:29,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:14:32,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:35,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 02:14:35,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:14:41,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 02:14:45,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:14:45,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:45,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:14:50,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:14:50,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:14:50,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:59,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:14:59,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1102040.0, ans=0.125 2023-10-03 02:15:03,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:15:03,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:15:03,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:15:10,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 02:15:14,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.98 vs. limit=22.5 2023-10-03 02:15:16,459 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.923e+02 2.160e+02 2.483e+02 3.446e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-03 02:15:16,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:15:16,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:15:18,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 02:15:18,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1102106.6666666667, ans=0.0 2023-10-03 02:15:20,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:15:22,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 02:15:22,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:15:22,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:15:27,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1102173.3333333333, ans=0.125 2023-10-03 02:15:28,702 INFO [train.py:1046] (1/4) Epoch 32, batch 650, loss[loss=0.1732, simple_loss=0.253, pruned_loss=0.04672, over 23350.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2418, pruned_loss=0.04189, over 4567157.97 frames. ], batch size: 105, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:15:28,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 02:15:30,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:15:32,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 02:15:33,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:15:36,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:15:38,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 02:15:39,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:15:43,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:15:43,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:15:45,923 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.26 vs. limit=15.0 2023-10-03 02:15:47,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:15:51,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 02:15:53,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:15:53,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:15:58,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:15:58,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:15:59,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:01,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:01,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:16:03,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:04,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:16:05,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:16:05,935 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 02:16:05,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:05,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:16:08,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:11,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 02:16:11,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:11,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:16:11,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 02:16:13,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:16:13,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:16:14,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:16:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:16:15,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:16:19,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 02:16:20,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 02:16:20,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:20,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:16:21,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:16:21,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:16:22,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:16:27,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1102440.0, ans=0.125 2023-10-03 02:16:27,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.71 vs. limit=15.0 2023-10-03 02:16:28,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:28,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:16:29,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:34,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:16:34,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:16:38,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1102440.0, ans=0.0 2023-10-03 02:16:40,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:16:40,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:16:42,284 INFO [train.py:1046] (1/4) Epoch 32, batch 700, loss[loss=0.1361, simple_loss=0.2215, pruned_loss=0.02532, over 24319.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2405, pruned_loss=0.04186, over 4591232.20 frames. ], batch size: 61, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:16:42,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:16:42,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:16:48,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 02:16:49,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 02:16:51,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 02:16:51,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:53,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:16:54,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 02:17:00,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:17:02,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:17:02,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:17:05,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:17:05,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:17:07,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1102573.3333333333, ans=0.125 2023-10-03 02:17:08,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:17:10,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 02:17:11,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 02:17:11,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1102640.0, ans=0.125 2023-10-03 02:17:12,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 02:17:14,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 02:17:17,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:17:19,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:17:20,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:17:23,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:17:23,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 02:17:23,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1102640.0, ans=0.125 2023-10-03 02:17:27,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1102706.6666666667, ans=0.2 2023-10-03 02:17:28,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:17:28,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 02:17:28,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 02:17:32,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:17:34,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:17:36,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:17:41,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1102773.3333333333, ans=0.125 2023-10-03 02:17:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:17:43,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 02:17:45,707 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.878e+02 2.011e+02 2.266e+02 3.115e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-03 02:17:45,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 02:17:45,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 02:17:49,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:17:50,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:17:50,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1102773.3333333333, ans=0.0 2023-10-03 02:17:51,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:17:53,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:17:53,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 02:17:58,507 INFO [train.py:1046] (1/4) Epoch 32, batch 750, loss[loss=0.1643, simple_loss=0.2434, pruned_loss=0.04264, over 23589.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2401, pruned_loss=0.04165, over 4619878.84 frames. ], batch size: 134, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:17:58,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 02:17:59,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 02:17:59,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 02:18:00,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 02:18:01,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 02:18:01,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:18:01,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1102840.0, ans=0.125 2023-10-03 02:18:02,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 02:18:03,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:18:05,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:18:07,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:09,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:18:09,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:18:12,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:18:13,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:18:15,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:18:18,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:18:18,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:20,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 02:18:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:18:21,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:18:23,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:18:24,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:18:26,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 02:18:26,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:18:28,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 02:18:28,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 02:18:28,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 02:18:28,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 02:18:28,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:18:30,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:18:36,055 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.38 vs. limit=6.0 2023-10-03 02:18:38,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:18:38,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:18:39,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:18:42,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:43,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:18:43,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 02:18:43,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:18:45,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. 
Duration: 0.488875 2023-10-03 02:18:45,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:18:48,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:18:49,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 02:18:49,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:18:54,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:18:55,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:18:55,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:59,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:19:00,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1103106.6666666667, ans=0.125 2023-10-03 02:19:01,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 02:19:01,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:19:01,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 02:19:06,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:06,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:09,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:19:11,917 INFO [train.py:1046] (1/4) Epoch 32, batch 800, loss[loss=0.17, simple_loss=0.2574, pruned_loss=0.04129, over 24020.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2407, pruned_loss=0.04199, over 4642011.19 frames. ], batch size: 80, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:19:16,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:16,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:18,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:19:18,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:19,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:19,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:19:21,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:21,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1103173.3333333333, ans=0.0 2023-10-03 02:19:24,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:26,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:19:28,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 02:19:28,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:30,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:30,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:19:30,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:19:30,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 02:19:30,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:32,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. 
Duration: 0.39 2023-10-03 02:19:33,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:35,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1103240.0, ans=0.0 2023-10-03 02:19:35,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1103240.0, ans=0.1 2023-10-03 02:19:36,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:38,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1103240.0, ans=0.125 2023-10-03 02:19:39,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:39,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:19:42,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:42,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:48,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:19:48,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 02:19:48,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 02:19:51,055 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 02:19:52,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 02:19:52,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:19:52,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:54,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:54,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:20:00,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 02:20:00,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 02:20:03,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:20:03,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1103373.3333333333, ans=0.0 2023-10-03 02:20:06,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 02:20:06,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1103373.3333333333, ans=0.125 2023-10-03 02:20:09,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:20:12,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:20:12,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1103440.0, ans=0.125 2023-10-03 02:20:12,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1103440.0, ans=0.125 2023-10-03 02:20:13,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 02:20:13,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:20:14,953 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.903e+02 2.081e+02 2.304e+02 3.546e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-03 02:20:17,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 02:20:18,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.82 vs. limit=12.0 2023-10-03 02:20:22,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 02:20:25,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:20:25,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 02:20:26,698 INFO [train.py:1046] (1/4) Epoch 32, batch 850, loss[loss=0.1859, simple_loss=0.2765, pruned_loss=0.04764, over 24277.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.242, pruned_loss=0.0421, over 4661461.89 frames. ], batch size: 74, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:20:26,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:20:28,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:20:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 02:20:28,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:30,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:20:31,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:20:34,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:20:34,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:20:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 02:20:36,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 02:20:36,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 02:20:37,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:20:37,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:20:40,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:20:40,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:20:40,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:20:44,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:46,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:20:46,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. 
Duration: 0.87 2023-10-03 02:20:46,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1103573.3333333333, ans=0.0 2023-10-03 02:20:50,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 02:20:53,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:54,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 02:20:59,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 02:20:59,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 02:21:01,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1103640.0, ans=0.125 2023-10-03 02:21:03,897 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 02:21:03,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:21:03,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:21:03,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-03 02:21:04,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1103640.0, ans=0.125 2023-10-03 02:21:04,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1103640.0, ans=0.125 2023-10-03 02:21:06,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:08,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:08,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 02:21:08,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=1103640.0, ans=0.1 2023-10-03 02:21:11,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:21:11,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:21:12,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:21:12,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:21:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:21:15,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 02:21:17,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 02:21:21,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:21:21,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:21:21,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:21:21,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:21:22,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:21:25,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:27,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:21:28,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:21:30,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:21:31,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 02:21:35,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1103773.3333333333, ans=0.09899494936611666 2023-10-03 02:21:36,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:21:37,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:21:39,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 02:21:39,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:21:40,459 INFO [train.py:1046] (1/4) Epoch 32, batch 900, loss[loss=0.1632, simple_loss=0.2378, pruned_loss=0.04429, over 23621.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.242, pruned_loss=0.04187, over 4680451.77 frames. ], batch size: 149, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:21:40,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:21:43,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 02:21:46,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1103840.0, ans=0.125 2023-10-03 02:21:47,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:21:50,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:21:52,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 02:21:55,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:21:56,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 02:21:56,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 02:21:58,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:21:58,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:21:58,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:21:59,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:22:04,714 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:22:07,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:07,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:22:07,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 02:22:11,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:22:16,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 02:22:16,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:22:16,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1103973.3333333333, ans=0.035 2023-10-03 02:22:22,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:22:22,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:22:23,687 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 02:22:25,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 02:22:27,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1104040.0, ans=0.0 2023-10-03 02:22:31,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:22:32,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:22:32,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 02:22:34,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1104040.0, ans=0.1 2023-10-03 02:22:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:38,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:22:40,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 02:22:40,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:22:43,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 02:22:45,148 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.827e+02 2.004e+02 2.233e+02 3.058e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-03 02:22:45,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:22:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:47,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:22:47,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:22:50,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-10-03 02:22:52,063 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 02:22:53,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:22:53,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 02:22:54,752 INFO [train.py:1046] (1/4) Epoch 32, batch 950, loss[loss=0.1714, simple_loss=0.2512, pruned_loss=0.04585, over 23403.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2429, pruned_loss=0.04279, over 4682152.27 frames. ], batch size: 93, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:22:56,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:58,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1104173.3333333333, ans=0.125 2023-10-03 02:22:59,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 02:22:59,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1104173.3333333333, ans=0.0 2023-10-03 02:23:02,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1104173.3333333333, ans=0.125 2023-10-03 02:23:05,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:06,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:23:06,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:06,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:23:09,830 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 02:23:14,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:14,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:23:15,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:16,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:23:16,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 02:23:16,468 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.12 vs. limit=15.0 2023-10-03 02:23:17,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:23:20,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:21,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-10-03 02:23:22,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:23:26,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:26,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:23:26,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:23:27,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 02:23:29,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1104306.6666666667, ans=0.0 2023-10-03 02:23:30,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 02:23:32,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:23:34,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:23:37,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:23:37,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:41,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. 
Duration: 0.47 2023-10-03 02:23:44,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 02:23:44,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:23:44,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:23:45,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:45,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:23:49,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 02:23:49,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:23:51,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1104373.3333333333, ans=0.04949747468305833 2023-10-03 02:23:52,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:23:52,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:52,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 02:23:52,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:23:52,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:23:53,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 02:23:56,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1104440.0, ans=0.1 2023-10-03 02:23:57,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:23:58,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:24:04,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:24:06,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 02:24:06,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 02:24:08,737 INFO [train.py:1046] (1/4) Epoch 32, batch 1000, loss[loss=0.1592, simple_loss=0.2439, pruned_loss=0.03721, over 24483.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2424, pruned_loss=0.04252, over 4681368.38 frames. ], batch size: 66, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:24:08,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:24:13,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-03 02:24:14,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:18,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:24:19,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 02:24:19,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 02:24:21,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1104506.6666666667, ans=0.0 2023-10-03 02:24:23,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:24,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:24:25,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:25,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1104573.3333333333, ans=0.04949747468305833 2023-10-03 02:24:29,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 02:24:32,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 02:24:34,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-10-03 02:24:34,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:24:36,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 02:24:37,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 02:24:37,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 02:24:38,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:38,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:39,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1104640.0, ans=0.125 2023-10-03 02:24:48,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:48,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1104640.0, ans=0.0 2023-10-03 02:24:49,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:24:49,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:51,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:24:51,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 02:24:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:24:51,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-10-03 02:24:52,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:24:53,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:55,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 02:24:58,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 02:24:58,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 02:24:59,297 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.34 vs. limit=15.0 2023-10-03 02:25:00,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 02:25:01,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:25:05,447 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.11 vs. 
limit=12.0 2023-10-03 02:25:07,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:25:08,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:09,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:25:10,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 02:25:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:25:13,048 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.852e+02 2.033e+02 2.255e+02 3.341e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-03 02:25:13,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 02:25:13,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 02:25:13,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:25:13,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:25:18,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 02:25:19,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:25:19,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1104773.3333333333, ans=0.125 2023-10-03 02:25:19,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1104773.3333333333, ans=0.07 2023-10-03 02:25:22,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:25:23,464 INFO [train.py:1046] (1/4) Epoch 32, batch 1050, loss[loss=0.154, simple_loss=0.2273, pruned_loss=0.04034, over 24308.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2405, pruned_loss=0.04187, over 4694456.66 frames. ], batch size: 56, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:25:24,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:25:26,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:25:28,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:25:29,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:31,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:25:33,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 02:25:35,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:25:36,388 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.21 vs. limit=22.5 2023-10-03 02:25:38,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:25:39,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:25:39,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:25:40,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1104906.6666666667, ans=0.2 2023-10-03 02:25:41,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:25:41,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 02:25:42,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:25:43,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 02:25:46,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:25:46,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-10-03 02:25:46,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:25:52,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:54,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:25:54,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:25:55,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 02:25:55,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 02:25:57,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:25:57,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1104973.3333333333, ans=0.0 2023-10-03 02:26:00,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 02:26:00,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1104973.3333333333, ans=0.125 2023-10-03 02:26:03,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 02:26:03,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:26:07,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:26:09,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:26:09,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:26:10,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:26:13,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:26:16,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 02:26:19,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 02:26:19,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 02:26:19,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:26:19,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:26:20,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1105040.0, ans=0.0 2023-10-03 02:26:21,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. 
Duration: 0.88 2023-10-03 02:26:25,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:26:27,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:26:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:26:28,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:26:28,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:31,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:31,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 02:26:33,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:26:33,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 02:26:33,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1105106.6666666667, ans=0.025 2023-10-03 02:26:34,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 02:26:36,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 02:26:37,354 INFO [train.py:1046] (1/4) Epoch 32, batch 1100, loss[loss=0.1576, simple_loss=0.2272, pruned_loss=0.04403, over 22796.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2397, pruned_loss=0.04145, over 4697702.09 frames. ], batch size: 322, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:26:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:26:44,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:26:49,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:26:51,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:26:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:26:52,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 02:26:53,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:26:55,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:26:56,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:27:00,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.77 vs. 
limit=15.0 2023-10-03 02:27:01,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:27:01,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 02:27:02,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:27:03,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:27:03,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:27:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:27:08,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:27:12,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=15.0 2023-10-03 02:27:13,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:27:15,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 02:27:17,339 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 02:27:17,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:27:19,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1105306.6666666667, ans=0.125 2023-10-03 02:27:20,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:20,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:27:22,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:27:23,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 02:27:23,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:27:23,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:27:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:27:25,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:25,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 02:27:29,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:27:29,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. 
Duration: 0.91 2023-10-03 02:27:32,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:27:38,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:27:41,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.860e+02 2.079e+02 2.474e+02 4.878e+02, threshold=4.158e+02, percent-clipped=1.0 2023-10-03 02:27:41,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 02:27:41,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 02:27:42,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:42,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:27:44,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:27:45,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 02:27:47,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:27:47,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:27:48,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-10-03 02:27:50,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:27:50,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 02:27:51,630 INFO [train.py:1046] (1/4) Epoch 32, batch 1150, loss[loss=0.1525, simple_loss=0.2358, pruned_loss=0.03455, over 24425.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2404, pruned_loss=0.0414, over 4695207.72 frames. ], batch size: 63, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:27:51,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:27:51,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:27:53,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:27:57,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:27:59,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:28:00,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1105506.6666666667, ans=0.125 2023-10-03 02:28:02,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:28:02,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:28:02,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 02:28:03,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:28:05,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 02:28:06,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:28:06,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:28:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 02:28:11,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1105573.3333333333, ans=0.1 2023-10-03 02:28:13,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1105573.3333333333, ans=0.2 2023-10-03 02:28:14,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:28:17,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:28:17,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:28:17,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1105573.3333333333, ans=10.0 2023-10-03 02:28:18,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.15 vs. limit=12.0 2023-10-03 02:28:18,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 02:28:18,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:28:18,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:28:21,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 02:28:23,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:28:24,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:28:35,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:36,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.77 vs. limit=15.0 2023-10-03 02:28:41,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:28:41,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 02:28:42,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:42,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:49,789 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 02:28:49,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:50,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1105773.3333333333, ans=0.0 2023-10-03 02:28:55,491 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 02:28:58,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:28:58,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:29:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:29:00,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:29:03,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:29:06,574 INFO [train.py:1046] (1/4) Epoch 32, batch 1200, loss[loss=0.2208, simple_loss=0.2855, pruned_loss=0.07803, over 19718.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2413, pruned_loss=0.04169, over 4701499.42 frames. ], batch size: 388, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:29:08,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:29:08,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:29:10,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:10,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:29:12,892 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-10-03 02:29:13,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:29:15,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:29:16,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:29:16,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:29:16,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1105840.0, ans=0.0 2023-10-03 02:29:19,606 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 02:29:21,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1105906.6666666667, ans=0.125 2023-10-03 02:29:21,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1105906.6666666667, ans=0.05 2023-10-03 02:29:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 02:29:26,882 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.50 vs. limit=22.5 2023-10-03 02:29:27,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:29:30,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:29:32,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:33,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:29:33,591 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 02:29:35,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:29:42,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:29:42,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:29:42,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 02:29:42,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:29:45,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 02:29:50,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 02:29:50,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:52,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:29:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:29:54,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:29:55,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:55,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 02:29:57,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.71 vs. limit=12.0 2023-10-03 02:29:58,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:29:58,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 02:29:58,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:29:59,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:29:59,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:30:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:30:02,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:30:06,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1106106.6666666667, ans=0.125 2023-10-03 02:30:07,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:30:09,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 02:30:11,107 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.987e+02 2.198e+02 2.518e+02 3.756e+02, threshold=4.395e+02, percent-clipped=0.0 2023-10-03 02:30:12,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 02:30:15,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 02:30:16,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:30:19,257 INFO [train.py:1046] (1/4) Epoch 32, batch 1250, loss[loss=0.1622, simple_loss=0.2547, pruned_loss=0.03483, over 24298.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2416, pruned_loss=0.0415, over 4722982.46 frames. ], batch size: 74, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:30:19,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:30:20,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:30:22,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:30:24,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 02:30:27,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:30:29,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 02:30:29,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 02:30:31,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:30:32,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:30:37,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:30:37,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:30:39,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:30:39,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:30:42,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:30:45,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 02:30:45,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:30:45,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:30:47,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:30:47,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:30:48,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1106306.6666666667, ans=0.125 2023-10-03 02:30:49,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:30:50,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:30:57,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 02:30:57,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:31:00,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:31:01,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 02:31:01,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:31:01,487 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 02:31:01,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:01,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:31:07,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:31:09,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=15.0 2023-10-03 02:31:10,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:31:11,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:31:12,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 02:31:12,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 02:31:12,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 02:31:15,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:31:16,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 02:31:17,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:20,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 02:31:21,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:31:23,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 02:31:24,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:31:24,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:31:26,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:31:27,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:31:29,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 02:31:31,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.75 vs. limit=15.0 2023-10-03 02:31:32,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:31:32,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:31:33,836 INFO [train.py:1046] (1/4) Epoch 32, batch 1300, loss[loss=0.1486, simple_loss=0.2345, pruned_loss=0.03139, over 24476.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2421, pruned_loss=0.04182, over 4731222.37 frames. ], batch size: 63, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:31:33,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 02:31:35,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:31:37,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1106506.6666666667, ans=0.0 2023-10-03 02:31:38,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:31:39,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 02:31:40,930 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.68 vs. limit=15.0 2023-10-03 02:31:42,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.73 vs. limit=15.0 2023-10-03 02:31:44,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:31:46,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:31:47,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:31:49,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:49,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 02:31:50,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 02:31:54,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:31:56,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:31:56,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 02:31:59,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:32:01,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.34 vs. limit=15.0 2023-10-03 02:32:03,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:05,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:32:05,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:32:07,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:07,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 02:32:07,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:32:09,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 02:32:11,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1106640.0, ans=0.0 2023-10-03 02:32:15,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:32:15,645 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.53 vs. limit=15.0 2023-10-03 02:32:16,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:32:17,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 02:32:18,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:32:20,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:32:21,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:32:22,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1106706.6666666667, ans=0.1 2023-10-03 02:32:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. 
Duration: 0.96 2023-10-03 02:32:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:32:23,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 02:32:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:32:28,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:32:28,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:32:28,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1106706.6666666667, ans=0.125 2023-10-03 02:32:31,510 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.21 vs. limit=15.0 2023-10-03 02:32:33,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 02:32:34,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 02:32:34,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 02:32:37,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 02:32:37,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1106773.3333333333, ans=0.125 2023-10-03 02:32:41,089 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.861e+02 2.113e+02 2.556e+02 3.728e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 02:32:41,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 02:32:42,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:48,752 INFO [train.py:1046] (1/4) Epoch 32, batch 1350, loss[loss=0.1517, simple_loss=0.233, pruned_loss=0.03515, over 24338.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2422, pruned_loss=0.04234, over 4721243.11 frames. ], batch size: 61, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:32:50,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 02:32:53,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:32:54,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:32:58,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:58,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:33:00,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 02:33:00,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:33:03,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:33:03,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1106906.6666666667, ans=0.2 2023-10-03 02:33:04,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 02:33:05,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:33:07,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:33:12,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 02:33:12,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:33:14,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:33:14,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 02:33:17,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 02:33:19,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-10-03 02:33:21,072 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=12.0 2023-10-03 02:33:21,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:21,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 02:33:28,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1106973.3333333333, ans=0.95 2023-10-03 02:33:28,457 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.89 vs. limit=15.0 2023-10-03 02:33:29,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:29,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1106973.3333333333, ans=0.95 2023-10-03 02:33:37,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:33:37,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. 
Duration: 0.85 2023-10-03 02:33:37,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1107040.0, ans=0.125 2023-10-03 02:33:40,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1107040.0, ans=0.125 2023-10-03 02:33:42,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:33:43,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 02:33:43,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:33:45,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:33:45,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1107040.0, ans=0.1 2023-10-03 02:33:48,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:33:48,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1107106.6666666667, ans=0.125 2023-10-03 02:33:50,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 02:33:51,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 02:33:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 02:33:57,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 02:34:01,114 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.82 vs. limit=12.0 2023-10-03 02:34:01,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 02:34:03,007 INFO [train.py:1046] (1/4) Epoch 32, batch 1400, loss[loss=0.1629, simple_loss=0.2501, pruned_loss=0.03782, over 24090.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2407, pruned_loss=0.04182, over 4720282.38 frames. ], batch size: 80, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:34:03,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:34:07,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:34:07,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:34:12,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 02:34:13,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. 
Duration: 0.86 2023-10-03 02:34:18,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1107240.0, ans=0.125 2023-10-03 02:34:25,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:34:27,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:34:28,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:34:28,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:34:31,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:34:33,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 02:34:33,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1107306.6666666667, ans=0.125 2023-10-03 02:34:41,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:41,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:41,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1107306.6666666667, ans=0.1 2023-10-03 02:34:47,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. 
Duration: 0.8 2023-10-03 02:34:47,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:34:49,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:34:49,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:34:49,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:34:51,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:34:51,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:34:52,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:34:53,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 02:34:53,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:34:54,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1107373.3333333333, ans=0.2 2023-10-03 02:34:58,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:02,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 02:35:02,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1107440.0, ans=0.125 2023-10-03 02:35:09,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 02:35:10,698 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.787e+02 1.940e+02 2.244e+02 3.961e+02, threshold=3.881e+02, percent-clipped=0.0 2023-10-03 02:35:10,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:35:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:35:14,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 02:35:14,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:14,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1107440.0, ans=0.125 2023-10-03 02:35:16,732 INFO [train.py:1046] (1/4) Epoch 32, batch 1450, loss[loss=0.1733, simple_loss=0.2516, pruned_loss=0.04755, over 23505.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2402, pruned_loss=0.04145, over 4719628.18 frames. ], batch size: 106, lr: 3.20e-03, grad_scale: 4.0 2023-10-03 02:35:16,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:35:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 02:35:22,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:35:22,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:22,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 02:35:26,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:27,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:35:29,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:35:30,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 02:35:30,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:35:31,712 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=6.57 vs. limit=12.0 2023-10-03 02:35:32,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 02:35:32,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:33,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 02:35:33,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 02:35:33,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:35:35,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:35:35,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 02:35:35,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:36,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:35:37,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:41,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:46,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:35:47,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:35:50,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:50,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:35:51,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:51,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:35:51,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:53,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:35:57,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 02:35:58,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:36:01,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 02:36:04,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:36:05,673 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.84 vs. limit=22.5 2023-10-03 02:36:06,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:36:07,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:08,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. 
Duration: 0.98 2023-10-03 02:36:11,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1107706.6666666667, ans=0.0 2023-10-03 02:36:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:14,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 02:36:15,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 02:36:15,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:19,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:36:21,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:36:21,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 02:36:22,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 02:36:22,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1107773.3333333333, ans=0.2 2023-10-03 02:36:23,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 02:36:25,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:36:25,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:36:30,827 INFO [train.py:1046] (1/4) Epoch 32, batch 1500, loss[loss=0.169, simple_loss=0.2584, pruned_loss=0.03983, over 24389.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2404, pruned_loss=0.04145, over 4717715.02 frames. ], batch size: 77, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:36:34,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1107840.0, ans=0.125 2023-10-03 02:36:35,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 02:36:35,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:36:35,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:36:37,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:37,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.69 vs. limit=10.0 2023-10-03 02:36:38,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:36:38,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1107840.0, ans=0.125 2023-10-03 02:36:39,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 02:36:39,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 02:36:41,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:36:42,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:36:42,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:36:42,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:36:45,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:36:46,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:36:52,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:36:52,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 02:36:52,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:36:54,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:36:54,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:36:54,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1107906.6666666667, ans=0.0 2023-10-03 02:36:54,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1107906.6666666667, ans=0.125 2023-10-03 02:36:58,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 02:37:02,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 02:37:04,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:37:04,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 02:37:06,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:37:09,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:37:10,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:37:10,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:37:11,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 02:37:11,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 02:37:11,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:37:13,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 02:37:14,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:37:18,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:37:18,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 02:37:18,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1108040.0, ans=0.0 2023-10-03 02:37:19,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1108040.0, ans=0.125 2023-10-03 02:37:21,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-10-03 02:37:25,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:37:25,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:37:29,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 02:37:31,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:37:31,125 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 02:37:32,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:37:33,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:37:33,899 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 02:37:35,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:37:38,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 02:37:40,020 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.890e+02 2.007e+02 2.185e+02 3.461e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 02:37:41,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:43,449 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.95 vs. limit=15.0 2023-10-03 02:37:44,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:37:44,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:45,544 INFO [train.py:1046] (1/4) Epoch 32, batch 1550, loss[loss=0.1644, simple_loss=0.2507, pruned_loss=0.03912, over 23397.00 frames. 
], tot_loss[loss=0.1627, simple_loss=0.2416, pruned_loss=0.04188, over 4719578.08 frames. ], batch size: 93, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:37:45,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:37:45,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:46,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:37:48,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 02:37:48,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 02:37:48,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:37:49,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 02:37:49,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 02:37:53,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:37:54,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:54,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:37:54,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:37:57,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:57,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:59,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1108240.0, ans=0.125 2023-10-03 02:38:00,213 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 02:38:00,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:00,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:38:01,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:38:03,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:38:03,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 02:38:04,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:38:04,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-03 02:38:04,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1108240.0, ans=0.125 2023-10-03 02:38:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 02:38:06,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 02:38:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:07,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:10,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:38:13,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 02:38:13,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 02:38:22,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:25,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:38:25,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:38:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 02:38:25,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1108306.6666666667, ans=0.0 2023-10-03 02:38:27,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 02:38:31,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:38:32,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:34,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:38:35,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1108373.3333333333, ans=0.125 2023-10-03 02:38:38,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:38:38,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:38,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 02:38:38,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:38:40,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:38:41,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:38:42,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 02:38:42,939 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 02:38:44,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:38:48,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 02:38:51,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1108440.0, ans=0.125 2023-10-03 02:38:54,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:38:57,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:58,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 02:38:59,592 INFO [train.py:1046] (1/4) Epoch 32, batch 1600, loss[loss=0.1507, simple_loss=0.2441, pruned_loss=0.02871, over 24327.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2417, pruned_loss=0.0417, over 4728891.16 frames. ], batch size: 74, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:38:59,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:39:01,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:39:01,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:39:01,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:39:01,594 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=22.5 2023-10-03 02:39:02,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:39:04,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:04,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 02:39:04,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1108506.6666666667, ans=0.1 2023-10-03 02:39:04,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1108506.6666666667, ans=0.95 2023-10-03 02:39:05,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 02:39:08,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 02:39:10,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:39:11,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 02:39:11,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:39:13,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:39:16,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1108573.3333333333, ans=0.125 2023-10-03 02:39:17,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:39:19,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 02:39:24,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:39:26,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 02:39:26,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:27,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 02:39:31,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-03 02:39:31,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1108640.0, ans=0.1 2023-10-03 02:39:34,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1108640.0, ans=0.125 2023-10-03 02:39:41,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:39:43,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 02:39:43,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:39:44,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:39:44,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:39:47,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 02:39:50,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 02:39:51,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:39:51,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1108706.6666666667, ans=0.125 2023-10-03 02:39:52,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:53,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:54,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:39:58,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:39:58,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:40:00,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:40:01,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1108773.3333333333, ans=0.125 2023-10-03 02:40:01,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1108773.3333333333, ans=0.125 2023-10-03 02:40:03,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1108773.3333333333, ans=0.0 2023-10-03 02:40:04,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:40:05,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:40:07,257 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.902e+02 2.151e+02 2.646e+02 3.941e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-03 02:40:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 02:40:07,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:40:07,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1108773.3333333333, ans=0.0 2023-10-03 02:40:08,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 02:40:12,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:40:13,664 INFO [train.py:1046] (1/4) Epoch 32, batch 1650, loss[loss=0.1533, simple_loss=0.2336, pruned_loss=0.03648, over 20782.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2419, pruned_loss=0.04197, over 4721496.09 frames. ], batch size: 45, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:40:13,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:40:13,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:40:13,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-10-03 02:40:15,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 02:40:15,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 02:40:15,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 02:40:17,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:40:19,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:40:19,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:40:20,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:40:23,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:40:25,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 02:40:27,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:40:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:40:27,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 02:40:27,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:40:29,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 02:40:30,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 02:40:34,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:40:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:40:39,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-03 02:40:46,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 02:40:47,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:40:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 02:40:53,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:40:55,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:40:55,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 02:40:55,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:40:56,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:40:57,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:00,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:01,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:41:01,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:41:02,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:04,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:41:06,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:41:08,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 02:41:09,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 02:41:09,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 02:41:11,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 02:41:11,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 02:41:11,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:13,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:41:14,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:41:14,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:14,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 02:41:17,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:41:20,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:41:20,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:41:23,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. 
Duration: 0.96 2023-10-03 02:41:26,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:41:26,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:41:28,038 INFO [train.py:1046] (1/4) Epoch 32, batch 1700, loss[loss=0.1446, simple_loss=0.2242, pruned_loss=0.03253, over 24327.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2416, pruned_loss=0.04219, over 4716805.06 frames. ], batch size: 56, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:41:28,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 02:41:28,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:41:28,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:41:28,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:30,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:41:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:41:32,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 02:41:35,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 02:41:43,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:45,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:41:52,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:41:52,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:41:52,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:41:52,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:41:53,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 02:41:55,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:41:55,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:42:00,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:42:01,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. 
Duration: 0.79 2023-10-03 02:42:01,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 02:42:03,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:06,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 02:42:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:42:13,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:13,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1109373.3333333333, ans=0.5 2023-10-03 02:42:14,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:15,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:42:17,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:42:19,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 02:42:19,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:42:21,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:42:21,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 02:42:23,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:42:23,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:42:23,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:23,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:26,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:42:26,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:42:27,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:27,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:42:29,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:34,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:42:35,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. 
Duration: 0.86 2023-10-03 02:42:36,920 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.849e+02 2.096e+02 2.324e+02 3.909e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-03 02:42:37,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:42:39,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 02:42:41,317 INFO [train.py:1046] (1/4) Epoch 32, batch 1750, loss[loss=0.1498, simple_loss=0.2397, pruned_loss=0.02995, over 24504.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2409, pruned_loss=0.04181, over 4721741.41 frames. ], batch size: 66, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:42:41,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1109506.6666666667, ans=0.0 2023-10-03 02:42:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:46,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:47,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:42:47,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 02:42:47,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:42:48,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=15.0 2023-10-03 02:42:49,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1109506.6666666667, ans=0.0 2023-10-03 02:42:50,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:42:50,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:54,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 02:42:58,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:00,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 02:43:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:43:02,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:43:04,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:43:05,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. 
Duration: 0.43 2023-10-03 02:43:08,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:43:08,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 02:43:15,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:43:16,339 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=22.5 2023-10-03 02:43:18,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:43:18,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:43:22,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:22,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:43:24,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:43:25,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1109706.6666666667, ans=0.125 2023-10-03 02:43:27,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:43:29,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:43:29,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:43:30,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 02:43:32,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:43:35,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 02:43:35,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:43:36,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:38,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:43:41,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:43:41,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1109773.3333333333, ans=0.125 2023-10-03 02:43:42,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:43:42,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:43:42,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1109773.3333333333, ans=0.125 2023-10-03 02:43:44,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:43:44,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1109773.3333333333, ans=0.2 2023-10-03 02:43:47,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:50,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:43:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:43:51,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 02:43:51,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:43:52,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:43:52,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:43:52,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 02:43:52,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:43:54,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:43:54,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.11 vs. limit=15.0 2023-10-03 02:43:56,235 INFO [train.py:1046] (1/4) Epoch 32, batch 1800, loss[loss=0.1553, simple_loss=0.2441, pruned_loss=0.03323, over 24301.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2397, pruned_loss=0.04168, over 4705331.55 frames. ], batch size: 74, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:43:58,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:43:59,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:44:00,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:44:01,440 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.45 vs. limit=15.0 2023-10-03 02:44:03,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:44:07,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 02:44:08,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:44:12,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:15,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:44:18,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:44:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 02:44:19,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:21,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:25,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 02:44:25,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 02:44:25,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. 
Duration: 0.76 2023-10-03 02:44:27,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:28,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:28,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:44:28,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:44:36,615 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 02:44:37,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:44:38,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1109973.3333333333, ans=0.125 2023-10-03 02:44:39,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:40,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 02:44:42,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 02:44:42,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:44:44,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:44:46,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:44:49,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 02:44:51,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1110040.0, ans=0.125 2023-10-03 02:44:54,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.16 vs. limit=15.0 2023-10-03 02:44:55,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:44:55,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 02:44:56,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:44:56,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:56,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:44:58,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 02:45:01,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:45:01,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:45:04,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 02:45:04,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:45:06,185 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.827e+02 1.959e+02 2.120e+02 2.855e+02, threshold=3.918e+02, percent-clipped=0.0 2023-10-03 02:45:07,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:45:07,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:45:07,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:45:09,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:45:09,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:45:10,448 INFO [train.py:1046] (1/4) Epoch 32, batch 1850, loss[loss=0.1784, simple_loss=0.2502, pruned_loss=0.05335, over 22974.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2404, pruned_loss=0.04161, over 4717637.59 frames. ], batch size: 322, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:45:11,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:45:11,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:45:14,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1110173.3333333333, ans=0.1 2023-10-03 02:45:15,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:45:15,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:45:22,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:45:22,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 02:45:24,461 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-03 02:45:26,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 02:45:26,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1110240.0, ans=0.0 2023-10-03 02:45:30,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 02:45:30,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1110240.0, ans=0.125 2023-10-03 02:45:35,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:45:35,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. 
Duration: 0.91 2023-10-03 02:45:35,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 02:45:44,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:45:47,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 02:45:50,173 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.75 vs. limit=22.5 2023-10-03 02:45:50,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:45:51,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:45:55,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 02:45:55,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:45:56,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:45:56,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:45:58,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:46:00,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:46:05,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:46:05,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:06,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:46:06,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:07,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1110373.3333333333, ans=0.125 2023-10-03 02:46:08,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:46:10,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:46:13,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 02:46:14,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:46:17,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:46:17,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:46:17,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. 
Duration: 0.8 2023-10-03 02:46:17,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 02:46:20,153 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 02:46:20,229 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 02:46:21,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:46:21,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:46:21,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:46:21,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:23,013 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 02:46:24,261 INFO [train.py:1046] (1/4) Epoch 32, batch 1900, loss[loss=0.1815, simple_loss=0.2567, pruned_loss=0.05318, over 23349.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2412, pruned_loss=0.04161, over 4725578.91 frames. ], batch size: 93, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:46:24,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:46:24,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:26,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 02:46:27,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:46:28,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:46:28,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 02:46:30,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:30,416 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 02:46:30,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:46:31,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:36,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:37,342 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.72 vs. limit=22.5 2023-10-03 02:46:37,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.54 vs. limit=15.0 2023-10-03 02:46:40,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:46:40,994 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-10-03 02:46:42,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 02:46:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:46:44,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:46:44,140 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 02:46:44,178 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 02:46:48,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 02:46:49,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:46:53,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.73 vs. limit=10.0 2023-10-03 02:46:53,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 02:46:56,060 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.26 vs. limit=22.5 2023-10-03 02:46:57,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 02:46:58,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.62 vs. 
limit=15.0 2023-10-03 02:47:05,244 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.30 vs. limit=15.0 2023-10-03 02:47:06,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 02:47:08,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 02:47:08,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:08,813 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 02:47:08,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 02:47:08,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 02:47:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 02:47:10,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:47:12,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1110706.6666666667, ans=0.0 2023-10-03 02:47:15,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 02:47:16,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 02:47:20,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:47:20,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 02:47:22,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1110773.3333333333, ans=0.1 2023-10-03 02:47:23,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:47:26,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 02:47:26,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:47:33,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:47:33,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:47:33,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:47:35,106 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.918e+02 2.080e+02 2.358e+02 3.129e+02, threshold=4.160e+02, percent-clipped=0.0 2023-10-03 02:47:35,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:47:36,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:47:36,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 02:47:37,804 INFO [train.py:1046] (1/4) Epoch 32, batch 1950, loss[loss=0.1842, simple_loss=0.2581, pruned_loss=0.05512, over 23412.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2414, pruned_loss=0.04204, over 4718192.76 frames. ], batch size: 285, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:47:37,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:47:39,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:47:39,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:47:43,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:47:43,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:47:43,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:47:46,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:47:48,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 02:47:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:47:50,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:50,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:47:54,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 02:47:54,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:47:54,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:56,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:57,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:47:59,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:47:59,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:00,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:48:03,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 02:48:03,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:48:03,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:48:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:07,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:09,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:48:09,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:09,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:48:09,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 02:48:10,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:48:10,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:48:10,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:14,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:48:17,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:48:21,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:48:23,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:48:23,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:48:24,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.29 vs. limit=15.0 2023-10-03 02:48:25,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 02:48:25,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:48:28,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:48:30,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:48:30,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1111040.0, ans=0.1 2023-10-03 02:48:31,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 02:48:37,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:38,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:42,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:44,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:46,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:48:46,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:46,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1111106.6666666667, ans=0.125 2023-10-03 02:48:46,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1111106.6666666667, ans=0.0 2023-10-03 02:48:47,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 02:48:47,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:48:49,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:48:50,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 02:48:51,919 INFO [train.py:1046] (1/4) Epoch 32, batch 2000, loss[loss=0.1673, simple_loss=0.2468, pruned_loss=0.0439, over 23424.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2422, pruned_loss=0.04223, over 4722995.63 frames. ], batch size: 93, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:48:52,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:48:54,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:48:54,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:48:55,896 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-03 02:48:56,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:48:58,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:48:59,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:59,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1111173.3333333333, ans=0.125 2023-10-03 02:49:02,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. 
Duration: 0.94 2023-10-03 02:49:02,611 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:49:03,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:49:06,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:49:07,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 02:49:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:49:09,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:49:11,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1111240.0, ans=0.125 2023-10-03 02:49:11,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1111240.0, ans=0.95 2023-10-03 02:49:15,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:49:15,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 02:49:15,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:17,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:49:17,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:19,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 02:49:20,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:49:22,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 02:49:22,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:49:24,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:49:24,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:49:24,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:25,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:49:28,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:49:28,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. 
Duration: 0.94 2023-10-03 02:49:28,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1111306.6666666667, ans=0.125 2023-10-03 02:49:31,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 02:49:31,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:49:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:36,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:36,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1111373.3333333333, ans=0.0 2023-10-03 02:49:38,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:49:38,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:49:38,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1111373.3333333333, ans=0.0 2023-10-03 02:49:39,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:49:40,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 02:49:40,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:42,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:49:42,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:44,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:47,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:49:48,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 02:49:52,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:49:54,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:57,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:49:58,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1111440.0, ans=0.0 2023-10-03 02:49:59,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:50:03,070 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 2.023e+02 2.252e+02 2.571e+02 3.525e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-03 02:50:03,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:50:03,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:03,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:50:04,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:50:05,928 INFO [train.py:1046] (1/4) Epoch 32, batch 2050, loss[loss=0.1526, simple_loss=0.2217, pruned_loss=0.04174, over 23944.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2414, pruned_loss=0.04191, over 4721672.64 frames. ], batch size: 195, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:50:06,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:07,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:07,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1111506.6666666667, ans=0.0 2023-10-03 02:50:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:50:09,712 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.57 vs. limit=15.0 2023-10-03 02:50:10,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:15,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:50:17,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:50:18,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:19,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:50:21,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 02:50:21,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:50:24,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:50:25,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:50:32,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 02:50:32,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:35,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 02:50:35,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:37,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 02:50:38,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:50:39,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:50:43,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:50:44,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:50:44,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:50:46,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:50:46,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1111640.0, ans=0.1 2023-10-03 02:50:48,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 02:50:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:50:48,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. limit=10.0 2023-10-03 02:50:49,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:50:52,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:50:54,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:50:56,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:51:00,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:51:04,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1111773.3333333333, ans=0.0 2023-10-03 02:51:05,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:51:06,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 02:51:11,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:51:13,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:51:14,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:51:16,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 02:51:19,358 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 02:51:19,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:20,590 INFO [train.py:1046] (1/4) Epoch 32, batch 2100, loss[loss=0.1645, simple_loss=0.2358, pruned_loss=0.04659, over 23872.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2397, pruned_loss=0.04165, over 4705125.48 frames. ], batch size: 195, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:51:20,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:51:20,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:51:22,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:51:22,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 02:51:22,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. 
Duration: 0.96 2023-10-03 02:51:24,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:51:27,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:51:27,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:51:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:31,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:51:31,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 02:51:31,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:51:32,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 02:51:32,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 02:51:34,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:51:36,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:51:36,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. 
Duration: 0.78 2023-10-03 02:51:36,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:51:42,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 02:51:42,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:51:44,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:51:45,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:51:47,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:51:48,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 02:51:50,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:51:50,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:51:51,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 02:51:53,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:53,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. 
Duration: 0.84 2023-10-03 02:51:53,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 02:51:53,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 02:51:56,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:51:57,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:52:00,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:52:00,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:52:01,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:04,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:04,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 02:52:04,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:04,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:52:06,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 02:52:06,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1112040.0, ans=0.0 2023-10-03 02:52:09,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 02:52:09,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 02:52:14,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:52:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:52:18,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 02:52:24,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:26,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:52:26,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:52:26,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:52:26,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-10-03 02:52:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:52:28,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:29,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:52:29,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:52:29,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:30,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 02:52:32,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 02:52:33,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.908e+02 2.112e+02 2.525e+02 3.507e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 02:52:33,741 INFO [train.py:1046] (1/4) Epoch 32, batch 2150, loss[loss=0.1757, simple_loss=0.2661, pruned_loss=0.04269, over 24333.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2391, pruned_loss=0.04154, over 4716207.43 frames. ], batch size: 77, lr: 3.19e-03, grad_scale: 4.0 2023-10-03 02:52:33,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:52:35,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:52:35,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:52:35,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:52:36,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:52:38,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1112173.3333333333, ans=0.1 2023-10-03 02:52:44,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:52:46,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:52:48,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:49,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:52:49,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:52:49,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:52:52,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 02:52:52,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:52:52,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:52:55,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:52:55,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 02:52:59,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1112240.0, ans=0.0 2023-10-03 02:53:00,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:02,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:53:03,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:03,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:03,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:05,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:53:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:53:05,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:53:06,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:53:06,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 02:53:06,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1112306.6666666667, ans=0.125 2023-10-03 02:53:08,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:53:09,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:11,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:11,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:53:12,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:53:17,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:17,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 02:53:18,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:18,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 02:53:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:53:21,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:21,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:23,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:24,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:53:24,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:25,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-10-03 02:53:25,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:26,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 02:53:27,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. 
Duration: 0.91 2023-10-03 02:53:29,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:53:29,299 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 02:53:29,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:29,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:53:30,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 02:53:30,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:53:30,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 02:53:30,824 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 02:53:30,825 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 02:53:32,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 02:53:33,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:33,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:53:33,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:53:35,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:36,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:53:37,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:37,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:46,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:53:48,051 INFO [train.py:1046] (1/4) Epoch 32, batch 2200, loss[loss=0.1744, simple_loss=0.2467, pruned_loss=0.05099, over 22795.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.239, pruned_loss=0.04136, over 4708955.42 frames. ], batch size: 322, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:53:48,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 02:53:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 02:53:53,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1112506.6666666667, ans=0.125 2023-10-03 02:53:54,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1112506.6666666667, ans=0.125 2023-10-03 02:53:56,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:56,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:53:56,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1112506.6666666667, ans=0.125 2023-10-03 02:53:57,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:59,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:53:59,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1112506.6666666667, ans=0.2 2023-10-03 02:54:02,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:54:02,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:54:02,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. 
Duration: 0.88 2023-10-03 02:54:06,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 02:54:08,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:54:14,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 02:54:17,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:54:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:54:20,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:54:20,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1112640.0, ans=0.1 2023-10-03 02:54:23,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:54:23,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1112640.0, ans=0.125 2023-10-03 02:54:24,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 02:54:27,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:54:28,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 02:54:28,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:54:31,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:54:34,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:54:35,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:54:36,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 02:54:40,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:40,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 02:54:43,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:43,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:54:43,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:45,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 02:54:46,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:54:46,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:47,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:48,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:54:50,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:54:51,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 02:54:51,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1112773.3333333333, ans=0.125 2023-10-03 02:54:52,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-10-03 02:54:54,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:54:55,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:54:57,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1112773.3333333333, ans=0.0 2023-10-03 02:54:57,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1112773.3333333333, ans=0.125 2023-10-03 02:54:58,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:54:58,682 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 02:55:00,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:55:00,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1112840.0, ans=0.125 2023-10-03 02:55:01,288 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.834e+02 1.970e+02 2.169e+02 2.586e+02, threshold=3.939e+02, percent-clipped=0.0 2023-10-03 02:55:01,315 INFO [train.py:1046] (1/4) Epoch 32, batch 2250, loss[loss=0.1547, simple_loss=0.2392, pruned_loss=0.03509, over 24475.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2397, pruned_loss=0.04141, over 4716336.35 frames. ], batch size: 66, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:55:01,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 02:55:03,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:55:03,211 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-10-03 02:55:04,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:04,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 02:55:06,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:07,535 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 02:55:07,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:55:10,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:55:16,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:55:18,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:55:20,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=12.0 2023-10-03 02:55:21,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:22,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:55:22,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 02:55:24,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1112906.6666666667, ans=10.0 2023-10-03 02:55:25,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 02:55:26,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:55:26,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:55:28,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 02:55:29,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:55:29,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:30,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.68 vs. limit=22.5 2023-10-03 02:55:31,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:55:35,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:55:35,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 02:55:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:55:37,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 02:55:38,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:38,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1112973.3333333333, ans=0.2 2023-10-03 02:55:41,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:55:46,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:55:47,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:55:48,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1113040.0, ans=0.125 2023-10-03 02:55:49,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:49,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:55:52,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 02:55:53,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:55:55,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1113040.0, ans=0.0 2023-10-03 02:55:55,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1113040.0, ans=0.125 2023-10-03 02:55:56,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1113040.0, ans=0.125 2023-10-03 02:55:58,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:55:59,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:56:03,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:56:03,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:56:03,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:56:07,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:56:09,156 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.39 vs. 
limit=15.0 2023-10-03 02:56:11,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:56:11,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 02:56:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:11,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:56:14,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 02:56:15,771 INFO [train.py:1046] (1/4) Epoch 32, batch 2300, loss[loss=0.1642, simple_loss=0.2362, pruned_loss=0.04617, over 23699.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2405, pruned_loss=0.04228, over 4711094.15 frames. ], batch size: 179, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:56:19,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:56:19,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:26,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:26,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:56:28,006 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-10-03 02:56:30,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:35,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:56:35,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:56:35,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1113240.0, ans=0.1 2023-10-03 02:56:36,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:56:36,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 02:56:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:56:40,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:56:40,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:56:44,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:56:48,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 02:56:53,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:56:55,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:56:56,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:59,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:57:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:57:03,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:57:04,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:57:04,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:57:04,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 02:57:06,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1113373.3333333333, ans=0.0 2023-10-03 02:57:08,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:57:08,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 02:57:10,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:10,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:57:10,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:57:12,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 02:57:12,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:57:13,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 02:57:13,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:57:13,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:13,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 02:57:18,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1113440.0, ans=0.1 2023-10-03 02:57:20,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1113440.0, ans=0.125 2023-10-03 02:57:21,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 02:57:26,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:57:28,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:57:30,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.927e+02 2.149e+02 2.530e+02 4.352e+02, threshold=4.298e+02, percent-clipped=1.0 2023-10-03 02:57:30,120 INFO [train.py:1046] (1/4) Epoch 32, batch 2350, loss[loss=0.1606, simple_loss=0.2387, pruned_loss=0.04126, over 23602.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2422, pruned_loss=0.04291, over 4704137.34 frames. ], batch size: 149, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:57:30,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:57:30,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:57:31,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:57:31,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:57:31,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:57:31,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-10-03 02:57:31,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1113506.6666666667, ans=0.0 2023-10-03 02:57:39,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:57:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 02:57:44,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 02:57:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:46,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1113573.3333333333, ans=0.125 2023-10-03 02:57:47,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:47,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:47,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:57:47,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:57:49,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 02:57:52,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:57:56,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 02:57:59,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1113640.0, ans=0.125 2023-10-03 02:58:00,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:58:03,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:58:03,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:58:05,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:58:06,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 02:58:06,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:58:06,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1113640.0, ans=0.125 2023-10-03 02:58:09,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:58:09,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:58:11,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 02:58:15,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:58:17,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 02:58:17,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:58:18,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:58:20,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:58:22,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 02:58:24,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:58:25,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1113706.6666666667, ans=0.125 2023-10-03 02:58:25,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1113706.6666666667, ans=0.2 2023-10-03 02:58:26,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 02:58:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:58:30,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-10-03 02:58:33,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 02:58:35,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:58:35,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:58:35,225 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 02:58:35,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 02:58:37,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 02:58:41,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:58:43,818 INFO [train.py:1046] (1/4) Epoch 32, batch 2400, loss[loss=0.1662, simple_loss=0.2365, pruned_loss=0.04792, over 23765.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2419, pruned_loss=0.04311, over 4704859.71 frames. ], batch size: 179, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 02:58:44,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:58:48,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:58:50,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 02:58:50,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 02:58:51,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 02:58:59,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 02:58:59,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:59:00,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 02:59:01,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.01 vs. limit=15.0 2023-10-03 02:59:02,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:59:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:02,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 02:59:07,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 02:59:08,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1113906.6666666667, ans=0.125 2023-10-03 02:59:08,635 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.40 vs. limit=15.0 2023-10-03 02:59:10,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 02:59:13,451 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.25 vs. limit=22.5 2023-10-03 02:59:14,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:59:18,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 02:59:21,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:59:22,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.69 vs. limit=22.5 2023-10-03 02:59:24,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:28,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:59:28,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. 
Duration: 0.73 2023-10-03 02:59:28,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:59:36,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:39,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:59:40,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:59:42,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:59:42,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:59:42,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:59:42,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:44,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:59:44,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:59:46,652 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.22 vs. 
limit=15.0 2023-10-03 02:59:48,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:59:49,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:59:49,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 02:59:49,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1114106.6666666667, ans=0.125 2023-10-03 02:59:51,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 02:59:53,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:59:53,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:54,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 02:59:55,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 02:59:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 02:59:56,001 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 02:59:56,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. 
Duration: 0.57 2023-10-03 02:59:57,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:59:59,078 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 1.995e+02 2.171e+02 3.013e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-03 02:59:59,105 INFO [train.py:1046] (1/4) Epoch 32, batch 2450, loss[loss=0.1692, simple_loss=0.253, pruned_loss=0.04275, over 23726.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.241, pruned_loss=0.04234, over 4711265.39 frames. ], batch size: 85, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 02:59:59,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:00,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:01,967 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 03:00:02,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:03,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:00:04,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:00:06,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:08,481 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.81 vs. 
limit=10.0 2023-10-03 03:00:09,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:09,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:09,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 03:00:13,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:00:13,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:14,280 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:00:19,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:00:19,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:00:19,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:00:19,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 03:00:25,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:26,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 03:00:26,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:00:30,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:00:30,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:32,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:32,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:35,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 03:00:35,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:00:40,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-10-03 03:00:43,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:44,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:44,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:00:46,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:00:46,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:46,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:00:48,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 03:00:49,970 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:00:51,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:51,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:00:54,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:54,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:01:00,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:01:00,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 03:01:00,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 03:01:02,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:01:02,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 03:01:02,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:03,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:01:06,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:01:07,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:01:09,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:01:12,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 03:01:13,442 INFO [train.py:1046] (1/4) Epoch 32, batch 2500, loss[loss=0.1476, simple_loss=0.2053, pruned_loss=0.04497, over 19488.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2403, pruned_loss=0.04181, over 4713669.41 frames. ], batch size: 388, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:01:13,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:01:18,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 03:01:26,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:01:26,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:01:26,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:01:28,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 03:01:35,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:01:36,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:01:36,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:01:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:01:38,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 03:01:39,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:40,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:01:40,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. 
Duration: 0.83 2023-10-03 03:01:42,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:42,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 03:01:42,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:46,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:01:48,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:01:49,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:01:50,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.96 vs. limit=22.5 2023-10-03 03:01:51,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 03:01:51,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:01:52,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:57,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:02,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:02:04,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:02:09,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1114706.6666666667, ans=0.125 2023-10-03 03:02:10,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:02:12,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 03:02:12,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:02:12,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:02:14,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:02:14,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:02:16,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 03:02:16,201 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 03:02:16,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 03:02:19,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:02:21,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 03:02:21,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 03:02:23,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:02:23,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 03:02:27,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 03:02:29,563 INFO [train.py:1046] (1/4) Epoch 32, batch 2550, loss[loss=0.1629, simple_loss=0.2368, pruned_loss=0.0445, over 23565.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2403, pruned_loss=0.04174, over 4714496.27 frames. ], batch size: 256, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:02:30,925 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.835e+02 1.976e+02 2.166e+02 3.435e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-03 03:02:31,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:02:31,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:02:32,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:02:35,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:02:35,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 03:02:36,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:02:39,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 03:02:40,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:02:43,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:46,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:02:46,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 03:02:46,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:02:47,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:02:47,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:02:50,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:02:50,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. 
Duration: 0.94 2023-10-03 03:02:50,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:02:50,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:50,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 03:03:03,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:03:07,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:07,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:07,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:03:09,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:03:15,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:03:19,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:03:19,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:03:19,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 03:03:21,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 03:03:21,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:03:25,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:25,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:30,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:03:30,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 03:03:30,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:03:32,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:32,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:03:34,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:03:36,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:03:41,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:03:41,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:03:42,497 INFO [train.py:1046] (1/4) Epoch 32, batch 2600, loss[loss=0.168, simple_loss=0.2489, pruned_loss=0.04354, over 23359.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.241, pruned_loss=0.04206, over 4724922.76 frames. ], batch size: 105, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:03:43,927 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 03:03:45,411 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 03:03:45,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:03:45,465 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 03:03:46,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 03:03:48,060 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 03:03:50,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:51,291 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 03:03:51,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 03:03:53,258 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. 
Duration: 0.896 2023-10-03 03:03:53,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1115173.3333333333, ans=0.0 2023-10-03 03:03:54,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:03:56,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 03:03:57,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 03:03:57,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:03:59,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 03:04:02,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 03:04:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 03:04:04,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1115240.0, ans=0.0 2023-10-03 03:04:08,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:08,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:08,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:04:08,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 03:04:11,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:04:15,314 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 03:04:24,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:24,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:24,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 03:04:26,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:04:26,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:04:27,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 03:04:27,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1115373.3333333333, ans=0.1 2023-10-03 03:04:29,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:04:29,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:04:31,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:04:31,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.57 vs. limit=15.0 2023-10-03 03:04:34,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1115373.3333333333, ans=0.125 2023-10-03 03:04:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 03:04:35,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:04:35,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:04:41,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:04:42,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:04:42,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 03:04:44,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:46,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:04:46,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 03:04:47,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1115440.0, ans=0.125 2023-10-03 03:04:52,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 03:04:54,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:55,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:04:57,809 INFO [train.py:1046] (1/4) Epoch 32, batch 2650, loss[loss=0.163, simple_loss=0.237, pruned_loss=0.04448, over 23192.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2417, pruned_loss=0.04236, over 4714351.84 frames. ], batch size: 105, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:04:59,101 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.872e+02 2.008e+02 2.203e+02 2.987e+02, threshold=4.015e+02, percent-clipped=0.0 2023-10-03 03:05:01,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 03:05:01,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:01,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:05:01,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.04 vs. limit=15.0 2023-10-03 03:05:02,463 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. 
Duration: 0.768 2023-10-03 03:05:02,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:02,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1115506.6666666667, ans=0.125 2023-10-03 03:05:03,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:07,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:05:07,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:05:09,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:05:11,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 03:05:11,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:05:12,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:05:13,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 03:05:15,309 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 03:05:18,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 03:05:19,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 03:05:19,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:20,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 03:05:24,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:24,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:05:24,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:24,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:28,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 03:05:28,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 03:05:31,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:05:36,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 03:05:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:05:38,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:38,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:05:39,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:40,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:05:41,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:42,042 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:05:43,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:05:44,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:45,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:05:47,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:05:48,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:48,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 03:05:50,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:51,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:05:51,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:05:54,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:56,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:05:56,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:56,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 03:06:00,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:06:02,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:04,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:05,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:05,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 03:06:05,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:07,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:06:07,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 03:06:10,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:06:11,713 INFO [train.py:1046] (1/4) Epoch 32, batch 2700, loss[loss=0.1525, simple_loss=0.2399, pruned_loss=0.03261, over 24495.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2423, pruned_loss=0.04232, over 4724449.22 frames. ], batch size: 66, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:06:11,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 03:06:14,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:06:14,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:14,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:16,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:06:16,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:06:16,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:06:16,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 03:06:16,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 03:06:17,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:06:20,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:06:20,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1115840.0, ans=0.2 2023-10-03 03:06:21,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:06:22,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:23,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1115840.0, ans=10.0 2023-10-03 03:06:26,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:06:27,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 03:06:27,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 03:06:34,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:06:34,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:06:39,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:06:39,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:06:39,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:06:41,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:06:43,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:06:46,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:06:46,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:06:47,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:06:49,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:06:49,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:06:59,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:06:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:07:04,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:07:04,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:08,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:07:09,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:09,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:07:11,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:11,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-10-03 03:07:12,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:07:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:07:15,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1116106.6666666667, ans=0.125 2023-10-03 03:07:16,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:07:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:07:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:07:18,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 03:07:20,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:23,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:07:23,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 03:07:24,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 03:07:24,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:07:25,775 INFO [train.py:1046] (1/4) Epoch 32, batch 2750, loss[loss=0.1725, simple_loss=0.2589, pruned_loss=0.04303, over 24016.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2417, pruned_loss=0.04231, over 4712577.16 frames. ], batch size: 80, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:07:27,190 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.923e+02 2.045e+02 2.291e+02 3.532e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 03:07:27,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:27,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:30,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:07:32,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:35,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:07:35,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:07:36,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:07:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:07:36,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 03:07:36,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:07:36,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:42,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1116240.0, ans=0.0 2023-10-03 03:07:43,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 03:07:43,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:07:44,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:44,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1116240.0, ans=0.025 2023-10-03 03:07:46,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:07:46,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:07:46,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:48,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 03:07:48,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:48,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:49,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1116240.0, ans=0.1 2023-10-03 03:07:52,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:07:52,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:07:52,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:07:53,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:55,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:08:00,006 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:08:04,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:08:04,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1116306.6666666667, ans=0.0 2023-10-03 03:08:07,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 03:08:07,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:11,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:08:11,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:08:11,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:08:18,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:08:18,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:08:18,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 03:08:22,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:23,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 03:08:28,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:08:32,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:08:32,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. 
Duration: 0.52 2023-10-03 03:08:33,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:08:35,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:08:37,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 03:08:37,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:08:40,176 INFO [train.py:1046] (1/4) Epoch 32, batch 2800, loss[loss=0.155, simple_loss=0.2371, pruned_loss=0.03643, over 24607.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2405, pruned_loss=0.04174, over 4712599.45 frames. ], batch size: 60, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:08:40,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 03:08:40,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:08:40,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:08:41,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 03:08:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:08:41,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:08:44,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:08:44,579 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 03:08:44,580 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 03:08:50,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:51,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:08:51,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:08:54,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:08:57,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 03:08:57,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 03:08:59,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 03:09:01,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:01,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 03:09:01,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:03,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:05,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:05,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:09:05,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:09:13,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:09:15,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:09:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:09:19,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:23,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:09:23,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. 
Duration: 0.69 2023-10-03 03:09:24,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:09:24,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:24,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:09:26,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1116706.6666666667, ans=0.0 2023-10-03 03:09:28,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:09:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:30,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1116706.6666666667, ans=0.125 2023-10-03 03:09:33,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:09:36,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:09:36,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:36,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:09:36,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 03:09:38,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:09:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:40,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 03:09:40,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:09:41,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1116773.3333333333, ans=0.2 2023-10-03 03:09:42,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:09:42,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:09:43,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 03:09:44,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:45,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:09:45,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:09:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-10-03 03:09:52,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:52,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:09:54,146 INFO [train.py:1046] (1/4) Epoch 32, batch 2850, loss[loss=0.1554, simple_loss=0.2422, pruned_loss=0.03426, over 24512.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2399, pruned_loss=0.04136, over 4721247.21 frames. ], batch size: 63, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:09:54,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:09:55,420 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.850e+02 1.983e+02 2.213e+02 2.652e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-03 03:09:55,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1116840.0, ans=0.125 2023-10-03 03:09:56,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:09:57,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1116840.0, ans=0.0 2023-10-03 03:10:00,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:10:00,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:00,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:10:02,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:04,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:10:05,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:10:05,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1116840.0, ans=0.1 2023-10-03 03:10:06,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 03:10:10,757 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.63 vs. limit=15.0 2023-10-03 03:10:13,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 03:10:13,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:15,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 03:10:16,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:17,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 03:10:17,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-03 03:10:21,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:21,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1116906.6666666667, ans=0.125 2023-10-03 03:10:21,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1116906.6666666667, ans=0.125 2023-10-03 03:10:31,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:31,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1116973.3333333333, ans=0.125 2023-10-03 03:10:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:10:32,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:10:34,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:10:34,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:10:34,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 03:10:34,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1116973.3333333333, ans=0.125 2023-10-03 03:10:36,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:10:36,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 03:10:39,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:10:39,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:10:41,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:43,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:45,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:45,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:10:46,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1117040.0, ans=0.125 2023-10-03 03:10:46,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1117040.0, ans=0.125 2023-10-03 03:10:47,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:48,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:10:50,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1117040.0, ans=0.09899494936611666 2023-10-03 03:10:51,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:10:51,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:52,218 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:10:53,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:54,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:10:56,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.81 vs. 
limit=15.0 2023-10-03 03:10:58,446 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-10-03 03:11:00,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:11:00,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:11:01,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 03:11:01,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 03:11:03,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:11:03,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:03,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 03:11:05,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:11:05,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:05,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:05,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 03:11:05,201 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 03:11:06,493 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 03:11:06,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:11:06,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1117173.3333333333, ans=0.07 2023-10-03 03:11:07,744 INFO [train.py:1046] (1/4) Epoch 32, batch 2900, loss[loss=0.1572, simple_loss=0.2465, pruned_loss=0.03393, over 24660.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2404, pruned_loss=0.04135, over 4736415.36 frames. ], batch size: 73, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:11:07,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:08,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.24 vs. limit=15.0 2023-10-03 03:11:11,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:11:11,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:11,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:11:13,045 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.94 vs. 
limit=15.0 2023-10-03 03:11:13,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 03:11:18,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:11:18,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 03:11:18,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 03:11:21,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:11:21,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:11:23,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:11:25,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:11:28,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:11:28,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:11:31,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:11:32,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-03 03:11:32,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:11:36,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:37,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 03:11:38,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 03:11:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:41,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 03:11:41,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:11:43,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:11:43,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:11:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:11:48,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:50,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:11:50,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1117306.6666666667, ans=0.125 2023-10-03 03:11:51,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:11:54,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 03:11:54,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 03:11:54,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:11:58,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:12:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 03:12:03,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:12:05,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1117373.3333333333, ans=0.09899494936611666 2023-10-03 03:12:05,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.49 vs. limit=22.5 2023-10-03 03:12:09,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:12:12,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1117440.0, ans=0.125 2023-10-03 03:12:16,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:12:16,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:12:17,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 03:12:18,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1117440.0, ans=0.0 2023-10-03 03:12:22,464 INFO [train.py:1046] (1/4) Epoch 32, batch 2950, loss[loss=0.1413, simple_loss=0.2165, pruned_loss=0.03304, over 24343.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2413, pruned_loss=0.04194, over 4725309.11 frames. ], batch size: 56, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:12:22,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:22,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 03:12:22,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1117506.6666666667, ans=0.0 2023-10-03 03:12:23,826 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.841e+02 2.014e+02 2.273e+02 4.138e+02, threshold=4.027e+02, percent-clipped=1.0 2023-10-03 03:12:23,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 03:12:23,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:12:28,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:12:31,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 03:12:31,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:12:32,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:34,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:12:34,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:12:37,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 03:12:37,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 03:12:38,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:12:38,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:12:44,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 03:12:46,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:12:47,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:12:49,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:12:52,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:12:52,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:12:53,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:55,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:55,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:12:56,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 03:13:02,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 03:13:02,382 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 03:13:02,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 03:13:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 03:13:04,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1117640.0, ans=0.0 2023-10-03 03:13:05,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 03:13:05,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:13:07,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:13:07,304 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 03:13:07,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:13:10,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 03:13:11,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:13:11,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:13:13,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:13:14,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 03:13:14,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:15,784 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 03:13:16,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:13:17,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 03:13:19,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-10-03 03:13:20,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:22,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:13:22,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 03:13:22,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:13:24,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 03:13:24,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1117773.3333333333, ans=0.2 2023-10-03 03:13:26,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:13:27,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1117773.3333333333, ans=0.0 2023-10-03 03:13:28,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:13:28,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:13:29,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:29,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:13:32,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:13:32,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:32,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:13:34,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:13:34,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:13:36,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 03:13:37,486 INFO [train.py:1046] (1/4) Epoch 32, batch 3000, loss[loss=0.1897, simple_loss=0.2574, pruned_loss=0.06106, over 23318.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2417, pruned_loss=0.04234, over 4716614.57 frames. ], batch size: 285, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:13:37,486 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 03:13:48,262 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.1960, 4.7312, 4.3385, 4.7491], device='cuda:1') 2023-10-03 03:13:49,431 INFO [train.py:1078] (1/4) Epoch 32, validation: loss=0.3583, simple_loss=0.2851, pruned_loss=0.2157, over 1125622.00 frames. 2023-10-03 03:13:49,432 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 03:13:49,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:49,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 03:13:50,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:52,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1117840.0, ans=0.07 2023-10-03 03:13:53,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:13:53,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:13:56,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. 
Duration: 0.896 2023-10-03 03:13:56,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 03:13:59,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:14:01,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:14:01,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 03:14:01,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:14:04,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1117906.6666666667, ans=0.0 2023-10-03 03:14:08,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:14:16,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:14:22,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 03:14:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:14:24,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:14:26,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:14:26,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:14:27,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:14:27,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 03:14:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 03:14:32,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:14:32,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:14:34,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:14:35,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:14:35,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:14:38,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:14:38,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:14:38,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:14:39,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.45 vs. limit=22.5 2023-10-03 03:14:41,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:14:42,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 03:14:42,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:14:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:14:44,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:14:48,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:48,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:49,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 03:14:49,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-10-03 03:14:49,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1118106.6666666667, ans=0.125 2023-10-03 03:14:51,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:14:51,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 03:14:52,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:14:53,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 03:14:55,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:14:57,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:14:57,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 03:14:58,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 03:14:58,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:15:00,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:15:01,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:15:01,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:15:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:03,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:15:04,549 INFO [train.py:1046] (1/4) Epoch 32, batch 3050, loss[loss=0.1663, simple_loss=0.2427, pruned_loss=0.04496, over 23265.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2425, pruned_loss=0.04247, over 4713054.00 frames. ], batch size: 105, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:15:06,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 03:15:07,343 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.893e+02 2.072e+02 2.427e+02 3.731e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 03:15:07,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:15:09,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1118173.3333333333, ans=0.2 2023-10-03 03:15:10,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:10,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:15:15,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:15:17,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 03:15:22,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 03:15:23,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 03:15:24,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:27,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:15:31,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:31,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:32,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:35,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:15:35,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:15:35,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:15:35,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:15:35,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:37,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:37,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1118306.6666666667, ans=0.125 2023-10-03 03:15:40,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:42,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:15:42,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 03:15:42,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:42,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:15:44,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1118306.6666666667, ans=0.125 2023-10-03 03:15:47,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:15:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:15:47,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:15:49,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:15:52,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:53,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:15:57,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1118373.3333333333, ans=0.09899494936611666 2023-10-03 03:15:59,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:59,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:15:59,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:16:02,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:16:02,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:16:02,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:16:03,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. 
Duration: 0.67 2023-10-03 03:16:05,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:16:05,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1118440.0, ans=0.04949747468305833 2023-10-03 03:16:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:08,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 03:16:09,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:16:15,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:16:16,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:16:18,172 INFO [train.py:1046] (1/4) Epoch 32, batch 3100, loss[loss=0.1542, simple_loss=0.2362, pruned_loss=0.03606, over 24677.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2422, pruned_loss=0.04235, over 4713272.16 frames. ], batch size: 65, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:16:21,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:16:23,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 03:16:25,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-03 03:16:26,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 03:16:27,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:16:31,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1118506.6666666667, ans=0.95 2023-10-03 03:16:32,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:16:32,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:34,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 03:16:38,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:41,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1118573.3333333333, ans=0.0 2023-10-03 03:16:42,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 03:16:47,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:16:48,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:16:48,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:16:49,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:16:49,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1118640.0, ans=0.125 2023-10-03 03:16:51,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 03:16:54,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:16:54,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 03:16:54,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:16:56,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:57,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 03:16:58,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:17:01,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:17:03,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. 
Duration: 0.89 2023-10-03 03:17:05,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 03:17:05,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:06,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:17:08,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:08,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:17:09,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:17:09,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:17:10,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:17:12,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:17:12,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:12,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-03 03:17:16,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:17:18,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 03:17:19,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:17:20,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 03:17:20,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:21,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:22,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 03:17:27,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1118773.3333333333, ans=0.2 2023-10-03 03:17:31,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 03:17:33,072 INFO [train.py:1046] (1/4) Epoch 32, batch 3150, loss[loss=0.1702, simple_loss=0.2479, pruned_loss=0.04625, over 23406.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2412, pruned_loss=0.04174, over 4718837.12 frames. ], batch size: 120, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:17:35,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:17:35,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:36,439 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.898e+02 2.081e+02 2.464e+02 4.773e+02, threshold=4.162e+02, percent-clipped=1.0 2023-10-03 03:17:36,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:17:36,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:17:37,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 03:17:37,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:37,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:17:40,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 03:17:40,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1118840.0, ans=0.0 2023-10-03 03:17:42,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:45,333 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. 
Duration: 0.831 2023-10-03 03:17:45,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1118840.0, ans=0.125 2023-10-03 03:17:46,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 03:17:47,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1118906.6666666667, ans=0.07 2023-10-03 03:17:48,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:17:48,213 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 03:17:49,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 03:17:52,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 03:17:52,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 03:17:52,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 03:17:52,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:52,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:17:54,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:17:56,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 03:17:58,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:58,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:58,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:17:59,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:18:03,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 03:18:04,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:18:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:18:07,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:18:07,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 03:18:10,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 03:18:11,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 03:18:11,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:18:11,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:18:12,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:18:12,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:18:14,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:18:14,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:18:14,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 03:18:16,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:18:16,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:17,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:18:17,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:18:19,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. 
Duration: 0.77 2023-10-03 03:18:19,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:20,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 03:18:20,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:22,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 03:18:24,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 03:18:25,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:18:25,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:27,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 03:18:28,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 03:18:28,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:18:31,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:18:33,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:18:33,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:18:37,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:18:38,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:39,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.62 vs. limit=15.0 2023-10-03 03:18:40,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 03:18:46,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:18:46,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 03:18:47,546 INFO [train.py:1046] (1/4) Epoch 32, batch 3200, loss[loss=0.1477, simple_loss=0.2292, pruned_loss=0.03308, over 24516.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2399, pruned_loss=0.04159, over 4705904.32 frames. ], batch size: 63, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:18:50,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:51,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:18:51,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. 
Duration: 0.85 2023-10-03 03:18:55,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:58,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1119173.3333333333, ans=0.0 2023-10-03 03:18:59,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:19:00,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1119173.3333333333, ans=0.0 2023-10-03 03:19:03,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:19:07,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1119240.0, ans=0.125 2023-10-03 03:19:11,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:19:21,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 03:19:22,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:19:26,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 03:19:27,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:19:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 03:19:31,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:19:31,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:19:35,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 03:19:37,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 03:19:38,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 03:19:39,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 03:19:42,297 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.48 vs. limit=22.5 2023-10-03 03:19:42,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:19:48,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:19:48,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:19:48,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:19:50,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. 
Duration: 0.896 2023-10-03 03:19:50,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:19:53,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:19:55,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 03:19:55,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 03:19:56,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 03:19:58,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 03:19:59,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:20:02,516 INFO [train.py:1046] (1/4) Epoch 32, batch 3250, loss[loss=0.1798, simple_loss=0.2641, pruned_loss=0.04773, over 24093.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2399, pruned_loss=0.04163, over 4710530.05 frames. ], batch size: 80, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:20:04,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:20:04,424 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 03:20:04,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:20:04,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:05,812 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.990e+02 2.302e+02 2.522e+02 3.377e+02, threshold=4.604e+02, percent-clipped=0.0 2023-10-03 03:20:05,983 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 03:20:08,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1119506.6666666667, ans=0.125 2023-10-03 03:20:10,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:20:14,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:20:14,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1119506.6666666667, ans=0.125 2023-10-03 03:20:18,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:20:18,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 03:20:19,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:20:19,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:20:19,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:20:21,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:20:21,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:20:23,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:24,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:20:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:25,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:25,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:26,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:20:30,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:30,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 03:20:31,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1119640.0, ans=0.1 2023-10-03 03:20:33,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:33,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:35,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:37,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:20:37,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:20:41,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 03:20:42,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:20:42,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:20:43,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:20:44,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 03:20:44,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1119640.0, ans=0.1 2023-10-03 03:20:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:20:52,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1119706.6666666667, ans=0.125 2023-10-03 03:20:58,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:20:58,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:58,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 03:20:58,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:20:58,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:20:58,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:02,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 03:21:02,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 03:21:02,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 03:21:03,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:03,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1119773.3333333333, ans=0.2 2023-10-03 03:21:05,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:21:05,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 03:21:05,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:21:05,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1119773.3333333333, ans=0.125 2023-10-03 03:21:09,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:21:09,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:21:11,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 03:21:11,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:14,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:21:14,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. 
Duration: 0.96 2023-10-03 03:21:15,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:21:15,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 03:21:16,789 INFO [train.py:1046] (1/4) Epoch 32, batch 3300, loss[loss=0.1453, simple_loss=0.2212, pruned_loss=0.03472, over 24313.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2406, pruned_loss=0.04204, over 4704047.53 frames. ], batch size: 56, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:21:18,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 03:21:19,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 03:21:19,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:22,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:21:24,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:21:24,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:25,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:21:25,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 03:21:29,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:29,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.98 vs. limit=15.0 2023-10-03 03:21:32,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:21:35,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 03:21:35,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:21:36,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:38,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:38,502 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 03:21:39,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:21:39,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:21:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:21:41,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:21:42,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 03:21:44,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1119906.6666666667, ans=0.1 2023-10-03 03:21:45,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:45,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:21:48,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:48,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 03:21:49,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 03:21:49,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:50,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:21:54,769 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 03:21:55,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1119973.3333333333, ans=0.0 2023-10-03 03:21:57,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. 
Duration: 0.84 2023-10-03 03:21:57,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:22:00,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 03:22:02,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:22:06,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:22:06,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:22:07,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:08,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:22:08,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:22:08,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:22:10,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:22:11,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:22:13,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 03:22:14,574 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 03:22:15,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 03:22:17,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:22:18,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:22:18,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:20,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:22:20,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:21,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:22:21,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:21,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:22:21,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1120106.6666666667, ans=0.125 2023-10-03 03:22:23,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:22:25,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:22:29,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 03:22:29,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:30,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:32,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:22:33,242 INFO [train.py:1046] (1/4) Epoch 32, batch 3350, loss[loss=0.1537, simple_loss=0.2355, pruned_loss=0.03591, over 24589.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2414, pruned_loss=0.04229, over 4719687.87 frames. ], batch size: 60, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:22:33,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:22:34,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:36,516 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.816e+02 1.964e+02 2.229e+02 3.119e+02, threshold=3.928e+02, percent-clipped=0.0 2023-10-03 03:22:36,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:36,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:22:39,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:22:40,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:42,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1120173.3333333333, ans=0.125 2023-10-03 03:22:43,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:22:46,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:48,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:22:48,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:49,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:22:49,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 03:22:51,272 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 03:22:52,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:54,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-03 03:22:54,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 03:22:54,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1120240.0, ans=0.1 2023-10-03 03:22:54,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.46 vs. limit=22.5 2023-10-03 03:22:55,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:22:56,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:22:56,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:22:58,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 03:22:58,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:58,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:23:01,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:02,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:23:04,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:04,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:23:07,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:10,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:14,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:23:16,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:17,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:17,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:20,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:20,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 03:23:20,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 03:23:20,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 03:23:20,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:23:23,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 03:23:23,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:24,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1120373.3333333333, ans=0.125 2023-10-03 03:23:24,685 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.12 vs. limit=6.0 2023-10-03 03:23:25,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:32,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:32,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 03:23:32,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:23:33,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1120440.0, ans=0.2 2023-10-03 03:23:34,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 03:23:36,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:23:38,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1120440.0, ans=0.125 2023-10-03 03:23:40,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1120440.0, ans=0.125 2023-10-03 03:23:41,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:23:43,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 03:23:44,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:23:44,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:23:47,646 INFO [train.py:1046] (1/4) Epoch 32, batch 3400, loss[loss=0.1725, simple_loss=0.2379, pruned_loss=0.05357, over 23779.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2421, pruned_loss=0.0425, over 4723621.48 frames. ], batch size: 164, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:23:47,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:47,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 03:23:47,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:23:47,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 03:23:49,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:23:50,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:23:51,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:23:52,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:23:52,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 03:23:58,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 03:23:58,883 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 03:23:58,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:02,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:24:02,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:24:03,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:24:04,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:24:08,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:24:10,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 03:24:14,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:24:19,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:19,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:24:19,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1120640.0, ans=0.2 2023-10-03 03:24:20,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:24:27,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:24:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 03:24:35,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:37,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:24:37,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 03:24:37,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:24:39,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:24:39,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:24:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:24:42,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:45,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:24:45,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:24:50,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:24:51,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 03:24:57,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:25:00,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. 
Duration: 0.68 2023-10-03 03:25:01,504 INFO [train.py:1046] (1/4) Epoch 32, batch 3450, loss[loss=0.1373, simple_loss=0.2045, pruned_loss=0.03506, over 22746.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2419, pruned_loss=0.04224, over 4728667.05 frames. ], batch size: 323, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:25:03,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 03:25:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:25:05,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:25:05,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 03:25:05,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:25:06,246 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.880e+02 2.016e+02 2.211e+02 2.960e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-03 03:25:06,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1120840.0, ans=0.125 2023-10-03 03:25:09,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:25:14,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 03:25:14,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1120840.0, ans=0.125 2023-10-03 03:25:16,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:25:17,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:25:17,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:20,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:26,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 03:25:31,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 03:25:31,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:25:32,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:25:32,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:25:34,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1120973.3333333333, ans=0.125 2023-10-03 03:25:34,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1120973.3333333333, ans=0.125 2023-10-03 03:25:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 03:25:39,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:25:43,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:25:43,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:25:45,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:25:46,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:25:48,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 03:25:48,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:25:49,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1121040.0, ans=0.125 2023-10-03 03:25:50,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:52,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:25:54,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 03:25:55,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1121040.0, ans=0.0 2023-10-03 03:25:55,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.41 vs. limit=15.0 2023-10-03 03:25:57,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:26:03,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:26:04,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:04,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1121106.6666666667, ans=0.0 2023-10-03 03:26:07,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:26:09,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.49 vs. limit=22.5 2023-10-03 03:26:11,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:11,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:26:11,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:26:11,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:26:15,037 INFO [train.py:1046] (1/4) Epoch 32, batch 3500, loss[loss=0.1648, simple_loss=0.2572, pruned_loss=0.03617, over 24290.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2406, pruned_loss=0.04198, over 4708772.73 frames. ], batch size: 74, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:26:19,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:21,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:26:21,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 03:26:23,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:26:26,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. 
Duration: 0.588875 2023-10-03 03:26:28,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:28,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 03:26:33,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:26:34,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:26:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:26:36,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:26:37,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:26:37,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:39,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:26:39,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. 
Duration: 0.86 2023-10-03 03:26:39,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1121240.0, ans=0.1 2023-10-03 03:26:39,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.87 vs. limit=15.0 2023-10-03 03:26:40,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:40,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:26:42,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:26:45,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:45,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1121306.6666666667, ans=0.125 2023-10-03 03:26:47,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 03:26:47,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:26:50,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:26:51,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 03:26:52,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:54,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:26:54,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:26:55,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 03:26:55,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 03:26:57,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 03:26:57,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:26:58,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:59,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:26:59,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:27:03,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:27:04,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 03:27:08,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:27:08,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 03:27:08,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 03:27:08,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:12,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:27:12,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:27:15,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:27:16,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 03:27:16,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:27:18,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1121440.0, ans=0.125 2023-10-03 03:27:19,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:27:19,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. 
Duration: 0.96 2023-10-03 03:27:22,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 03:27:23,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:27:25,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:27:25,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:27:25,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:26,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1121440.0, ans=10.0 2023-10-03 03:27:27,953 INFO [train.py:1046] (1/4) Epoch 32, batch 3550, loss[loss=0.146, simple_loss=0.2242, pruned_loss=0.03391, over 24626.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2388, pruned_loss=0.04172, over 4691628.46 frames. ], batch size: 60, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:27:29,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:27:32,573 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.881e+02 2.109e+02 2.515e+02 3.801e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-03 03:27:32,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1121506.6666666667, ans=0.125 2023-10-03 03:27:37,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:27:38,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 03:27:43,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:27:43,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:27:46,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:27:46,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:27:46,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:27:47,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1121573.3333333333, ans=0.2 2023-10-03 03:27:49,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:49,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:27:50,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:50,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:27:52,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 03:27:55,604 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.91 vs. limit=15.0 2023-10-03 03:27:56,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:27:56,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:57,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:27:57,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:59,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:27:59,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 03:27:59,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:00,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:01,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:28:05,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:28:05,837 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.15 vs. limit=15.0 2023-10-03 03:28:06,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:28:06,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:09,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 03:28:09,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1121640.0, ans=0.125 2023-10-03 03:28:11,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:28:13,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 03:28:14,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:28:16,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:28:16,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:28:19,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 03:28:20,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:28:26,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:28:26,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 03:28:27,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:29,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:31,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 03:28:36,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 03:28:38,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:28:38,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:28:41,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:41,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:42,746 INFO [train.py:1046] (1/4) Epoch 32, batch 3600, loss[loss=0.1692, simple_loss=0.2583, pruned_loss=0.04005, over 24404.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2396, pruned_loss=0.04194, over 4692177.72 frames. 
], batch size: 69, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:28:42,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:28:48,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:28:49,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:51,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:28:52,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:28:52,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:52,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 03:28:57,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:28:57,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:29:00,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:29:02,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:29:04,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:29:04,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:29:04,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 03:29:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:29:06,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:29:08,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:29:09,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:09,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1121906.6666666667, ans=0.0 2023-10-03 03:29:12,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:29:12,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:29:14,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 03:29:20,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 03:29:21,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:29:21,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 03:29:26,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:29:30,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1122040.0, ans=0.125 2023-10-03 03:29:32,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:34,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:35,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1122040.0, ans=0.07 2023-10-03 03:29:40,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:29:40,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:29:40,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 03:29:42,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 03:29:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. 
Duration: 0.97 2023-10-03 03:29:42,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1122106.6666666667, ans=0.125 2023-10-03 03:29:45,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:29:45,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:29:45,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1122106.6666666667, ans=0.125 2023-10-03 03:29:48,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 03:29:48,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:29:48,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1122106.6666666667, ans=0.0 2023-10-03 03:29:50,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:29:50,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:29:51,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 03:29:52,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. 
Duration: 0.45 2023-10-03 03:29:53,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1122106.6666666667, ans=0.1 2023-10-03 03:29:55,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:57,115 INFO [train.py:1046] (1/4) Epoch 32, batch 3650, loss[loss=0.1476, simple_loss=0.2269, pruned_loss=0.03416, over 23577.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2398, pruned_loss=0.04142, over 4708205.50 frames. ], batch size: 149, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:29:57,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 03:30:02,614 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.895e+02 2.042e+02 2.308e+02 4.121e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-03 03:30:02,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 03:30:04,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:30:05,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1122173.3333333333, ans=0.1 2023-10-03 03:30:08,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 03:30:09,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 03:30:14,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:30:14,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:30:14,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:30:17,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:30:17,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:30:19,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 03:30:19,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:30:19,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:30:21,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 03:30:22,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:30:22,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:30:22,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:25,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 03:30:28,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 03:30:28,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 03:30:30,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:30:31,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 03:30:32,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:30:32,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:30:35,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.27 vs. limit=15.0 2023-10-03 03:30:37,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:30:39,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:30:41,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:30:43,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 03:30:45,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:30:49,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:30:50,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:30:50,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:30:51,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:30:53,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:53,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:00,752 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 03:31:03,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.97 vs. limit=15.0 2023-10-03 03:31:03,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:31:04,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:04,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 03:31:04,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:06,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:31:06,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1122440.0, ans=0.125 2023-10-03 03:31:07,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:09,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 03:31:09,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:10,292 INFO [train.py:1046] (1/4) Epoch 32, batch 3700, loss[loss=0.1662, simple_loss=0.2437, pruned_loss=0.04433, over 23898.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2405, pruned_loss=0.04154, over 4719713.86 frames. ], batch size: 195, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:31:13,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:31:14,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:31:16,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:31:17,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:31:17,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 03:31:17,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:21,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:31:21,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:31:25,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:31:28,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:31:28,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:31:29,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:31:29,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:29,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:31:32,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:31:32,985 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:31:34,463 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 03:31:42,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:31:42,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:31:44,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:31:44,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 03:31:44,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:31:49,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:49,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 03:31:50,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:51,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:31:55,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:31:56,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:31:57,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:32:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:32:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 03:32:02,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:02,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 03:32:08,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:32:08,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:32:10,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:10,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 03:32:12,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:32:12,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 03:32:12,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1122773.3333333333, ans=0.125 2023-10-03 03:32:13,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.46 vs. limit=6.0 2023-10-03 03:32:13,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:32:13,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:16,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:32:18,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 03:32:19,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 03:32:19,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:32:19,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:21,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:32:23,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 03:32:24,557 INFO [train.py:1046] (1/4) Epoch 32, batch 3750, loss[loss=0.1683, simple_loss=0.2383, pruned_loss=0.04912, over 23818.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2414, pruned_loss=0.04208, over 4711442.30 frames. ], batch size: 179, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:32:25,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:32:27,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:32:27,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:32:30,020 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.897e+02 2.092e+02 2.385e+02 3.379e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 03:32:30,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 03:32:31,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 03:32:33,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:32:34,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 03:32:34,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:32:35,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:32:37,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:39,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:32:41,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:44,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:32:46,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:32:47,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:48,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1122906.6666666667, ans=0.125 2023-10-03 03:32:49,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:32:51,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 03:32:52,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:32:54,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:32:54,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:58,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 03:32:59,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1122973.3333333333, ans=0.125 2023-10-03 03:33:01,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 03:33:01,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1122973.3333333333, ans=0.1 2023-10-03 03:33:02,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:33:02,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:33:03,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1122973.3333333333, ans=0.0 2023-10-03 03:33:05,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:09,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1123040.0, ans=0.1 2023-10-03 03:33:10,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 03:33:11,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:33:14,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 03:33:16,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:20,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:33:21,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:33:25,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:33:26,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1123106.6666666667, ans=0.125 2023-10-03 03:33:29,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:33:31,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:33:33,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:33:33,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:33:35,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 03:33:39,277 INFO [train.py:1046] (1/4) Epoch 32, batch 3800, loss[loss=0.1716, simple_loss=0.261, pruned_loss=0.04115, over 24658.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2423, pruned_loss=0.04235, over 4707566.61 frames. ], batch size: 73, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:33:42,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:33:46,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:47,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:33:47,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 03:33:49,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:51,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:33:51,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:33:52,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1123240.0, ans=0.125 2023-10-03 03:33:53,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=15.0 2023-10-03 03:33:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-03 03:33:54,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:55,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:33:57,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:58,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:33:58,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:33:59,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 03:34:01,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 03:34:02,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:34:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:34:08,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:34:08,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:34:10,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 03:34:11,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:34:12,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:12,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1123306.6666666667, ans=0.125 2023-10-03 03:34:14,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:34:18,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:34:19,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 03:34:22,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:34:26,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:34:32,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:34:34,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. 
Duration: 0.75 2023-10-03 03:34:34,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1123373.3333333333, ans=0.125 2023-10-03 03:34:36,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 03:34:36,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:34:38,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:34:38,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:41,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 03:34:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 03:34:44,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 03:34:44,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:45,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:34:47,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1123440.0, ans=0.0 2023-10-03 03:34:51,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 03:34:52,996 INFO [train.py:1046] (1/4) Epoch 32, batch 3850, loss[loss=0.162, simple_loss=0.2385, pruned_loss=0.04272, over 24460.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.241, pruned_loss=0.0416, over 4711564.18 frames. ], batch size: 58, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:34:53,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:34:57,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:34:58,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 03:34:59,363 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.881e+02 2.039e+02 2.318e+02 3.209e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 03:34:59,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:35:00,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:35:02,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1123506.6666666667, ans=0.0 2023-10-03 03:35:02,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1123506.6666666667, ans=0.1 2023-10-03 03:35:05,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:35:08,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:35:11,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:35:12,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 03:35:16,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:18,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:35:21,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:35:21,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:35:25,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:26,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:35:26,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1123640.0, ans=0.125 2023-10-03 03:35:27,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:35:27,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:35:29,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:35:30,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:35:31,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1123640.0, ans=0.125 2023-10-03 03:35:32,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:32,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:35:33,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 03:35:33,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 03:35:34,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:35:34,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:36,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:37,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:37,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 03:35:39,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. 
Duration: 0.67 2023-10-03 03:35:41,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:44,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 03:35:44,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:35:49,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:50,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:55,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:55,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 03:35:58,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 03:36:00,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.60 vs. limit=6.0 2023-10-03 03:36:01,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:01,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.16 vs. 
limit=6.0 2023-10-03 03:36:02,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:02,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1123773.3333333333, ans=0.0 2023-10-03 03:36:03,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:36:03,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:36:04,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:05,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:05,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:36:05,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 03:36:05,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:36:06,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 03:36:08,143 INFO [train.py:1046] (1/4) Epoch 32, batch 3900, loss[loss=0.1608, simple_loss=0.2492, pruned_loss=0.03621, over 24437.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2406, pruned_loss=0.04149, over 4713276.61 frames. 
], batch size: 69, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:36:08,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:08,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:08,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.82 vs. limit=15.0 2023-10-03 03:36:09,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:36:09,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:09,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:36:11,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:11,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:36:12,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:36:12,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 03:36:12,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:36:16,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:36:17,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:36:18,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:36:19,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:36:20,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.41 vs. limit=15.0 2023-10-03 03:36:23,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:36:23,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:24,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:36:25,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 03:36:25,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:36:27,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-10-03 03:36:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:29,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 03:36:30,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 03:36:32,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-10-03 03:36:33,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:36:34,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:36:34,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:36:34,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:36:37,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:36:39,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:36:42,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:36:42,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 03:36:43,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:36:49,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:36:49,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:36:56,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:36:57,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:37:00,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1124040.0, ans=0.0 2023-10-03 03:37:07,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:37:10,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:37:10,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 03:37:10,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 03:37:10,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:37:12,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. 
Duration: 0.84 2023-10-03 03:37:13,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:37:14,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 03:37:20,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:37:21,746 INFO [train.py:1046] (1/4) Epoch 32, batch 3950, loss[loss=0.1565, simple_loss=0.2285, pruned_loss=0.04227, over 23484.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.24, pruned_loss=0.0416, over 4719488.53 frames. ], batch size: 256, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:37:22,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 03:37:22,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:37:25,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:37:26,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:37:28,014 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.850e+02 2.026e+02 2.281e+02 3.100e+02, threshold=4.052e+02, percent-clipped=0.0 2023-10-03 03:37:30,895 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 03:37:30,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 03:37:32,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 03:37:32,356 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 03:37:32,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:37:36,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:37:36,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:37:36,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:37:37,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 03:37:40,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:37:41,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:37:41,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:37:42,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:37:43,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 03:37:44,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.28 vs. limit=15.0 2023-10-03 03:37:56,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:37:56,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:38:01,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 03:38:07,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 03:38:07,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 03:38:07,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:38:08,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:38:13,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:38:15,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:38:15,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:38:15,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 03:38:15,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 03:38:19,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:38:21,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:38:26,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 03:38:34,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:34,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1124506.6666666667, ans=0.1 2023-10-03 03:38:35,763 INFO [train.py:1046] (1/4) Epoch 32, batch 4000, loss[loss=0.1645, simple_loss=0.2543, pruned_loss=0.03737, over 24547.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2406, pruned_loss=0.042, over 4717434.30 frames. ], batch size: 71, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:38:36,145 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:38:38,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:44,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:38:44,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:38:44,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:45,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 03:38:46,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:38:47,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 03:38:47,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:38:47,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 03:38:50,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:38:55,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:38:55,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:38:55,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:38:55,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:38:55,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. 
Duration: 0.636375 2023-10-03 03:38:55,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:38:56,947 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 03:38:58,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:38:58,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:02,486 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 03:39:02,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:39:02,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:39:07,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 03:39:08,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:39:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:39:12,360 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 03:39:14,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 03:39:14,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1124640.0, ans=0.125 2023-10-03 03:39:15,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 03:39:15,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:39:16,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:18,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:39:19,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:39:20,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:39:21,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:39:22,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 03:39:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:24,587 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 03:39:29,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 03:39:32,725 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=15.0 2023-10-03 03:39:33,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 03:39:34,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:39:34,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:39:36,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:39:37,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:39:41,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:39:44,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:39:44,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.29 vs. limit=15.0 2023-10-03 03:39:45,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 03:39:46,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 03:39:47,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:39:47,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:39:49,023 INFO [train.py:1046] (1/4) Epoch 32, batch 4050, loss[loss=0.1625, simple_loss=0.2523, pruned_loss=0.03634, over 24650.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2415, pruned_loss=0.042, over 4722695.82 frames. ], batch size: 73, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:39:49,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:39:49,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1124840.0, ans=0.0 2023-10-03 03:39:51,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:39:53,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:39:55,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.778e+02 1.970e+02 2.195e+02 3.325e+02, threshold=3.940e+02, percent-clipped=0.0 2023-10-03 03:39:57,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:39:58,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:40:01,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 03:40:01,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:40:05,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:40:06,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:40:08,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 03:40:11,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 03:40:11,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 03:40:13,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:40:21,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 03:40:23,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:40:24,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1124973.3333333333, ans=0.125 2023-10-03 03:40:25,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.49 vs. limit=15.0 2023-10-03 03:40:25,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 03:40:29,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:40:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:40:31,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:40:35,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:40:38,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 03:40:38,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:40:39,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:40:40,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 03:40:43,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:40:47,407 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-10-03 03:40:49,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 03:40:52,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:40:52,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:40:53,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 03:40:53,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 03:40:53,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:40:57,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:40:57,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:40:58,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:41:00,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1125106.6666666667, ans=0.125 2023-10-03 03:41:03,127 INFO [train.py:1046] (1/4) Epoch 32, batch 4100, loss[loss=0.1463, simple_loss=0.2355, pruned_loss=0.02856, over 24484.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2424, pruned_loss=0.04239, over 4728572.83 frames. ], batch size: 69, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:41:06,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 03:41:07,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. 
Duration: 0.9 2023-10-03 03:41:09,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 03:41:10,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 03:41:10,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:11,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:11,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:11,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:41:14,636 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 03:41:17,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:41:17,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:41:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:18,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 03:41:23,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125240.0, ans=0.1 2023-10-03 03:41:24,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:41:24,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:41:24,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:41:24,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 03:41:27,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:27,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:41:27,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:41:28,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:41:28,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 03:41:31,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:41:31,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. 
Duration: 0.56 2023-10-03 03:41:33,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:41:36,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:41:36,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 03:41:37,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:41:37,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:41:37,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:41:40,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 03:41:40,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:41:42,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:41:43,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 03:41:45,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:45,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 03:41:47,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:41:50,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:41:52,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:41:54,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:42:03,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:03,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:42:08,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.04 vs. limit=22.5 2023-10-03 03:42:09,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:42:10,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:42:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:42:16,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 03:42:16,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1125506.6666666667, ans=0.0 2023-10-03 03:42:17,482 INFO [train.py:1046] (1/4) Epoch 32, batch 4150, loss[loss=0.1632, simple_loss=0.2462, pruned_loss=0.04016, over 24671.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2419, pruned_loss=0.04191, over 4736827.34 frames. ], batch size: 65, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:42:17,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:42:17,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:42:20,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 03:42:20,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:42:20,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 03:42:21,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 03:42:21,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 03:42:23,089 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.897e+02 2.094e+02 2.297e+02 3.189e+02, threshold=4.189e+02, percent-clipped=0.0 2023-10-03 03:42:23,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:42:26,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1125506.6666666667, ans=0.2 2023-10-03 03:42:27,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:42:27,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:31,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:42:33,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:42:34,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:42:36,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:42:36,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:42:37,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:42:40,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:43,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 03:42:44,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 03:42:47,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 03:42:47,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:42:49,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 03:42:49,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:42:49,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:42:50,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:42:51,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:42:55,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 03:42:58,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:43:00,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:01,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. 
Duration: 0.76 2023-10-03 03:43:01,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:43:03,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1125706.6666666667, ans=0.125 2023-10-03 03:43:04,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 03:43:06,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:43:06,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.43 vs. limit=15.0 2023-10-03 03:43:08,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:43:10,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:11,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 03:43:11,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:11,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:43:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:43:14,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. 
Duration: 0.98 2023-10-03 03:43:14,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:14,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:43:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:43:16,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125773.3333333333, ans=0.1 2023-10-03 03:43:17,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 03:43:17,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:43:17,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:43:17,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:43:18,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:18,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 03:43:19,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:43:24,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 03:43:27,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 03:43:30,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:43:31,725 INFO [train.py:1046] (1/4) Epoch 32, batch 4200, loss[loss=0.1442, simple_loss=0.2201, pruned_loss=0.03417, over 24601.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2402, pruned_loss=0.04161, over 4728528.45 frames. ], batch size: 60, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:43:31,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:43:33,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:43:35,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:43:35,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:43:37,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 03:43:39,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 03:43:41,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:44,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 03:43:46,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:43:47,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.22 vs. limit=6.0 2023-10-03 03:43:49,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:43:51,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:43:51,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:52,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 03:43:52,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:53,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:55,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:43:55,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:43:56,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 03:43:59,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 03:43:59,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:44:00,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1125973.3333333333, ans=0.0 2023-10-03 03:44:04,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:44:06,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:44:06,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125973.3333333333, ans=0.1 2023-10-03 03:44:08,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:44:09,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:44:11,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.82 vs. limit=12.0 2023-10-03 03:44:12,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:44:12,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-10-03 03:44:12,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:44:14,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:44:19,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:44:20,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:44:25,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:44:27,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 03:44:29,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:44:31,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1126106.6666666667, ans=0.0 2023-10-03 03:44:35,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:44:37,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:39,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 03:44:41,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 03:44:45,808 INFO [train.py:1046] (1/4) Epoch 32, batch 4250, loss[loss=0.1582, simple_loss=0.2263, pruned_loss=0.04506, over 23718.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2394, pruned_loss=0.04142, over 4726037.33 frames. ], batch size: 232, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:44:47,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:44:47,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:44:50,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:51,398 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.847e+02 2.009e+02 2.181e+02 2.689e+02, threshold=4.019e+02, percent-clipped=0.0 2023-10-03 03:44:52,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:44:54,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 03:44:54,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:44:56,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:00,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:45:02,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1126240.0, ans=0.125 2023-10-03 03:45:04,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:06,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:09,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:45:09,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:45:10,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:12,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:13,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:15,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:45:16,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:17,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 03:45:21,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. 
Duration: 0.85 2023-10-03 03:45:21,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:23,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:45:23,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:24,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:45:24,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:24,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:28,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.61 vs. limit=10.0 2023-10-03 03:45:28,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 03:45:30,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:45:30,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1126373.3333333333, ans=0.125 2023-10-03 03:45:35,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:45:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:36,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1126373.3333333333, ans=0.1 2023-10-03 03:45:37,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 03:45:37,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:45:37,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 03:45:39,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:45:42,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:45:43,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:43,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:45:45,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 03:45:46,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:45:47,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 03:45:50,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1126440.0, ans=0.125 2023-10-03 03:45:50,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-03 03:45:52,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:55,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:55,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:45:57,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:45:58,285 INFO [train.py:1046] (1/4) Epoch 32, batch 4300, loss[loss=0.146, simple_loss=0.2219, pruned_loss=0.03506, over 24346.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2392, pruned_loss=0.04113, over 4737435.74 frames. ], batch size: 56, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:45:58,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:45:59,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:46:01,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:46:01,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-10-03 03:46:01,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1126506.6666666667, ans=0.125 2023-10-03 03:46:04,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:46:08,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:46:08,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:46:13,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:46:19,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:46:19,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 03:46:20,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:46:22,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:46:22,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:46:22,409 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 03:46:25,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 03:46:26,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:46:29,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 03:46:29,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:46:29,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 03:46:32,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:46:34,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:46:37,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:46:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:46:39,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:46:39,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:46:39,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1126640.0, ans=0.125 2023-10-03 03:46:41,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:46:42,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 03:46:42,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 03:46:45,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:46:47,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:46:47,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:46:48,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:46:48,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:46:48,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 03:46:48,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 03:46:49,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 03:46:51,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:46:51,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. 
Duration: 0.75 2023-10-03 03:46:51,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 03:46:54,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1126706.6666666667, ans=0.125 2023-10-03 03:46:55,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:46:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 03:46:56,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:46:57,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1126773.3333333333, ans=0.2 2023-10-03 03:46:59,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:46:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:47:02,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 03:47:03,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:47:03,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:03,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:47:03,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:47:05,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:47:06,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:47:10,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:12,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:12,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:47:13,585 INFO [train.py:1046] (1/4) Epoch 32, batch 4350, loss[loss=0.147, simple_loss=0.2282, pruned_loss=0.0329, over 24591.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2398, pruned_loss=0.04133, over 4742369.06 frames. ], batch size: 60, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:47:16,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 03:47:16,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:47:19,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.807e+02 2.015e+02 2.247e+02 3.972e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-03 03:47:21,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:47:21,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1126840.0, ans=0.0 2023-10-03 03:47:22,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:25,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:47:25,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:47:27,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1126906.6666666667, ans=0.2 2023-10-03 03:47:29,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:47:32,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:36,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:47:36,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:47:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:47:42,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:47:44,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 03:47:49,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 03:47:49,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:47:49,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:50,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1126973.3333333333, ans=0.0 2023-10-03 03:47:53,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:55,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 03:47:57,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:47:59,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:48:01,862 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 03:48:03,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:03,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1127040.0, ans=0.125 2023-10-03 03:48:04,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 03:48:06,405 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 03:48:07,741 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 03:48:07,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:48:07,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:09,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:48:09,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:11,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:48:11,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:48:14,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 03:48:14,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:14,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:48:14,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:48:16,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 03:48:16,395 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 03:48:16,398 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 03:48:16,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1127106.6666666667, ans=0.125 2023-10-03 03:48:17,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 03:48:20,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:48:20,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:48:20,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:21,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:48:24,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 03:48:27,282 INFO [train.py:1046] (1/4) Epoch 32, batch 4400, loss[loss=0.1628, simple_loss=0.2379, pruned_loss=0.04388, over 22741.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2412, pruned_loss=0.04193, over 4727548.37 frames. ], batch size: 322, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 03:48:27,347 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. 
Duration: 0.768 2023-10-03 03:48:27,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:30,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:48:30,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:31,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:48:33,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 03:48:33,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 03:48:33,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 03:48:33,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 03:48:34,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:48:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:48:37,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 03:48:39,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 03:48:39,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:39,238 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 03:48:42,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:42,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 03:48:44,265 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 03:48:47,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 03:48:48,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 03:48:48,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 03:48:49,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:50,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:51,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:52,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:48:54,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 03:48:54,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 03:48:55,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:57,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:48:57,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:58,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:59,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 03:49:00,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 03:49:01,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1127306.6666666667, ans=0.125 2023-10-03 03:49:04,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:10,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:49:13,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 03:49:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:49:20,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:49:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:49:21,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 03:49:23,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:49:23,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:49:23,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:49:23,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:49:27,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 03:49:29,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 03:49:29,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. 
Duration: 0.88 2023-10-03 03:49:31,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:49:31,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 03:49:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:49:33,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.84 vs. limit=15.0 2023-10-03 03:49:35,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:49:38,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 03:49:41,190 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.60 vs. limit=5.0 2023-10-03 03:49:41,426 INFO [train.py:1046] (1/4) Epoch 32, batch 4450, loss[loss=0.1722, simple_loss=0.2584, pruned_loss=0.04304, over 24660.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2417, pruned_loss=0.04218, over 4717308.94 frames. ], batch size: 68, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 03:49:41,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:49:45,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:45,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 03:49:47,470 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.837e+02 2.024e+02 2.337e+02 3.195e+02, threshold=4.048e+02, percent-clipped=0.0 2023-10-03 03:49:52,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:49:52,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1127506.6666666667, ans=0.0 2023-10-03 03:49:53,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:49:57,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=12.0 2023-10-03 03:49:57,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:59,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:50:01,659 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.51 vs. limit=10.0 2023-10-03 03:50:02,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:50:02,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:50:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-10-03 03:50:02,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:50:03,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:03,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:50:03,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:50:06,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:50:09,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:10,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:12,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:50:12,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:50:12,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1127640.0, ans=0.0 2023-10-03 03:50:14,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:50:16,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1127640.0, ans=0.09899494936611666 2023-10-03 03:50:19,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:50:19,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 03:50:19,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 03:50:19,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:50:19,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1127640.0, ans=0.125 2023-10-03 03:50:22,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:50:23,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 03:50:26,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:50:31,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:31,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 03:50:31,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:50:31,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:50:31,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:50:31,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:50:33,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:50:37,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 03:50:38,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:50:38,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:50:40,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:50:42,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:42,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:50:45,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 03:50:48,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 03:50:49,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:50:54,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:50:55,311 INFO [train.py:1046] (1/4) Epoch 32, batch 4500, loss[loss=0.1681, simple_loss=0.2371, pruned_loss=0.04952, over 23779.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.04248, over 4715435.81 frames. ], batch size: 164, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:50:56,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 03:50:56,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 03:50:58,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:50:59,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1127840.0, ans=0.125 2023-10-03 03:51:03,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:51:04,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:51:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 03:51:06,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:51:06,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:06,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:09,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1127906.6666666667, ans=0.0 2023-10-03 03:51:15,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:51:17,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:51:20,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:51:20,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:51:22,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:51:29,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:51:33,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 03:51:36,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:51:38,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:51:38,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 03:51:40,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:41,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:51:43,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:51:43,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:51:45,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:46,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 03:51:46,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:51:46,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:51,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 03:51:51,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:51:55,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:56,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:51:56,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:51:58,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 03:51:59,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 03:51:59,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 03:52:02,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 03:52:05,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 03:52:06,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:52:08,139 INFO [train.py:1046] (1/4) Epoch 32, batch 4550, loss[loss=0.1289, simple_loss=0.1889, pruned_loss=0.03447, over 22647.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2405, pruned_loss=0.04193, over 4702933.70 frames. 
], batch size: 322, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:52:11,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:52:12,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:52:13,107 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:52:14,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:52:15,515 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.941e+02 2.112e+02 2.362e+02 4.046e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 03:52:19,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:52:22,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:52:24,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:52:24,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:52:24,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:26,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:52:26,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:52:30,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:52:35,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 03:52:35,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 03:52:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:52:39,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 03:52:40,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 03:52:40,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:52:42,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1128306.6666666667, ans=0.0 2023-10-03 03:52:44,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 03:52:45,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:52:48,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:52:48,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:52:50,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 03:52:52,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:52:55,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:55,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:52:56,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:52:57,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 03:52:57,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 03:52:58,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.59 vs. limit=22.5 2023-10-03 03:52:59,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:52:59,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. 
Duration: 0.94 2023-10-03 03:53:00,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 03:53:00,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:53:02,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:02,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:53:03,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:53:03,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:53:05,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:53:05,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 03:53:06,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:53:06,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 03:53:07,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.50 vs. limit=22.5 2023-10-03 03:53:07,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. 
Duration: 0.96 2023-10-03 03:53:07,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:53:07,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 03:53:11,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:53:11,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:53:14,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:53:15,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:53:16,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:53:16,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:53:18,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1128440.0, ans=0.0 2023-10-03 03:53:19,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:53:21,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:22,890 INFO [train.py:1046] (1/4) Epoch 32, batch 4600, loss[loss=0.167, simple_loss=0.2541, pruned_loss=0.03993, over 24088.00 frames. 
], tot_loss[loss=0.1614, simple_loss=0.2399, pruned_loss=0.04145, over 4715240.11 frames. ], batch size: 80, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:53:22,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:53:25,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:53:27,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:53:27,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-10-03 03:53:28,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:29,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 03:53:29,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1128506.6666666667, ans=0.125 2023-10-03 03:53:31,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:53:35,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:53:36,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:38,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:53:45,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 03:53:45,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:46,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1128573.3333333333, ans=0.125 2023-10-03 03:53:49,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:52,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:53:52,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:54,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1128640.0, ans=0.125 2023-10-03 03:53:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 03:53:58,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:53:59,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:02,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1128640.0, ans=0.1 2023-10-03 03:54:03,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 03:54:03,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:54:03,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1128640.0, ans=0.1 2023-10-03 03:54:05,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1128706.6666666667, ans=0.0 2023-10-03 03:54:06,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:54:06,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1128706.6666666667, ans=0.125 2023-10-03 03:54:07,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 03:54:09,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1128706.6666666667, ans=0.125 2023-10-03 03:54:10,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:54:13,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:16,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:54:18,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:54:19,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 03:54:19,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:19,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 03:54:20,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:20,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:21,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:23,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:54:24,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:24,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 03:54:26,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 03:54:26,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 03:54:26,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:54:28,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:54:29,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:29,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1128773.3333333333, ans=0.1 2023-10-03 03:54:30,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:36,615 INFO [train.py:1046] (1/4) Epoch 32, batch 4650, loss[loss=0.175, simple_loss=0.2558, pruned_loss=0.04707, over 23503.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2403, pruned_loss=0.04145, over 4706525.69 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:54:38,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:54:40,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:41,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:42,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:54:43,537 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.926e+02 2.147e+02 2.476e+02 3.690e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-03 03:54:43,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 03:54:43,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:54:44,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:48,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 03:54:49,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:54:51,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 03:54:53,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:55,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 03:54:55,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:54:55,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 03:54:55,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 03:54:55,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:57,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 03:54:57,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1128906.6666666667, ans=0.0 2023-10-03 03:54:58,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1128906.6666666667, ans=0.125 2023-10-03 03:54:59,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:55:01,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:02,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 03:55:05,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:06,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 03:55:10,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:10,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:55:11,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 03:55:13,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:55:16,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 03:55:16,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1128973.3333333333, ans=0.2 2023-10-03 03:55:20,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:55:25,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:26,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:28,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:28,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:55:33,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 03:55:33,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 03:55:33,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 03:55:33,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 03:55:36,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:55:43,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 03:55:43,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:55:43,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 03:55:43,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:55:44,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:55:44,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:55:44,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1129106.6666666667, ans=0.125 2023-10-03 03:55:45,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:55:46,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.66 vs. limit=6.0 2023-10-03 03:55:47,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:55:47,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:55:47,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1129106.6666666667, ans=0.0 2023-10-03 03:55:49,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:50,651 INFO [train.py:1046] (1/4) Epoch 32, batch 4700, loss[loss=0.1598, simple_loss=0.234, pruned_loss=0.04282, over 23732.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2405, pruned_loss=0.04172, over 4713029.92 frames. ], batch size: 164, lr: 3.17e-03, grad_scale: 8.0 2023-10-03 03:55:52,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:55:52,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:55:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:55:53,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 03:55:53,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1129173.3333333333, ans=0.125 2023-10-03 03:55:55,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:55:55,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 03:56:04,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:56:06,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:56:06,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:07,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:56:09,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:56:11,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.06 vs. limit=22.5 2023-10-03 03:56:13,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 03:56:13,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 03:56:17,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:17,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:56:18,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:56:22,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:56:27,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:56:27,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:56:31,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:56:36,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 03:56:36,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1129373.3333333333, ans=0.125 2023-10-03 03:56:36,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1129373.3333333333, ans=0.2 2023-10-03 03:56:37,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:56:38,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:41,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 03:56:42,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:56:47,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:56:47,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. 
Duration: 0.76 2023-10-03 03:56:49,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:49,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:53,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:56:54,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 03:56:55,680 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 03:56:57,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:58,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:58,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:58,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 03:57:00,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 03:57:00,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1129440.0, ans=0.0 2023-10-03 03:57:03,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 03:57:05,060 INFO [train.py:1046] (1/4) Epoch 32, batch 4750, loss[loss=0.1454, simple_loss=0.2258, pruned_loss=0.03256, over 24461.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2406, pruned_loss=0.0416, over 4711391.66 frames. ], batch size: 63, lr: 3.17e-03, grad_scale: 8.0 2023-10-03 03:57:07,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:57:09,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:14,007 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.934e+02 2.171e+02 2.465e+02 4.386e+02, threshold=4.342e+02, percent-clipped=1.0 2023-10-03 03:57:14,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:14,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:57:15,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 03:57:15,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:18,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. 
Duration: 0.81 2023-10-03 03:57:19,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:57:20,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:57:20,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:57:24,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 03:57:29,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:57:30,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 03:57:30,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:57:33,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:57:33,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:57:35,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:35,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 03:57:35,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-10-03 03:57:36,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1129640.0, ans=0.2 2023-10-03 03:57:41,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1129640.0, ans=0.1 2023-10-03 03:57:42,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 03:57:43,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:44,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1129640.0, ans=0.0 2023-10-03 03:57:45,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:57:46,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:57:46,863 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 03:57:46,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:57:51,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:57:51,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1129706.6666666667, ans=0.1 2023-10-03 03:57:52,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 03:57:53,643 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=12.0 2023-10-03 03:57:55,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 03:57:55,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 03:57:56,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:57,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:57:58,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:58,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:57:58,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 03:58:00,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1129706.6666666667, ans=0.125 2023-10-03 03:58:01,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 03:58:05,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:09,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 03:58:09,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 03:58:09,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:58:10,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:10,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1129773.3333333333, ans=0.125 2023-10-03 03:58:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:58:13,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:15,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:58:18,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:58:18,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 03:58:18,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 03:58:19,447 INFO [train.py:1046] (1/4) Epoch 32, batch 4800, loss[loss=0.1856, simple_loss=0.2523, pruned_loss=0.05941, over 23804.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2419, pruned_loss=0.04227, over 4711688.06 frames. 
], batch size: 150, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:58:19,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 03:58:24,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:58:24,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:58:24,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 03:58:28,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:29,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:30,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1129840.0, ans=0.125 2023-10-03 03:58:34,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:58:36,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:58:37,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 03:58:37,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1129906.6666666667, ans=0.5 2023-10-03 03:58:39,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 03:58:40,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:58:40,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:58:41,396 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-10-03 03:58:42,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:58:46,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:58:46,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1129906.6666666667, ans=0.125 2023-10-03 03:58:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:47,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:58:49,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:58:49,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 03:58:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:49,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:58:52,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:54,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:56,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:56,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:58:56,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:58:56,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:58,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 03:58:58,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. 
Duration: 0.44 2023-10-03 03:58:58,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten.whitening_limit, batch_count=1129973.3333333333, ans=15.0 2023-10-03 03:58:59,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:01,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:59:01,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:59:01,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:59:01,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:59:01,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1129973.3333333333, ans=0.1 2023-10-03 03:59:02,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:59:04,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:59:04,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1130040.0, ans=0.1 2023-10-03 03:59:07,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 03:59:08,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:10,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:16,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 03:59:16,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:59:17,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:17,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:59:18,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:23,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:59:23,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:59:23,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:23,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1130106.6666666667, ans=0.0 2023-10-03 03:59:24,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 03:59:24,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:59:25,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1130106.6666666667, ans=0.2 2023-10-03 03:59:26,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:59:29,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:29,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:29,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:59:30,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 03:59:33,566 INFO [train.py:1046] (1/4) Epoch 32, batch 4850, loss[loss=0.1566, simple_loss=0.2432, pruned_loss=0.03497, over 24478.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2417, pruned_loss=0.04217, over 4721790.79 frames. ], batch size: 66, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:59:33,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 03:59:33,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:59:33,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 03:59:35,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:59:35,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:38,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:38,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1130173.3333333333, ans=0.0 2023-10-03 03:59:42,445 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.847e+02 2.113e+02 2.344e+02 3.781e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 03:59:44,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 03:59:44,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1130173.3333333333, ans=0.125 2023-10-03 03:59:47,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:47,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1130240.0, ans=0.125 2023-10-03 03:59:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:59:52,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 03:59:52,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:56,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:57,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:59:58,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:59:58,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 04:00:02,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:00:06,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:00:06,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:00:06,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:00:06,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 04:00:09,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:00:09,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:00:12,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:12,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 04:00:13,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 04:00:15,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:00:22,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:00:22,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 04:00:24,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:00:24,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:00:25,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:00:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 04:00:27,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:27,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. 
Duration: 0.82 2023-10-03 04:00:28,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:00:30,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:00:30,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 04:00:39,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:43,066 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.50 vs. limit=6.0 2023-10-03 04:00:43,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:00:43,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:00:48,771 INFO [train.py:1046] (1/4) Epoch 32, batch 4900, loss[loss=0.1667, simple_loss=0.2349, pruned_loss=0.04927, over 23745.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2411, pruned_loss=0.04162, over 4726068.36 frames. ], batch size: 212, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:00:50,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 04:00:50,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:00:54,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 04:00:54,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1130506.6666666667, ans=0.125 2023-10-03 04:00:57,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:00:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:00:59,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 04:01:05,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 04:01:09,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 04:01:10,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 04:01:10,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:01:10,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:01:10,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:01:10,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:01:11,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 04:01:11,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 04:01:11,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1130573.3333333333, ans=0.125 2023-10-03 04:01:13,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 04:01:15,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:01:15,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:01:17,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:01:19,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:01:19,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:01:21,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1130640.0, ans=0.1 2023-10-03 04:01:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:01:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. 
Duration: 0.689 2023-10-03 04:01:22,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1130640.0, ans=0.025 2023-10-03 04:01:24,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:01:25,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:01:25,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 04:01:25,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 04:01:29,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 04:01:32,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:01:34,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:01:34,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:01:35,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:01:36,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:01:36,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 04:01:37,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 04:01:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:01:41,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:01:43,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:01:46,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 04:01:46,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:01:46,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 04:01:46,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 04:01:53,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:01:55,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:01:55,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 04:01:55,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 04:01:55,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:01:57,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:00,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:02:01,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:02:01,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:02:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 04:02:02,563 INFO [train.py:1046] (1/4) Epoch 32, batch 4950, loss[loss=0.1478, simple_loss=0.2183, pruned_loss=0.03867, over 23671.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2393, pruned_loss=0.04128, over 4716210.95 frames. ], batch size: 232, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:02:02,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:02:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:02:07,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:02:08,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-10-03 04:02:08,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 04:02:10,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:02:10,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 04:02:10,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:10,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:02:11,965 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.926e+02 2.133e+02 2.555e+02 3.988e+02, threshold=4.267e+02, percent-clipped=0.0 2023-10-03 04:02:12,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:02:12,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:13,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:13,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:02:16,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:02:18,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 04:02:18,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1130906.6666666667, ans=0.0 2023-10-03 04:02:19,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:19,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:02:22,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:02:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:31,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:02:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:32,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:35,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:02:36,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 04:02:37,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 04:02:39,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:02:40,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=22.5 2023-10-03 04:02:40,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-10-03 04:02:41,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:02:41,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:02:44,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:02:44,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:02:44,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:02:45,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:48,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:02:50,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:02:53,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 04:02:53,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:53,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 04:02:53,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:02:56,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:03:00,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:03:01,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:03:01,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:03:02,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:03:02,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:03:03,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:03:04,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 04:03:05,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1131106.6666666667, ans=0.04949747468305833 2023-10-03 04:03:06,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:03:06,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:03:06,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 04:03:06,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1131106.6666666667, ans=0.125 2023-10-03 04:03:08,803 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.62 vs. limit=22.5 2023-10-03 04:03:09,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1131106.6666666667, ans=0.1 2023-10-03 04:03:10,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:15,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 04:03:15,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:03:17,021 INFO [train.py:1046] (1/4) Epoch 32, batch 5000, loss[loss=0.1548, simple_loss=0.228, pruned_loss=0.04085, over 23258.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.239, pruned_loss=0.0412, over 4729721.30 frames. 
], batch size: 119, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:03:21,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:03:21,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:03:23,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 04:03:24,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 04:03:25,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:03:27,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 04:03:27,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1131173.3333333333, ans=0.125 2023-10-03 04:03:28,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:03:28,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:03:28,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 04:03:30,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 04:03:30,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:03:30,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.30 vs. limit=12.0 2023-10-03 04:03:32,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 04:03:32,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:32,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:03:34,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 04:03:36,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 04:03:36,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:03:36,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 04:03:38,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:03:38,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 04:03:40,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 04:03:40,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 04:03:41,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 04:03:41,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:03:42,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.42 vs. limit=15.0 2023-10-03 04:03:43,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:44,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 04:03:44,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:03:45,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:47,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:47,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 04:03:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. 
Duration: 0.79 2023-10-03 04:03:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:03:48,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:03:53,999 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.92 vs. limit=15.0 2023-10-03 04:03:54,600 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 04:03:58,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:03:58,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:58,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:03:59,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1131306.6666666667, ans=0.0 2023-10-03 04:04:01,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 04:04:01,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:04:01,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:04:01,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:04:02,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1131373.3333333333, ans=0.125 2023-10-03 04:04:03,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 04:04:05,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:04:08,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:04:09,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:12,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1131373.3333333333, ans=0.125 2023-10-03 04:04:14,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 04:04:14,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1131373.3333333333, ans=0.125 2023-10-03 04:04:18,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:25,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:04:27,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:04:27,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:04:27,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:04:28,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:04:28,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:04:28,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1131440.0, ans=0.2 2023-10-03 04:04:29,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:31,093 INFO [train.py:1046] (1/4) Epoch 32, batch 5050, loss[loss=0.1642, simple_loss=0.2568, pruned_loss=0.03578, over 24649.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2392, pruned_loss=0.04109, over 4741327.97 frames. ], batch size: 73, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:04:33,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:34,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 04:04:34,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:04:37,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:04:39,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:04:39,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 04:04:40,632 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.847e+02 2.039e+02 2.357e+02 3.411e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 04:04:40,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:40,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:04:41,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.87 vs. limit=22.5 2023-10-03 04:04:43,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:04:43,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:04:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:04:47,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1131573.3333333333, ans=0.1 2023-10-03 04:04:50,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. 
Duration: 0.91 2023-10-03 04:04:51,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:04:52,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1131573.3333333333, ans=0.035 2023-10-03 04:04:53,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:04:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 04:04:53,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:04:55,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:04:55,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:56,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:04:56,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 04:04:56,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 04:04:57,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:05:01,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 04:05:03,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:05:05,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 04:05:06,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:05:10,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 04:05:13,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:05:13,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:05:13,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:13,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1131640.0, ans=0.0 2023-10-03 04:05:14,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:05:17,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:05:18,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:05:20,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:05:20,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:05:20,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:05:21,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 04:05:21,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:05:23,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:05:26,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:05:26,626 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 04:05:27,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:05:28,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:05:29,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:29,398 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 04:05:32,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 04:05:32,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 04:05:32,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:36,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:37,135 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:05:37,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1131773.3333333333, ans=0.0 2023-10-03 04:05:38,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:38,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 04:05:38,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 04:05:39,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1131773.3333333333, ans=0.0 2023-10-03 04:05:41,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:05:41,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:05:41,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 04:05:44,728 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 04:05:45,976 INFO [train.py:1046] (1/4) Epoch 32, batch 5100, loss[loss=0.1752, simple_loss=0.2624, pruned_loss=0.04397, over 23443.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2404, pruned_loss=0.04156, over 4722894.83 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:05:46,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:48,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 04:05:48,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 04:05:50,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:05:51,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1131840.0, ans=0.125 2023-10-03 04:05:52,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:05:54,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:05:54,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1131840.0, ans=0.09899494936611666 2023-10-03 04:05:54,896 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.89 vs. 
limit=15.0 2023-10-03 04:05:55,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 04:05:55,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 04:05:56,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1131840.0, ans=0.2 2023-10-03 04:05:59,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:59,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:06:01,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-10-03 04:06:02,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:06:04,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1131906.6666666667, ans=0.0 2023-10-03 04:06:05,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 04:06:07,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:06:08,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:06:08,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. 
Duration: 0.588875 2023-10-03 04:06:11,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:11,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:11,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 04:06:14,653 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 04:06:14,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:14,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 04:06:14,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 04:06:20,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:06:20,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1131973.3333333333, ans=0.125 2023-10-03 04:06:27,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:06:30,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 04:06:30,512 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. 
Duration: 0.512 2023-10-03 04:06:30,520 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 04:06:33,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 04:06:33,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:36,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 04:06:41,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 04:06:42,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1132040.0, ans=0.125 2023-10-03 04:06:44,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:06:46,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:06:46,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1132106.6666666667, ans=0.0 2023-10-03 04:06:48,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 04:06:53,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:06:53,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. 
Duration: 0.86 2023-10-03 04:06:58,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:06:58,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:06:58,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:06:58,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1132173.3333333333, ans=0.125 2023-10-03 04:07:00,024 INFO [train.py:1046] (1/4) Epoch 32, batch 5150, loss[loss=0.1615, simple_loss=0.2497, pruned_loss=0.03667, over 24049.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2413, pruned_loss=0.04156, over 4736399.15 frames. ], batch size: 80, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:07:00,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:07:00,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:07:01,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:07:01,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 04:07:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 04:07:02,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. 
Duration: 0.74 2023-10-03 04:07:02,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:07:02,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 04:07:04,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:04,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 04:07:06,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:06,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1132173.3333333333, ans=0.125 2023-10-03 04:07:07,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:09,079 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.942e+02 2.192e+02 2.524e+02 4.905e+02, threshold=4.384e+02, percent-clipped=1.0 2023-10-03 04:07:11,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:07:11,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1132173.3333333333, ans=0.95 2023-10-03 04:07:12,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. 
Duration: 0.8 2023-10-03 04:07:12,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:12,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:07:15,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:07:15,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:07:15,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:07:16,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:07:16,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:07:17,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 04:07:18,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:07:18,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:07:18,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1132240.0, ans=0.0 2023-10-03 04:07:20,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-03 04:07:20,542 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.09 vs. limit=10.0 2023-10-03 04:07:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 04:07:24,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:07:29,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:07:31,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 04:07:35,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:07:41,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:07:43,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:44,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.80 vs. limit=6.0 2023-10-03 04:07:47,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:07:47,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:07:50,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 04:07:52,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:54,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:07:54,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:07:57,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:07:57,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:07:59,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 04:08:04,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:08:05,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:08:08,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:08:08,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:08:10,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 04:08:10,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:08:10,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:08:10,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:08:12,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.54 vs. limit=22.5 2023-10-03 04:08:13,752 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:08:14,901 INFO [train.py:1046] (1/4) Epoch 32, batch 5200, loss[loss=0.1706, simple_loss=0.2553, pruned_loss=0.04299, over 23583.00 frames. ], tot_loss[loss=0.163, simple_loss=0.242, pruned_loss=0.04194, over 4735554.29 frames. ], batch size: 94, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 04:08:15,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:08:16,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:08:19,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:24,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 04:08:25,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 04:08:26,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:28,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:29,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:08:29,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:32,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 04:08:35,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:08:35,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:08:36,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 04:08:39,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:08:40,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:08:41,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 04:08:41,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. 
Duration: 0.91 2023-10-03 04:08:44,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 04:08:45,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:08:45,691 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 04:08:45,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:47,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:08:47,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:08:49,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 04:08:49,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:08:50,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:53,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 04:08:55,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 04:08:55,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. 
Duration: 0.82 2023-10-03 04:08:58,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1132706.6666666667, ans=0.125 2023-10-03 04:09:00,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 04:09:00,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:09:06,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:09:06,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:07,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 04:09:07,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:09:08,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:09:08,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:09,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:09:11,170 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:09:13,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 04:09:13,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:09:18,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:09:18,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:18,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:24,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:24,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1132773.3333333333, ans=0.0 2023-10-03 04:09:25,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 04:09:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:09:25,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:09:27,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:28,320 INFO [train.py:1046] (1/4) Epoch 32, batch 5250, loss[loss=0.1497, simple_loss=0.2387, pruned_loss=0.0304, over 24567.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2414, pruned_loss=0.04188, over 4725937.35 frames. 
], batch size: 71, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:09:28,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:09:28,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:09:28,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1132840.0, ans=0.1 2023-10-03 04:09:30,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1132840.0, ans=0.125 2023-10-03 04:09:31,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:09:34,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:34,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:09:35,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:09:38,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.839e+02 2.059e+02 2.239e+02 2.945e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-03 04:09:39,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:41,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 04:09:44,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:09:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:09:48,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 04:09:49,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:49,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:01,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.95 vs. limit=22.5 2023-10-03 04:10:05,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1132973.3333333333, ans=0.125 2023-10-03 04:10:10,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.44 vs. 
limit=15.0 2023-10-03 04:10:18,933 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:10:22,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1133106.6666666667, ans=0.0 2023-10-03 04:10:27,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1133106.6666666667, ans=0.1 2023-10-03 04:10:36,644 INFO [train.py:1046] (1/4) Epoch 32, batch 5300, loss[loss=0.1483, simple_loss=0.2271, pruned_loss=0.0347, over 21201.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2399, pruned_loss=0.04189, over 4702909.32 frames. ], batch size: 46, lr: 3.16e-03, grad_scale: 16.0 2023-10-03 04:10:42,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.37 vs. limit=15.0 2023-10-03 04:10:51,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:10:51,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 04:10:51,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 04:10:51,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:51,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:51,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 04:10:51,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:51,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:51,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:10:51,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:51,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:10:51,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:10:51,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 04:10:52,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 04:10:52,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 04:10:52,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:10:52,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 04:10:52,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. 
Duration: 0.51 2023-10-03 04:10:52,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:53,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:53,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:10:53,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:10:53,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:10:53,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:53,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:53,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:53,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:10:53,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:10:53,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:10:54,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 04:10:54,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:10:54,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:54,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 04:10:54,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 04:10:54,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:10:54,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:10:54,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 04:10:55,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 04:10:55,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:10:55,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 04:10:56,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:10:56,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 04:10:56,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 04:10:56,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:10:56,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:56,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 04:10:56,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 04:10:56,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 04:10:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:11:03,399 INFO [train.py:1046] (1/4) Epoch 33, batch 0, loss[loss=0.1306, simple_loss=0.2105, pruned_loss=0.02534, over 21967.00 frames. ], tot_loss[loss=0.1306, simple_loss=0.2105, pruned_loss=0.02534, over 21967.00 frames. ], batch size: 48, lr: 3.12e-03, grad_scale: 32.0 2023-10-03 04:11:03,399 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 04:11:15,271 INFO [train.py:1078] (1/4) Epoch 33, validation: loss=0.326, simple_loss=0.2728, pruned_loss=0.1896, over 1125622.00 frames. 
2023-10-03 04:11:15,272 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 04:11:16,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 04:11:16,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:11:18,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:11:19,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.74 vs. limit=15.0 2023-10-03 04:11:20,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1133253.3333333333, ans=0.125 2023-10-03 04:11:22,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:22,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:11:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:24,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 04:11:25,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 04:11:27,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:11:27,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:31,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:31,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:32,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:11:32,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:11:34,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 04:11:35,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:11:42,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:11:42,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:42,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1133320.0, ans=0.125 2023-10-03 04:11:45,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 04:11:47,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.98 vs. 
limit=10.0 2023-10-03 04:11:48,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:11:48,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:11:51,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:11:55,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:11:59,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:05,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 04:12:06,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133453.3333333333, ans=0.1 2023-10-03 04:12:08,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 04:12:09,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:12:09,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:11,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:12:11,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:12:13,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 04:12:14,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:16,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:21,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:12:22,672 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 04:12:23,924 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.812e+02 1.985e+02 2.280e+02 3.382e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 04:12:24,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:12:26,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:12:26,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1133520.0, ans=0.07 2023-10-03 04:12:27,463 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-10-03 04:12:28,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1133586.6666666667, ans=0.2 2023-10-03 04:12:29,398 INFO [train.py:1046] (1/4) Epoch 33, batch 50, loss[loss=0.2177, simple_loss=0.2845, pruned_loss=0.07551, over 19635.00 frames. 
], tot_loss[loss=0.1656, simple_loss=0.2446, pruned_loss=0.04336, over 1053973.84 frames. ], batch size: 389, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:12:29,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:12:29,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 04:12:29,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:12:29,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1133586.6666666667, ans=0.04949747468305833 2023-10-03 04:12:30,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:12:32,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:12:32,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:12:34,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:12:37,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 04:12:37,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:12:39,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1133586.6666666667, ans=0.0 2023-10-03 04:12:44,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:12:47,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 04:12:48,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 04:12:50,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:12:52,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:12:52,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:53,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:12:54,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:12:54,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:12:54,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:13:02,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:13:03,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:03,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:13:04,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 04:13:05,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1133720.0, ans=0.125 2023-10-03 04:13:06,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:13:07,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:13:07,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 04:13:09,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:13:10,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 04:13:18,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:13:20,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:13:20,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:13:21,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:13:21,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:13:22,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=12.0 2023-10-03 04:13:25,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 04:13:25,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 04:13:26,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:27,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:13:29,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:13:29,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:13:30,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 04:13:30,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 04:13:31,365 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.91 vs. 
limit=12.0 2023-10-03 04:13:32,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 04:13:33,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:13:34,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:13:34,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 04:13:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 04:13:37,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:13:37,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:40,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:13:40,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:13:41,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.66 vs. limit=22.5 2023-10-03 04:13:41,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:13:43,218 INFO [train.py:1046] (1/4) Epoch 33, batch 100, loss[loss=0.1621, simple_loss=0.2444, pruned_loss=0.0399, over 24627.00 frames. 
], tot_loss[loss=0.1632, simple_loss=0.2427, pruned_loss=0.04192, over 1877760.34 frames. ], batch size: 68, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:13:44,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:13:46,878 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:13:48,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:13:50,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 04:13:50,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:52,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1133920.0, ans=0.1 2023-10-03 04:13:55,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:13:55,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:13:56,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:56,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:13:56,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 04:13:57,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 04:13:59,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1133986.6666666667, ans=0.125 2023-10-03 04:14:00,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:14:00,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:00,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:00,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:14:01,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1133986.6666666667, ans=0.125 2023-10-03 04:14:03,737 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.09 vs. limit=15.0 2023-10-03 04:14:04,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 04:14:04,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:05,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:14:07,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:14:08,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:14:12,734 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 04:14:12,762 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 04:14:14,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:14,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:14:17,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:14:19,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:20,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:22,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1134053.3333333333, ans=0.0 2023-10-03 04:14:26,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:28,022 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-10-03 04:14:29,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 04:14:33,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:14:36,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:14:37,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:40,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:41,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:14:43,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:14:44,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:46,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:48,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:48,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:14:48,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:14:49,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 04:14:49,454 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 04:14:50,693 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.866e+02 1.991e+02 2.230e+02 3.082e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-03 04:14:51,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:52,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:14:53,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:14:53,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:53,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 04:14:53,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:14:53,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:14:53,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:14:55,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:14:57,221 INFO [train.py:1046] (1/4) Epoch 33, batch 150, loss[loss=0.166, simple_loss=0.2546, pruned_loss=0.03865, over 24520.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2417, pruned_loss=0.04168, over 2509815.91 frames. ], batch size: 71, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:14:57,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:14:57,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:14:58,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:15:03,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:03,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:06,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:15:07,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:09,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 04:15:10,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:12,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1134320.0, ans=0.0 2023-10-03 04:15:13,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 04:15:13,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 04:15:13,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 04:15:15,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1134320.0, ans=0.125 2023-10-03 04:15:17,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:15:17,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:15:17,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:15:18,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:15:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:15:20,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 04:15:21,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:24,569 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 04:15:24,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:15:30,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:35,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:15:35,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1134386.6666666667, ans=0.125 2023-10-03 04:15:36,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 04:15:40,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:15:40,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:40,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:15:42,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:15:43,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:15:45,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:15:46,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:46,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 04:15:48,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1134453.3333333333, ans=0.125 2023-10-03 04:15:49,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1134453.3333333333, ans=0.125 2023-10-03 04:15:52,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:53,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:15:54,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:15:54,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:15:56,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-10-03 04:15:57,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:15:58,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 04:16:00,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:16:01,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:16:02,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:02,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:16:04,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 04:16:04,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:16:04,876 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 04:16:06,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1134520.0, ans=0.1 2023-10-03 04:16:07,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:16:10,291 INFO [train.py:1046] (1/4) Epoch 33, batch 200, loss[loss=0.239, simple_loss=0.2999, pruned_loss=0.08903, over 19401.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2426, pruned_loss=0.04239, over 2996222.69 frames. 
], batch size: 389, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:16:10,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:16:10,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:16:13,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 04:16:14,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:15,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:18,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1134586.6666666667, ans=0.125 2023-10-03 04:16:19,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 04:16:21,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:16:23,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:25,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:16:28,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 04:16:28,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:16:28,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:35,642 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:16:46,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:16:47,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:16:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:16:49,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:16:50,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:16:50,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:16:53,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:16:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 04:16:55,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1134786.6666666667, ans=0.0 2023-10-03 04:16:56,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:16:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 04:16:58,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:16:58,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:59,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1134786.6666666667, ans=0.125 2023-10-03 04:17:01,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1134786.6666666667, ans=0.125 2023-10-03 04:17:02,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:17:06,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 04:17:11,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1134853.3333333333, ans=0.125 2023-10-03 04:17:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:14,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:17:17,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1134853.3333333333, ans=0.2 2023-10-03 04:17:18,228 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.915e+02 2.133e+02 2.434e+02 3.393e+02, threshold=4.265e+02, percent-clipped=0.0 2023-10-03 04:17:19,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:23,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 04:17:23,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:17:23,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:17:23,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:17:24,368 INFO [train.py:1046] (1/4) Epoch 33, batch 250, loss[loss=0.1587, simple_loss=0.2332, pruned_loss=0.04216, over 23848.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2425, pruned_loss=0.04249, over 3365814.71 frames. 
], batch size: 164, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:17:26,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:17:26,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 04:17:27,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:17:27,612 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 04:17:29,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:29,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:17:29,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:30,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:17:32,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:17:33,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:34,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:17:37,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 04:17:44,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1134986.6666666667, ans=0.1 2023-10-03 04:17:51,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:17:52,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.89 vs. limit=12.0 2023-10-03 04:17:53,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:17:54,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:17:55,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1135053.3333333333, ans=15.0 2023-10-03 04:17:56,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1135053.3333333333, ans=10.0 2023-10-03 04:17:59,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:18:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:18:01,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:18:02,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:18:03,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:18:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:18:04,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:18:04,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:18:06,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.30 vs. limit=22.5 2023-10-03 04:18:07,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 04:18:07,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:18:09,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:18:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:18:10,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:18:12,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 04:18:13,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:18:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:18:15,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:15,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:18:17,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:21,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:18:23,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:26,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:18:32,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:32,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:18:36,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 04:18:37,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:18:37,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:18:38,776 INFO [train.py:1046] (1/4) Epoch 33, batch 300, loss[loss=0.1418, simple_loss=0.222, pruned_loss=0.03081, over 24301.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2405, pruned_loss=0.04147, over 3665323.80 frames. ], batch size: 56, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:18:38,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 04:18:38,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:18:42,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:18:42,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 04:18:46,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:48,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:18:48,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1135253.3333333333, ans=0.125 2023-10-03 04:18:49,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1135253.3333333333, ans=0.125 2023-10-03 04:18:51,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 04:18:52,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 04:18:52,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:52,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1135320.0, ans=0.0 2023-10-03 04:18:55,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:18:55,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 04:18:55,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:00,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:19:00,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1135320.0, ans=0.07 2023-10-03 04:19:04,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:19:04,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. 
Duration: 0.88 2023-10-03 04:19:06,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1135320.0, ans=0.1 2023-10-03 04:19:07,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1135386.6666666667, ans=0.2 2023-10-03 04:19:07,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1135386.6666666667, ans=0.125 2023-10-03 04:19:08,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 04:19:08,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:10,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:12,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:12,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 04:19:13,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:19:14,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:19:14,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:19:16,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:19:19,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:19:19,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 04:19:20,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:19:22,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:25,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 04:19:26,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:19:31,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:19:33,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:19:33,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 04:19:37,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:37,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 04:19:41,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:42,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:19:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 04:19:44,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:19:44,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:19:45,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 04:19:46,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.972e+02 2.281e+02 2.681e+02 3.966e+02, threshold=4.562e+02, percent-clipped=0.0 2023-10-03 04:19:47,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:47,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:19:49,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:49,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:19:51,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:19:52,680 INFO [train.py:1046] (1/4) Epoch 33, batch 350, loss[loss=0.1521, simple_loss=0.2276, pruned_loss=0.03832, over 23741.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2395, pruned_loss=0.04117, over 3888424.47 frames. ], batch size: 232, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:19:55,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:19:55,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 04:19:58,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:03,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:20:04,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:04,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:06,299 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:20:06,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1135653.3333333333, ans=0.125 2023-10-03 04:20:07,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 04:20:07,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.11 vs. 
limit=15.0 2023-10-03 04:20:08,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:20:08,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 04:20:12,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:12,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 04:20:13,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:20:17,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 04:20:18,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:20:22,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:20:22,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:20:22,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:23,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:23,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 04:20:25,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:25,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:20:26,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:20:26,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:28,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1135720.0, ans=0.125 2023-10-03 04:20:34,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:20:34,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:20:35,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:20:35,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:40,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 04:20:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:44,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 04:20:44,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:20:44,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:20:47,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 04:20:50,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:20:50,274 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 04:20:51,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 04:20:51,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:51,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1135853.3333333333, ans=0.0 2023-10-03 04:20:53,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:20:53,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 04:20:56,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1135853.3333333333, ans=0.125 2023-10-03 04:20:58,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:20:59,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:21:00,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:02,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:02,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:21:04,792 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. limit=15.0 2023-10-03 04:21:05,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:21:06,773 INFO [train.py:1046] (1/4) Epoch 33, batch 400, loss[loss=0.1561, simple_loss=0.2303, pruned_loss=0.04089, over 22769.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.239, pruned_loss=0.04119, over 4071020.87 frames. ], batch size: 50, lr: 3.11e-03, grad_scale: 32.0 2023-10-03 04:21:08,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:21:11,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:21:11,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 04:21:11,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:21:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:11,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1135920.0, ans=0.125 2023-10-03 04:21:13,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:21:13,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:15,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:17,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:17,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 04:21:18,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 04:21:18,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:20,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 04:21:20,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:24,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 04:21:24,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:24,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 04:21:26,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:21:26,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:26,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:26,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1135986.6666666667, ans=0.2 2023-10-03 04:21:27,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:29,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 04:21:30,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 04:21:33,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1135986.6666666667, ans=0.1 2023-10-03 04:21:34,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:35,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:21:35,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1136053.3333333333, ans=0.125 2023-10-03 04:21:36,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 04:21:37,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1136053.3333333333, ans=0.125 2023-10-03 04:21:38,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 04:21:40,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:21:42,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1136053.3333333333, ans=15.0 2023-10-03 04:21:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:21:51,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 04:21:54,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:21:55,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 04:21:58,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:22:00,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 04:22:00,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 04:22:01,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1136120.0, ans=0.2 2023-10-03 04:22:04,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:22:07,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:22:09,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:22:11,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:12,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 04:22:14,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:22:15,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 04:22:16,716 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.810e+02 1.929e+02 2.053e+02 2.839e+02, threshold=3.858e+02, percent-clipped=0.0 2023-10-03 04:22:16,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:22:16,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 04:22:17,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1136186.6666666667, ans=0.125 2023-10-03 04:22:19,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 04:22:20,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-10-03 04:22:20,941 INFO [train.py:1046] (1/4) Epoch 33, batch 450, loss[loss=0.156, simple_loss=0.2433, pruned_loss=0.03435, over 24493.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2395, pruned_loss=0.04121, over 4211092.24 frames. ], batch size: 66, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:22:21,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:22:21,762 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.64 vs. limit=22.5 2023-10-03 04:22:22,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:22:22,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:22:23,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 04:22:23,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:22:25,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:22:25,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:22:25,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 04:22:25,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:22:25,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1136253.3333333333, ans=0.125 2023-10-03 04:22:27,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:22:29,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:22:29,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1136253.3333333333, ans=15.0 2023-10-03 04:22:39,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:39,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:22:42,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 04:22:43,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. 
Duration: 0.99 2023-10-03 04:22:46,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:22:49,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:50,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:22:53,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:22:53,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:22:56,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 04:22:57,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 04:22:58,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 04:22:59,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:22:59,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:01,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:23:02,813 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. 
Duration: 0.896 2023-10-03 04:23:02,822 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 04:23:04,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:23:05,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:23:07,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 04:23:10,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:23:10,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:23:10,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:23:12,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 04:23:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:23:17,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:23:17,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:23:19,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. 
Duration: 0.83 2023-10-03 04:23:19,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1136520.0, ans=0.0 2023-10-03 04:23:21,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:23:23,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 04:23:24,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 04:23:26,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:23:26,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1136520.0, ans=0.125 2023-10-03 04:23:30,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:23:31,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:23:35,415 INFO [train.py:1046] (1/4) Epoch 33, batch 500, loss[loss=0.1448, simple_loss=0.2306, pruned_loss=0.02945, over 24643.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.24, pruned_loss=0.04125, over 4324072.09 frames. ], batch size: 65, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:23:35,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:23:35,521 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-10-03 04:23:37,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1136586.6666666667, ans=0.125 2023-10-03 04:23:38,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:39,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:23:41,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 04:23:43,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 04:23:43,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:46,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:23:47,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1136586.6666666667, ans=0.125 2023-10-03 04:23:50,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:23:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:23:53,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:23:53,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:53,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:03,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:03,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:24:04,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:24:04,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 04:24:04,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:24:09,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:24:09,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:24:11,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:24:11,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:24:11,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 04:24:15,619 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 04:24:17,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:19,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:20,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:20,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:21,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:24:23,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 04:24:26,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:24:27,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:31,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:34,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 04:24:39,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:42,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 04:24:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:42,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:45,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 04:24:45,875 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.78 vs. limit=15.0 2023-10-03 04:24:46,563 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.881e+02 2.075e+02 2.361e+02 3.441e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-03 04:24:46,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:24:48,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:48,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1136920.0, ans=0.2 2023-10-03 04:24:50,007 INFO [train.py:1046] (1/4) Epoch 33, batch 550, loss[loss=0.1962, simple_loss=0.2659, pruned_loss=0.06324, over 23466.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2417, pruned_loss=0.04225, over 4406475.12 frames. 
], batch size: 285, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:24:52,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 04:24:53,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1136920.0, ans=0.125 2023-10-03 04:24:55,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 04:24:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:55,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 04:24:57,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:24:57,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:58,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:58,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:58,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:24:59,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 04:25:01,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:25:02,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 04:25:04,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:25:06,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:06,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:10,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:25:10,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:14,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 04:25:14,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 04:25:15,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1136986.6666666667, ans=0.125 2023-10-03 04:25:18,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:25:22,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 04:25:22,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:25:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:25:26,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:26,514 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 04:25:26,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:27,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:25:31,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:25:31,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:25:31,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:25:33,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:34,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 04:25:36,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-10-03 04:25:37,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:25:37,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:25:37,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1137120.0, ans=0.1 2023-10-03 04:25:39,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:25:39,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:25:43,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:25:45,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:25:46,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:25:46,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:48,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 04:25:49,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:25:51,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 04:25:52,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:25:53,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:54,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:25:55,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 04:25:59,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 04:26:02,186 INFO [train.py:1046] (1/4) Epoch 33, batch 600, loss[loss=0.1425, simple_loss=0.2205, pruned_loss=0.03225, over 23142.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2422, pruned_loss=0.04214, over 4478322.59 frames. ], batch size: 105, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:26:03,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 04:26:06,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:26:06,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:26:06,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:13,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:26:14,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:26:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 04:26:19,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:26:20,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:26:22,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:23,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 04:26:23,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:26:29,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 04:26:32,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:26:32,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:33,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 04:26:33,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1137386.6666666667, ans=0.125 2023-10-03 04:26:37,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:26:37,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:26:39,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:47,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:26:49,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1137453.3333333333, ans=0.125 2023-10-03 04:26:52,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:52,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:26:52,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:57,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 04:27:02,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:27:03,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:27:06,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 04:27:06,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:27:08,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 04:27:10,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:27:10,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:27:14,271 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.854e+02 2.113e+02 2.384e+02 3.554e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 04:27:17,653 INFO [train.py:1046] (1/4) Epoch 33, batch 650, loss[loss=0.1519, simple_loss=0.2355, pruned_loss=0.03417, over 24666.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2418, pruned_loss=0.04203, over 4537120.88 frames. ], batch size: 68, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:27:17,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:27:20,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:27:20,732 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:27:21,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 04:27:23,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:27:24,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:27,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 04:27:28,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:27:29,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.44 vs. limit=15.0 2023-10-03 04:27:33,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1137653.3333333333, ans=0.0 2023-10-03 04:27:34,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:27:34,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:27:37,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:40,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 04:27:42,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:27:42,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 04:27:42,922 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:27:45,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1137720.0, ans=0.2 2023-10-03 04:27:46,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:27:46,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 04:27:48,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:48,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:48,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:27:49,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:52,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:27:55,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:27:55,261 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 04:27:55,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:27:55,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:27:59,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:59,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:27:59,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:01,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:28:02,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 04:28:02,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:28:03,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:28:03,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:28:03,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:28:06,081 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.35 vs. limit=15.0 2023-10-03 04:28:06,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 04:28:06,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 04:28:08,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 04:28:08,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:09,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:28:09,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:28:09,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:28:11,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:28:17,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:19,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:28:21,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:28:22,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:22,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 04:28:22,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1137853.3333333333, ans=0.125 2023-10-03 04:28:24,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:31,199 INFO [train.py:1046] (1/4) Epoch 33, batch 700, loss[loss=0.1632, simple_loss=0.2267, pruned_loss=0.04986, over 22839.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2402, pruned_loss=0.04181, over 4573249.44 frames. ], batch size: 322, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:28:31,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:28:31,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:28:31,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:28:31,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:28:36,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 04:28:36,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. 
Duration: 0.97 2023-10-03 04:28:38,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:38,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:40,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 04:28:42,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:44,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:28:44,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 04:28:46,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1137986.6666666667, ans=0.125 2023-10-03 04:28:49,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:28:52,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:28:54,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:54,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1137986.6666666667, ans=0.125 2023-10-03 04:28:55,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 04:28:55,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:28:58,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:59,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 04:28:59,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:29:02,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 04:29:04,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1138053.3333333333, ans=0.0 2023-10-03 04:29:05,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 04:29:08,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:29:08,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:29:08,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1138053.3333333333, ans=0.125 2023-10-03 04:29:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:29:14,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 04:29:16,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 04:29:19,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:19,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:29:19,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 04:29:23,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:29:25,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:27,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1138120.0, ans=0.125 2023-10-03 04:29:28,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:29:33,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:29:33,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 04:29:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 04:29:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. 
Duration: 0.9 2023-10-03 04:29:39,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1138186.6666666667, ans=0.125 2023-10-03 04:29:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:41,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:29:42,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.96 vs. limit=15.0 2023-10-03 04:29:43,030 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.966e+02 2.295e+02 2.647e+02 3.706e+02, threshold=4.591e+02, percent-clipped=0.0 2023-10-03 04:29:43,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:29:44,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:45,850 INFO [train.py:1046] (1/4) Epoch 33, batch 750, loss[loss=0.1662, simple_loss=0.24, pruned_loss=0.04617, over 23731.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2397, pruned_loss=0.04159, over 4614475.14 frames. ], batch size: 212, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:29:45,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 04:29:48,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 04:29:48,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. 
Duration: 0.74 2023-10-03 04:29:48,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 04:29:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 04:29:50,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 04:29:52,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:29:53,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 04:29:53,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:55,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:29:55,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1138253.3333333333, ans=0.125 2023-10-03 04:29:55,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1138253.3333333333, ans=0.2 2023-10-03 04:29:56,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:29:58,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:58,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 04:29:59,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:30:02,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:30:02,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:30:05,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:30:06,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:30:06,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1138320.0, ans=0.2 2023-10-03 04:30:07,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.95 vs. limit=15.0 2023-10-03 04:30:07,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:30:07,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 04:30:08,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:30:11,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:30:12,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:30:14,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:30:15,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 04:30:15,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:30:15,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1138386.6666666667, ans=0.125 2023-10-03 04:30:18,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 04:30:18,444 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 04:30:19,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 04:30:19,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:30:19,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:30:21,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 04:30:24,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1138386.6666666667, ans=0.0 2023-10-03 04:30:29,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:30:29,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:30:29,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:30:31,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:30:33,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:30:33,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 04:30:34,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:30:34,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 04:30:34,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:30:38,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:30:39,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. 
Duration: 0.92 2023-10-03 04:30:40,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:30:43,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:30:46,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:30:46,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:30:49,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:30:50,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.96 vs. limit=22.5 2023-10-03 04:30:51,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1138520.0, ans=0.2 2023-10-03 04:30:52,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 04:30:52,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:30:54,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:30:56,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 04:30:56,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:00,096 INFO [train.py:1046] (1/4) Epoch 33, batch 800, loss[loss=0.1495, simple_loss=0.2277, pruned_loss=0.03561, over 24475.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2406, pruned_loss=0.0417, over 4641296.00 frames. ], batch size: 63, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:31:00,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:00,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:31:05,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:05,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:06,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.47 vs. limit=15.0 2023-10-03 04:31:07,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:31:07,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:10,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:10,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:31:11,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:15,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:16,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:31:19,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 04:31:19,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:20,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:20,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:31:20,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:31:22,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 04:31:22,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:22,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 04:31:25,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 04:31:27,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:29,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:31:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:31:31,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:32,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:36,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:31:37,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:31:37,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 04:31:39,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 04:31:39,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 04:31:39,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:31:40,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:31:41,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:41,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:31:46,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 04:31:47,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 04:31:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:31:50,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:31:50,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1138786.6666666667, ans=0.0 2023-10-03 04:31:54,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:31:58,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:59,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 04:32:00,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:32:04,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-10-03 04:32:10,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:32:11,442 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.861e+02 2.064e+02 2.320e+02 3.416e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-03 04:32:11,889 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:32:12,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:32:12,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 04:32:14,286 INFO [train.py:1046] (1/4) Epoch 33, batch 850, loss[loss=0.1793, simple_loss=0.2635, pruned_loss=0.0475, over 23928.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.241, pruned_loss=0.04189, over 4658316.95 frames. ], batch size: 86, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:32:14,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:32:14,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:32:15,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 04:32:15,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:17,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:32:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:17,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1138920.0, ans=0.125 2023-10-03 04:32:19,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:32:20,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:32:21,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 04:32:21,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 04:32:21,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 04:32:23,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:32:23,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:32:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:26,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:32:26,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 04:32:30,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:30,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:32:30,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 04:32:34,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 04:32:37,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:38,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 04:32:41,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 04:32:41,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 04:32:44,353 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 04:32:44,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:32:46,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:32:46,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-03 04:32:49,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:49,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.94 vs. limit=15.0 2023-10-03 04:32:50,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:50,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 04:32:53,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:32:55,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:32:55,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:32:55,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:32:55,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1139053.3333333333, ans=0.1 2023-10-03 04:32:56,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:32:57,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 04:32:58,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 04:33:01,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1139120.0, ans=0.125 2023-10-03 04:33:02,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:33:02,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:33:02,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:33:04,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:33:04,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:33:06,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:33:06,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:33:07,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1139120.0, ans=0.0 2023-10-03 04:33:08,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:33:08,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 04:33:10,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:33:13,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1139186.6666666667, ans=0.0 2023-10-03 04:33:16,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1139186.6666666667, ans=0.0 2023-10-03 04:33:18,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:33:19,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1139186.6666666667, ans=0.0 2023-10-03 04:33:21,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:33:21,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 04:33:21,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:33:21,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:33:24,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 04:33:28,991 INFO [train.py:1046] (1/4) Epoch 33, batch 900, loss[loss=0.1477, simple_loss=0.2328, pruned_loss=0.03128, over 24306.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2423, pruned_loss=0.04241, over 4673254.84 frames. 
], batch size: 61, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:33:29,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:33:33,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:33,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 04:33:36,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:33:36,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 04:33:37,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 04:33:37,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:33:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:33:39,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:33:39,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:33:47,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:33:47,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:48,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:33:52,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:33:56,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 04:33:58,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:34:00,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1139386.6666666667, ans=0.2 2023-10-03 04:34:06,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:34:06,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:34:07,755 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 04:34:07,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 04:34:09,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1139386.6666666667, ans=0.1 2023-10-03 04:34:12,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 04:34:12,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:34:14,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:34:20,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:20,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:34:21,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 04:34:21,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:34:25,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 04:34:27,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:34:27,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:30,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:34:30,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:34:34,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-10-03 04:34:34,690 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 04:34:37,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:34:37,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 04:34:40,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:43,679 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.900e+02 2.085e+02 2.299e+02 3.582e+02, threshold=4.170e+02, percent-clipped=0.0 2023-10-03 04:34:43,710 INFO [train.py:1046] (1/4) Epoch 33, batch 950, loss[loss=0.1687, simple_loss=0.2496, pruned_loss=0.04392, over 24511.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2426, pruned_loss=0.04247, over 4692505.30 frames. ], batch size: 63, lr: 3.11e-03, grad_scale: 4.0 2023-10-03 04:34:45,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 04:34:50,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:34:52,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:34:54,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:34:54,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 04:34:55,822 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 04:35:00,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:01,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:35:01,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:35:03,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:35:03,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 04:35:03,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:35:04,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:06,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 04:35:07,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:35:12,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:12,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 04:35:12,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:35:12,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.02 vs. limit=15.0 2023-10-03 04:35:13,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 04:35:16,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:35:17,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:35:19,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:35:24,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:35:24,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:35:26,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 04:35:28,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 04:35:28,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:35:29,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:35:29,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:29,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:35:33,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1139786.6666666667, ans=0.2 2023-10-03 04:35:34,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 04:35:34,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:35:38,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:35:38,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:38,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 04:35:38,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:35:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 04:35:43,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 04:35:44,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:48,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1139853.3333333333, ans=0.125 2023-10-03 04:35:51,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:35:51,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 04:35:52,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 04:35:55,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:58,279 INFO [train.py:1046] (1/4) Epoch 33, batch 1000, loss[loss=0.1472, simple_loss=0.2077, pruned_loss=0.04338, over 22721.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2408, pruned_loss=0.04195, over 4695911.79 frames. ], batch size: 322, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:35:59,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 04:36:01,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:05,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:36:06,513 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.67 vs. 
limit=15.0 2023-10-03 04:36:07,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 04:36:07,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 04:36:11,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:11,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:36:14,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:17,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 04:36:19,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 04:36:22,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 04:36:22,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:36:24,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 04:36:25,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 04:36:25,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. 
Duration: 0.64 2023-10-03 04:36:28,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:28,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:33,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1140053.3333333333, ans=0.0 2023-10-03 04:36:37,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:37,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:36:38,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:38,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:40,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 04:36:40,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:36:41,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:36:41,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:43,010 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. 
Duration: 0.982 2023-10-03 04:36:44,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1140120.0, ans=0.05 2023-10-03 04:36:45,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 04:36:47,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 04:36:49,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 04:36:52,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:36:58,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:58,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:36:59,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:00,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:37:01,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 04:37:02,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:37:02,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. 
Duration: 0.64 2023-10-03 04:37:04,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 04:37:05,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:37:05,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:37:08,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:37:10,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:37:12,726 INFO [train.py:1046] (1/4) Epoch 33, batch 1050, loss[loss=0.1694, simple_loss=0.2543, pruned_loss=0.04225, over 24660.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2399, pruned_loss=0.04153, over 4704399.02 frames. ], batch size: 68, lr: 3.11e-03, grad_scale: 4.0 2023-10-03 04:37:12,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:37:14,199 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.886e+02 2.130e+02 2.502e+02 4.211e+02, threshold=4.261e+02, percent-clipped=1.0 2023-10-03 04:37:14,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:37:15,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:37:19,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-03 04:37:19,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:20,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:37:24,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:37:25,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:37:28,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:37:28,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:37:28,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:37:29,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:37:29,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 04:37:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:37:31,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 04:37:32,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:37:32,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 04:37:32,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:37:38,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:39,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:37:39,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:37:42,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 04:37:42,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 04:37:42,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:37:46,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 04:37:50,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 04:37:52,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:37:55,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-03 04:37:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 04:37:58,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:37:58,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:38:02,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:38:05,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 04:38:08,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 04:38:08,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 04:38:08,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:38:08,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:38:11,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 04:38:15,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:38:18,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 04:38:18,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:38:19,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:38:19,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:38:21,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1140520.0, ans=0.125 2023-10-03 04:38:22,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:38:22,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 04:38:24,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:38:24,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 04:38:25,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 04:38:26,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:38:27,223 INFO [train.py:1046] (1/4) Epoch 33, batch 1100, loss[loss=0.1525, simple_loss=0.2092, pruned_loss=0.04794, over 19336.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2396, pruned_loss=0.04146, over 4705224.23 frames. 
], batch size: 389, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:38:30,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:38:34,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:38:39,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:38:40,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:38:40,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:38:41,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 04:38:43,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:38:44,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:38:47,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:38:50,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:38:50,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. 
Duration: 0.75 2023-10-03 04:38:51,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:38:53,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:38:53,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:38:55,172 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.30 vs. limit=15.0 2023-10-03 04:38:56,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:38:57,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:39:01,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1140720.0, ans=0.0 2023-10-03 04:39:04,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:39:07,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 04:39:07,679 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 04:39:07,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:09,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:39:10,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:39:10,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:39:11,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 04:39:13,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:39:13,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:39:13,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:39:14,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:14,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 04:39:19,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1140786.6666666667, ans=0.0 2023-10-03 04:39:20,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:39:20,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 04:39:23,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 04:39:23,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1140786.6666666667, ans=0.0 2023-10-03 04:39:27,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:39:28,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 04:39:28,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:39:30,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:32,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1140853.3333333333, ans=0.0 2023-10-03 04:39:33,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:39:33,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:39:37,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 04:39:38,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:39:38,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:39:39,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-10-03 04:39:39,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:39:39,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 04:39:41,277 INFO [train.py:1046] (1/4) Epoch 33, batch 1150, loss[loss=0.1587, simple_loss=0.2393, pruned_loss=0.03905, over 23344.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2399, pruned_loss=0.04131, over 4710919.86 frames. ], batch size: 119, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:39:41,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:39:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:39:41,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:39:42,685 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.826e+02 2.024e+02 2.251e+02 4.261e+02, threshold=4.048e+02, percent-clipped=0.0 2023-10-03 04:39:45,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:39:46,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:39:48,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:39:48,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 04:39:49,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 04:39:49,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:39:50,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.80 vs. limit=15.0 2023-10-03 04:39:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 04:39:54,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:39:54,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:40:00,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 04:40:00,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1140986.6666666667, ans=0.125 2023-10-03 04:40:03,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:07,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:40:07,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 04:40:07,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 04:40:07,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:40:09,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:40:09,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1141053.3333333333, ans=0.125 2023-10-03 04:40:10,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 04:40:11,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:13,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:40:13,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1141053.3333333333, ans=0.125 2023-10-03 04:40:20,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:27,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:27,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 04:40:29,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:40:31,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:32,732 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.51 vs. limit=6.0 2023-10-03 04:40:34,889 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 04:40:36,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:41,056 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.26 vs. limit=6.0 2023-10-03 04:40:43,112 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 04:40:47,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:40:48,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:40:48,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:40:48,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:40:50,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1141186.6666666667, ans=0.0 2023-10-03 04:40:51,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:40:54,716 INFO [train.py:1046] (1/4) Epoch 33, batch 1200, loss[loss=0.1845, simple_loss=0.2681, pruned_loss=0.05044, over 23708.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2414, pruned_loss=0.04156, over 4722435.70 frames. ], batch size: 85, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:40:56,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:40:56,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:40:56,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1141253.3333333333, ans=0.0 2023-10-03 04:40:57,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:57,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:40:57,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:40:59,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:41:01,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:41:04,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:41:04,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:41:06,662 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 04:41:08,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1141320.0, ans=0.0 2023-10-03 04:41:09,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 04:41:13,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:41:15,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:41:17,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:41:21,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:41:21,754 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 04:41:21,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:41:22,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1141386.6666666667, ans=0.0 2023-10-03 04:41:29,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:41:29,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:41:30,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 04:41:30,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:41:34,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 04:41:34,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1141386.6666666667, ans=0.125 2023-10-03 04:41:39,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 04:41:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:41:41,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:41:42,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:41:42,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:41:44,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:41:44,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:41:45,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 04:41:46,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 04:41:46,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:41:46,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:41:48,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:41:50,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:41:50,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:41:54,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:41:58,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:41:59,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1141520.0, ans=0.0 2023-10-03 04:42:00,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. 
Duration: 0.97 2023-10-03 04:42:01,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1141520.0, ans=0.0 2023-10-03 04:42:02,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1141520.0, ans=0.2 2023-10-03 04:42:03,674 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 04:42:05,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.48 vs. limit=15.0 2023-10-03 04:42:06,912 INFO [train.py:1046] (1/4) Epoch 33, batch 1250, loss[loss=0.1617, simple_loss=0.2362, pruned_loss=0.04362, over 23839.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2419, pruned_loss=0.04156, over 4732614.63 frames. ], batch size: 212, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:42:06,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:42:08,440 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.69 vs. limit=22.5 2023-10-03 04:42:08,773 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.967e+02 2.213e+02 2.630e+02 3.265e+02, threshold=4.425e+02, percent-clipped=0.0 2023-10-03 04:42:08,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:42:11,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:42:12,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:42:12,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1141586.6666666667, ans=0.125 2023-10-03 04:42:14,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.97 vs. limit=15.0 2023-10-03 04:42:14,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 04:42:17,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:42:19,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:19,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 04:42:20,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:42:21,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:42:26,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:42:27,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:27,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:42:27,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:42:30,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:42:33,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:42:33,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:42:33,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:42:36,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:42:36,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:38,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:40,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:42:43,298 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:42:45,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 04:42:46,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:42:48,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:42:49,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 04:42:49,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:51,173 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 04:42:51,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:51,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:53,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:57,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:58,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:42:59,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 04:42:59,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 04:42:59,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 04:43:05,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:43:05,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 04:43:05,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:43:08,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 04:43:08,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:43:10,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 04:43:10,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:43:10,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:43:12,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 04:43:13,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:43:15,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 04:43:16,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:43:18,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 04:43:18,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:43:19,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:43:21,095 INFO [train.py:1046] (1/4) Epoch 33, batch 1300, loss[loss=0.2212, simple_loss=0.2865, pruned_loss=0.07793, over 19747.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2422, pruned_loss=0.04168, over 4737833.48 frames. ], batch size: 388, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:43:22,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:43:22,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 04:43:25,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1141920.0, ans=0.0 2023-10-03 04:43:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:29,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:43:31,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:43:32,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:43:34,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 04:43:34,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 04:43:37,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1141986.6666666667, ans=0.1 2023-10-03 04:43:40,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:43:42,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:43:44,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 04:43:44,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.90 vs. limit=6.0 2023-10-03 04:43:46,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:43:50,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:43:50,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1142053.3333333333, ans=0.125 2023-10-03 04:43:52,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:43:53,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:43:54,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:43:56,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:43:56,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:43:56,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 04:44:01,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:44:01,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:44:02,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 04:44:02,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:44:03,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:44:07,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:44:07,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 04:44:08,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:44:08,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 04:44:11,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:44:15,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:44:15,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:44:19,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 04:44:20,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 04:44:20,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 04:44:25,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:44:27,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 04:44:28,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1142186.6666666667, ans=0.125 2023-10-03 04:44:29,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 04:44:32,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1142186.6666666667, ans=0.0 2023-10-03 04:44:35,327 INFO [train.py:1046] (1/4) Epoch 33, batch 1350, loss[loss=0.1438, simple_loss=0.2261, pruned_loss=0.03077, over 24355.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2409, pruned_loss=0.04164, over 4727193.34 frames. ], batch size: 61, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:44:35,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 04:44:36,746 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.912e+02 2.066e+02 2.352e+02 3.364e+02, threshold=4.132e+02, percent-clipped=0.0 2023-10-03 04:44:38,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:44:43,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:44:46,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:44:46,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:44:48,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:44:48,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:44:52,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 04:44:53,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 04:44:54,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:44:54,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:44:57,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 04:44:59,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:45:00,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:45:00,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 04:45:02,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 04:45:03,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 04:45:04,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:04,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-03 04:45:13,139 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:45:17,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:27,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:27,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:27,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 04:45:31,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:32,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 04:45:32,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:45:33,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:45:35,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:45:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 04:45:38,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 04:45:43,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 04:45:45,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 04:45:48,696 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.75 vs. limit=6.0 2023-10-03 04:45:49,689 INFO [train.py:1046] (1/4) Epoch 33, batch 1400, loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.03885, over 23367.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2396, pruned_loss=0.04148, over 4718491.97 frames. ], batch size: 105, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:45:49,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 04:45:52,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:53,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:45:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:45:57,465 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=12.0 2023-10-03 04:45:59,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-10-03 04:46:00,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1142586.6666666667, ans=0.125 2023-10-03 04:46:01,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 04:46:10,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:46:12,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1142653.3333333333, ans=0.125 2023-10-03 04:46:13,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:46:13,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:46:15,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:46:19,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:46:22,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 04:46:22,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1142720.0, ans=0.2 2023-10-03 04:46:31,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:31,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:46:33,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 04:46:35,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:46:35,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:46:37,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:46:37,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:46:38,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:46:38,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:46:38,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1142786.6666666667, ans=0.2 2023-10-03 04:46:40,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:46:41,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 04:46:43,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:46:47,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:46:50,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:46:57,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 04:46:57,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:46:58,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:47:01,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 04:47:03,141 INFO [train.py:1046] (1/4) Epoch 33, batch 1450, loss[loss=0.1541, simple_loss=0.2347, pruned_loss=0.03679, over 24473.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2386, pruned_loss=0.04141, over 4696945.36 frames. ], batch size: 63, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:47:03,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:04,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:47:05,801 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.841e+02 1.972e+02 2.239e+02 2.970e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-03 04:47:07,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:47:08,758 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.59 vs. 
limit=15.0 2023-10-03 04:47:09,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:47:09,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:09,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 04:47:13,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:13,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:47:14,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=13.11 vs. limit=15.0 2023-10-03 04:47:16,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:47:16,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 04:47:17,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:47:18,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 04:47:19,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:19,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 04:47:19,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 04:47:21,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:47:22,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:47:22,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 04:47:22,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:24,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:47:25,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:28,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:31,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:47:31,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:47:31,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1143053.3333333333, ans=0.125 2023-10-03 04:47:32,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:32,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:34,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:34,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:47:34,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1143053.3333333333, ans=0.0 2023-10-03 04:47:35,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:35,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:47:41,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 04:47:43,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:47:46,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. 
Duration: 0.896 2023-10-03 04:47:46,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1143120.0, ans=0.0 2023-10-03 04:47:47,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:47:48,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:47:50,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:47:52,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 04:47:56,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:47:57,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 04:47:59,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 04:47:59,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:03,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:48:03,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:48:05,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. 
Duration: 0.9 2023-10-03 04:48:07,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 04:48:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 04:48:11,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:12,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:48:17,073 INFO [train.py:1046] (1/4) Epoch 33, batch 1500, loss[loss=0.152, simple_loss=0.2348, pruned_loss=0.03463, over 24631.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2394, pruned_loss=0.04171, over 4694161.19 frames. ], batch size: 65, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:48:22,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 04:48:24,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:48:24,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:48:25,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:25,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:48:27,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 04:48:27,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 04:48:29,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:48:30,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:48:30,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:48:31,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:48:33,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:48:33,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:48:39,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:48:39,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 04:48:39,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:48:39,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:48:40,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 04:48:42,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 04:48:47,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 04:48:48,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:49,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 04:48:51,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:48:52,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:48:52,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:52,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:48:55,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 04:48:55,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:48:57,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:48:57,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. 
Duration: 0.87 2023-10-03 04:48:57,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:49:01,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:49:01,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 04:49:03,716 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-10-03 04:49:06,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:49:06,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:49:11,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 04:49:11,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:11,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 04:49:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:15,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:49:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. 
Duration: 0.896 2023-10-03 04:49:17,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:49:19,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 04:49:21,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:21,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1143520.0, ans=0.09899494936611666 2023-10-03 04:49:24,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:49:24,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:25,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:49:25,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:27,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:49:28,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 04:49:29,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 04:49:30,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 04:49:30,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 04:49:31,406 INFO [train.py:1046] (1/4) Epoch 33, batch 1550, loss[loss=0.147, simple_loss=0.2195, pruned_loss=0.0372, over 24501.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2408, pruned_loss=0.042, over 4701621.88 frames. ], batch size: 58, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:49:31,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 04:49:34,159 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 1.973e+02 2.319e+02 2.706e+02 3.781e+02, threshold=4.639e+02, percent-clipped=0.0 2023-10-03 04:49:34,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:49:35,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:35,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1143586.6666666667, ans=0.125 2023-10-03 04:49:36,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:49:36,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:49:38,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:38,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 04:49:42,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 04:49:42,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:44,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:49:44,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:49:47,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:49:48,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 04:49:49,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:49:49,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 04:49:51,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 04:49:51,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 04:49:51,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:51,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 04:49:54,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:49:55,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.32 vs. limit=15.0 2023-10-03 04:49:55,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1143653.3333333333, ans=0.125 2023-10-03 04:49:57,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 04:49:57,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 04:50:03,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:50:03,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=1143720.0, ans=10.0 2023-10-03 04:50:05,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1143720.0, ans=0.125 2023-10-03 04:50:07,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:50:07,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:50:07,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:50:08,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. 
Duration: 0.41 2023-10-03 04:50:08,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1143720.0, ans=0.1 2023-10-03 04:50:15,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:50:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:15,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1143786.6666666667, ans=0.0 2023-10-03 04:50:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:50:19,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:50:19,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:50:19,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 04:50:19,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:50:21,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1143786.6666666667, ans=0.125 2023-10-03 04:50:22,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:50:22,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:50:22,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 04:50:22,508 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 04:50:25,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:50:31,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 04:50:36,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:50:38,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:38,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 04:50:41,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:50:41,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:50:41,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:50:41,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:50:41,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 04:50:45,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:50:45,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 04:50:46,766 INFO [train.py:1046] (1/4) Epoch 33, batch 1600, loss[loss=0.1585, simple_loss=0.2476, pruned_loss=0.03467, over 24657.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2415, pruned_loss=0.04159, over 4713668.32 frames. ], batch size: 73, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:50:46,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 04:50:48,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 04:50:50,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:50:52,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 04:50:52,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:50:53,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1143920.0, ans=0.125 2023-10-03 04:50:54,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:50:56,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1143920.0, ans=0.125 2023-10-03 04:50:56,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1143920.0, ans=0.125 2023-10-03 04:50:59,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:51:04,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 04:51:06,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:51:08,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 04:51:08,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:10,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 04:51:12,853 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.83 vs. limit=15.0 2023-10-03 04:51:15,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 04:51:21,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:51:21,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. 
Duration: 0.56 2023-10-03 04:51:22,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:51:22,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:51:22,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:51:22,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1144053.3333333333, ans=0.125 2023-10-03 04:51:24,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 04:51:26,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 04:51:30,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:51:30,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:30,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:31,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:51:34,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 04:51:34,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1144120.0, ans=0.125 2023-10-03 04:51:35,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:51:37,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:51:42,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1144120.0, ans=0.1 2023-10-03 04:51:43,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:45,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:51:46,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 04:51:46,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:51:48,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 04:51:53,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:51:54,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 04:51:56,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:51:56,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 04:51:57,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 04:51:57,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 04:51:57,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 04:52:00,267 INFO [train.py:1046] (1/4) Epoch 33, batch 1650, loss[loss=0.1486, simple_loss=0.2334, pruned_loss=0.03193, over 24514.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2418, pruned_loss=0.04162, over 4718695.77 frames. ], batch size: 63, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:52:01,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:52:01,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:52:01,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:01,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 04:52:03,800 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.880e+02 2.031e+02 2.196e+02 3.045e+02, threshold=4.062e+02, percent-clipped=0.0 2023-10-03 04:52:05,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:52:06,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 04:52:08,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:52:08,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:52:08,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:52:08,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:52:09,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1144253.3333333333, ans=0.0 2023-10-03 04:52:09,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1144253.3333333333, ans=0.1 2023-10-03 04:52:10,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 04:52:10,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-10-03 04:52:12,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1144253.3333333333, ans=0.1 2023-10-03 04:52:16,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:52:18,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:52:22,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1144320.0, ans=0.05 2023-10-03 04:52:26,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 04:52:26,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:29,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 04:52:32,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:52:36,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:52:36,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:52:36,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:52:37,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:52:37,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:39,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1144386.6666666667, ans=0.0 2023-10-03 04:52:41,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:52:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:43,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:52:43,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:52:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:45,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:52:48,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:52:49,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 04:52:51,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 04:52:51,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 04:52:52,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 04:52:52,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 04:52:52,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:53,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:52:54,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:52:55,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:55,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 04:52:58,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:53:00,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:53:00,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:53:04,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. 
Duration: 0.96 2023-10-03 04:53:09,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:53:09,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:53:09,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 04:53:09,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:53:09,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:53:09,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:53:14,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:53:14,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:53:14,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 04:53:16,255 INFO [train.py:1046] (1/4) Epoch 33, batch 1700, loss[loss=0.1579, simple_loss=0.2456, pruned_loss=0.03508, over 24484.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2413, pruned_loss=0.04168, over 4717894.26 frames. ], batch size: 66, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:53:18,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 04:53:26,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:53:26,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1144586.6666666667, ans=0.0 2023-10-03 04:53:28,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:53:29,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1144653.3333333333, ans=0.1 2023-10-03 04:53:29,481 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.61 vs. limit=15.0 2023-10-03 04:53:34,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:53:34,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:53:34,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1144653.3333333333, ans=0.125 2023-10-03 04:53:35,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:53:35,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:53:37,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. 
Duration: 0.94 2023-10-03 04:53:40,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:53:40,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:53:42,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:53:44,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:53:45,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 04:53:45,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 04:53:47,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:53:47,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1144720.0, ans=0.035 2023-10-03 04:53:49,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 04:53:49,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1144720.0, ans=0.0 2023-10-03 04:53:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 04:53:58,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1144786.6666666667, ans=0.125 2023-10-03 04:54:00,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:01,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:03,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:54:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:54:04,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 04:54:04,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:54:04,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1144786.6666666667, ans=10.0 2023-10-03 04:54:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 04:54:07,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:54:07,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:07,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:07,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:11,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:11,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:54:11,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:13,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:54:13,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:54:19,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 04:54:23,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:24,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 04:54:24,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1144853.3333333333, ans=0.1 2023-10-03 04:54:26,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 04:54:30,278 INFO [train.py:1046] (1/4) Epoch 33, batch 1750, loss[loss=0.1773, simple_loss=0.2461, pruned_loss=0.0543, over 23855.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2403, pruned_loss=0.04147, over 4712118.69 frames. ], batch size: 164, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:54:30,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:32,923 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.900e+02 2.115e+02 2.470e+02 3.706e+02, threshold=4.230e+02, percent-clipped=0.0 2023-10-03 04:54:33,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:33,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:54:34,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 04:54:34,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:37,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:54:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:54:42,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 04:54:42,899 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.13 vs. limit=22.5 2023-10-03 04:54:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:47,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 04:54:47,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:49,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:54:53,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 04:54:53,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 04:54:56,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:54:56,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 04:55:01,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1145053.3333333333, ans=0.07 2023-10-03 04:55:03,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 04:55:04,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:04,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:55:07,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:07,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:55:07,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1145053.3333333333, ans=0.0 2023-10-03 04:55:10,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:55:11,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:13,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:55:13,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:55:15,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 04:55:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:55:20,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. 
Duration: 0.8 2023-10-03 04:55:21,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:55:22,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:55:24,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:55:26,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.16 vs. limit=15.0 2023-10-03 04:55:28,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:55:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:55:28,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1145186.6666666667, ans=0.125 2023-10-03 04:55:29,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:29,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:55:35,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:55:37,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 04:55:39,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:55:40,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 04:55:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:40,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:55:40,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:55:40,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:55:40,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:55:42,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:55:43,425 INFO [train.py:1046] (1/4) Epoch 33, batch 1800, loss[loss=0.1742, simple_loss=0.2487, pruned_loss=0.0498, over 23738.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2393, pruned_loss=0.04132, over 4713411.56 frames. ], batch size: 164, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:55:45,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:55:45,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:55:48,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:55:50,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:52,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 04:55:53,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:55:56,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:55:59,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:55:59,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:56:00,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:56:03,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:56:03,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 04:56:03,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:56:03,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1145320.0, ans=0.2 2023-10-03 04:56:07,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:10,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 04:56:10,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1145320.0, ans=0.0 2023-10-03 04:56:12,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 04:56:12,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 04:56:13,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:13,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:56:13,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:56:15,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:56:24,636 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. 
Duration: 0.896 2023-10-03 04:56:24,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1145386.6666666667, ans=0.1 2023-10-03 04:56:26,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:56:26,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:28,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 04:56:28,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 04:56:28,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:56:29,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:56:31,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:56:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 04:56:40,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:56:42,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 04:56:42,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 04:56:42,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:43,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:56:44,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 04:56:46,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:56:46,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:56:50,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 04:56:50,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:50,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1145520.0, ans=0.125 2023-10-03 04:56:52,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:56:52,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:56:52,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:54,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 04:56:54,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:56:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:56:57,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:56:59,222 INFO [train.py:1046] (1/4) Epoch 33, batch 1850, loss[loss=0.1753, simple_loss=0.2486, pruned_loss=0.05097, over 23682.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2396, pruned_loss=0.04104, over 4721098.57 frames. ], batch size: 232, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:57:00,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:57:00,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:57:03,506 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.866e+02 2.062e+02 2.280e+02 4.556e+02, threshold=4.123e+02, percent-clipped=1.0 2023-10-03 04:57:06,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:57:06,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 04:57:07,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1145586.6666666667, ans=0.2 2023-10-03 04:57:08,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. 
Duration: 0.82 2023-10-03 04:57:11,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 04:57:16,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:57:16,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 04:57:16,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:57:17,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-03 04:57:18,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1145653.3333333333, ans=0.0 2023-10-03 04:57:26,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:57:27,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 04:57:30,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:57:30,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:57:35,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 04:57:35,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:57:35,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:57:38,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:57:41,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:57:42,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:57:44,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1145786.6666666667, ans=0.1 2023-10-03 04:57:45,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:57:45,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:57:45,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 04:57:45,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:57:48,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:57:50,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 04:57:52,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 04:57:53,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:57:56,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:57:58,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:57:58,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 04:57:58,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 04:57:59,937 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 04:58:01,350 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 04:58:01,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1145853.3333333333, ans=0.2 2023-10-03 04:58:04,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:58:04,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:58:04,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 04:58:04,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:04,230 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 04:58:04,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:58:04,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:05,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:58:07,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:58:08,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:58:09,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 04:58:10,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1145853.3333333333, ans=0.2 2023-10-03 04:58:11,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 04:58:11,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 04:58:12,487 INFO [train.py:1046] (1/4) Epoch 33, batch 1900, loss[loss=0.1687, simple_loss=0.2475, pruned_loss=0.04493, over 24665.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2405, pruned_loss=0.04131, over 4717460.09 frames. ], batch size: 65, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:58:12,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:58:16,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:58:21,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:58:23,142 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 04:58:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 04:58:25,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:58:26,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:58:26,560 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 04:58:26,584 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 04:58:26,904 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:58:30,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-10-03 04:58:32,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:58:34,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1145986.6666666667, ans=0.025 2023-10-03 04:58:36,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 04:58:38,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 04:58:43,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1146053.3333333333, ans=0.125 2023-10-03 04:58:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 04:58:46,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1146053.3333333333, ans=0.125 2023-10-03 04:58:49,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 04:58:49,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 04:58:49,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.49 vs. limit=10.0 2023-10-03 04:58:51,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-10-03 04:58:51,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 04:58:52,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 04:58:52,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:58:58,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 04:59:00,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:59:05,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:59:05,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 04:59:06,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:59:10,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 04:59:10,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:59:12,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.60 vs. limit=15.0 2023-10-03 04:59:14,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 04:59:14,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:59:16,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:59:16,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:59:16,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1146186.6666666667, ans=0.125 2023-10-03 04:59:19,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:59:19,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 04:59:19,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:59:23,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:59:23,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:59:25,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:59:25,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 04:59:25,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:59:26,644 INFO [train.py:1046] (1/4) Epoch 33, batch 1950, loss[loss=0.1653, simple_loss=0.2522, pruned_loss=0.03922, over 24624.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2409, pruned_loss=0.04114, over 4733039.22 frames. ], batch size: 73, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:59:26,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:59:30,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:59:32,077 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.922e+02 2.146e+02 2.746e+02 4.413e+02, threshold=4.292e+02, percent-clipped=1.0 2023-10-03 04:59:33,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:59:33,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:33,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:59:35,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 04:59:36,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:59:37,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 04:59:37,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:41,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:59:41,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:59:41,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:43,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:59:46,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:59:46,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:59:46,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:59:46,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:50,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:54,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:59:54,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 04:59:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:59:54,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 04:59:55,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:59:55,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:59:55,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:56,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-10-03 05:00:02,294 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.39 vs. limit=22.5 2023-10-03 05:00:03,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:00:04,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:00:07,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:00:12,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 05:00:12,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:00:12,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 05:00:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:00:16,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1146453.3333333333, ans=0.125 2023-10-03 05:00:17,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:00:19,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:00:19,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:00:26,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:26,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:29,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:32,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:35,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 05:00:35,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:37,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 05:00:37,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:00:38,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:00:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 05:00:40,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.71 vs. limit=15.0 2023-10-03 05:00:41,736 INFO [train.py:1046] (1/4) Epoch 33, batch 2000, loss[loss=0.1634, simple_loss=0.2415, pruned_loss=0.04267, over 23479.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2412, pruned_loss=0.04114, over 4740881.15 frames. ], batch size: 134, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:00:41,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:00:41,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1146586.6666666667, ans=0.1 2023-10-03 05:00:45,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:00:46,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 05:00:46,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:00:48,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:00:49,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:51,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 05:00:51,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:00:53,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:00:58,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 05:00:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:01:02,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:01:04,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:01:05,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 05:01:05,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:01:06,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1146653.3333333333, ans=0.0 2023-10-03 05:01:08,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:08,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:10,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 05:01:11,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:01:13,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 05:01:13,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:01:15,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:01:16,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:01:16,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:18,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:01:18,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 05:01:19,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 05:01:22,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 05:01:22,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:01:22,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:27,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:27,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:01:27,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:01:29,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:01:30,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:01:31,147 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:01:32,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:32,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 05:01:32,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:36,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:01:38,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 05:01:44,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:01:46,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:48,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:48,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:01:52,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:54,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1146853.3333333333, ans=0.5 2023-10-03 05:01:55,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 05:01:55,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:55,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:01:56,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:01:58,185 INFO [train.py:1046] (1/4) Epoch 33, batch 2050, loss[loss=0.1736, simple_loss=0.2602, pruned_loss=0.04354, over 24411.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2409, pruned_loss=0.04103, over 4732470.41 frames. ], batch size: 77, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:01:59,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:59,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:02,867 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.906e+02 2.037e+02 2.269e+02 3.118e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-03 05:02:02,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:02:03,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:09,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:02:11,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 05:02:11,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:13,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:02:13,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1146986.6666666667, ans=0.125 2023-10-03 05:02:15,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 05:02:15,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:02:15,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:02:17,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:02:27,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:02:27,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:02:29,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 05:02:30,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:02:32,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. 
Duration: 0.82 2023-10-03 05:02:32,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:02:35,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:02:38,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:02:39,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:02:40,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:02:41,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:02:43,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:02:43,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:02:43,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1147120.0, ans=0.0 2023-10-03 05:02:45,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1147120.0, ans=0.125 2023-10-03 05:02:46,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:02:47,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1147120.0, ans=0.1 2023-10-03 05:02:48,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:02:48,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1147120.0, ans=0.0 2023-10-03 05:02:49,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:02:50,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:02:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:02:58,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:02:59,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 05:03:02,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:03:04,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:03:05,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:03:07,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. 
Duration: 0.91 2023-10-03 05:03:10,481 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 05:03:10,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:10,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:03:10,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1147186.6666666667, ans=0.125 2023-10-03 05:03:11,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:03:13,694 INFO [train.py:1046] (1/4) Epoch 33, batch 2100, loss[loss=0.1456, simple_loss=0.1947, pruned_loss=0.04829, over 19354.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2393, pruned_loss=0.04106, over 4716938.47 frames. ], batch size: 389, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:03:13,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:03:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 05:03:15,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 05:03:16,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:03:19,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 05:03:19,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:03:22,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:23,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:03:23,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 05:03:25,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:03:25,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 05:03:25,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 05:03:27,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:27,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:03:27,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 05:03:27,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:03:35,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. 
Duration: 0.68 2023-10-03 05:03:35,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:03:38,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:03:38,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:03:40,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:03:41,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 05:03:43,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:43,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 05:03:44,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 05:03:46,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:46,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 05:03:46,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 05:03:47,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. 
Duration: 0.94 2023-10-03 05:03:49,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:03:49,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:03:52,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:03:53,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:03:55,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:56,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:56,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 05:03:56,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:56,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:56,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1147453.3333333333, ans=0.2 2023-10-03 05:03:57,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:57,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. 
Duration: 0.76 2023-10-03 05:03:59,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 05:04:01,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 05:04:04,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:04:07,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:04:07,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 05:04:10,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1147453.3333333333, ans=0.125 2023-10-03 05:04:13,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:14,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:04:15,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:04:15,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:04:15,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 05:04:16,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 05:04:18,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:18,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:04:20,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:04:20,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:22,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 05:04:23,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 05:04:23,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:25,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:04:25,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:04:26,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:04:26,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:04:27,907 INFO [train.py:1046] (1/4) Epoch 33, batch 2150, loss[loss=0.1646, simple_loss=0.2313, pruned_loss=0.04896, over 22780.00 frames. 
], tot_loss[loss=0.1603, simple_loss=0.2389, pruned_loss=0.04088, over 4701841.03 frames. ], batch size: 322, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 05:04:32,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 05:04:33,937 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.872e+02 2.028e+02 2.280e+02 3.324e+02, threshold=4.056e+02, percent-clipped=0.0 2023-10-03 05:04:34,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:35,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:37,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:04:37,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:04:40,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:40,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:04:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:04:44,126 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.22 vs. 
limit=15.0 2023-10-03 05:04:44,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:44,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 05:04:49,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:04:50,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:04:52,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:52,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:04:52,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:52,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1147653.3333333333, ans=0.0 2023-10-03 05:04:53,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:04:53,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:53,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:04:54,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:04:54,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 05:04:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:04:57,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:59,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:00,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:05:02,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:05:05,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:05:06,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:05:08,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:08,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 05:05:08,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:05:09,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 05:05:10,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:12,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:05:12,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:05:13,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:15,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:15,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 05:05:17,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 05:05:17,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:05:19,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 05:05:19,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:19,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:05:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. 
Duration: 0.64 2023-10-03 05:05:20,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:05:20,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 05:05:20,680 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 05:05:20,681 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 05:05:20,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 05:05:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:22,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:05:22,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1147786.6666666667, ans=0.025 2023-10-03 05:05:22,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1147786.6666666667, ans=0.125 2023-10-03 05:05:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:05:23,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1147786.6666666667, ans=0.0 2023-10-03 05:05:24,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:05:24,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:05:26,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:26,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:37,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:05:37,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 05:05:41,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:05:42,626 INFO [train.py:1046] (1/4) Epoch 33, batch 2200, loss[loss=0.1637, simple_loss=0.2492, pruned_loss=0.03908, over 24443.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2389, pruned_loss=0.04112, over 4704147.83 frames. ], batch size: 66, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 05:05:45,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:47,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:05:47,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 05:05:47,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1147920.0, ans=0.125 2023-10-03 05:05:48,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:05:51,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:51,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:51,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 05:05:55,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.19 vs. limit=15.0 2023-10-03 05:05:58,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 05:06:00,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:06:04,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 05:06:08,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:08,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:06:09,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 05:06:13,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:06:14,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 05:06:16,311 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.59 vs. limit=15.0 2023-10-03 05:06:16,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:06:18,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:20,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 05:06:22,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:06:24,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:06:26,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:06:27,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:28,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 05:06:30,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:06:31,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 05:06:34,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:06:34,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:37,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:06:37,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:06:38,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:40,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:06:41,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:06:43,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:06:46,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-03 05:06:46,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:06:48,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:06:49,794 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 05:06:51,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:06:52,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 05:06:54,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:06:54,406 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 05:06:55,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:56,982 INFO [train.py:1046] (1/4) Epoch 33, batch 2250, loss[loss=0.174, simple_loss=0.2475, pruned_loss=0.05025, over 23675.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.24, pruned_loss=0.04129, over 4706169.78 frames. ], batch size: 232, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:06:57,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:06:58,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:59,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. 
Duration: 0.9045625 2023-10-03 05:07:01,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:07:02,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.842e+02 2.039e+02 2.201e+02 2.888e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 05:07:02,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:07:08,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:07:10,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1148320.0, ans=0.125 2023-10-03 05:07:10,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1148320.0, ans=0.0 2023-10-03 05:07:11,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:07:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:14,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:07:16,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:07:18,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. 
Duration: 0.93 2023-10-03 05:07:18,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:07:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:07:21,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 05:07:23,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:07:23,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:26,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:07:30,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:07:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:07:32,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:07:33,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 05:07:34,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:36,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:07:39,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:07:41,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:07:42,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:07:42,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:07:45,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:07:45,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1148453.3333333333, ans=0.125 2023-10-03 05:07:45,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=1148453.3333333333, ans=0.02 2023-10-03 05:07:46,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:07:51,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:07:54,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-10-03 05:07:55,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=1148520.0, ans=22.5 2023-10-03 05:08:00,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:08:00,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:08:02,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:08:06,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:08:09,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:08:09,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 05:08:09,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:09,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:08:10,314 INFO [train.py:1046] (1/4) Epoch 33, batch 2300, loss[loss=0.1736, simple_loss=0.2579, pruned_loss=0.04462, over 23774.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2407, pruned_loss=0.0415, over 4717196.07 frames. ], batch size: 85, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:08:13,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. 
Duration: 0.78 2023-10-03 05:08:15,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:08:15,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:21,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:22,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:08:22,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 05:08:23,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1148586.6666666667, ans=0.0 2023-10-03 05:08:26,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:29,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1148653.3333333333, ans=0.0 2023-10-03 05:08:34,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:08:34,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:08:34,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:08:36,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:08:36,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 05:08:36,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:08:37,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:08:38,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:08:41,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:08:44,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:08:47,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:08:52,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:08:53,876 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.82 vs. limit=15.0 2023-10-03 05:08:54,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:57,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 05:08:58,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:09:01,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:09:01,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:09:02,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:09:02,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 05:09:05,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:09:05,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:07,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:07,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:09:09,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:09:09,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1148853.3333333333, ans=0.07 2023-10-03 05:09:10,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-03 05:09:10,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:09:10,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 05:09:10,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:09:10,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:11,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 05:09:17,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:09:22,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:09:25,772 INFO [train.py:1046] (1/4) Epoch 33, batch 2350, loss[loss=0.1663, simple_loss=0.2493, pruned_loss=0.04169, over 23250.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2425, pruned_loss=0.04252, over 4703963.70 frames. ], batch size: 105, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:09:27,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:09:27,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:09:27,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. 
Duration: 0.77775 2023-10-03 05:09:28,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:09:28,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:09:28,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:09:28,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 05:09:31,608 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.933e+02 2.128e+02 2.511e+02 4.744e+02, threshold=4.255e+02, percent-clipped=2.0 2023-10-03 05:09:34,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:09:34,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 05:09:40,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 05:09:43,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:46,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:46,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:46,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 05:09:46,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:09:47,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 05:09:51,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:09:54,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 05:09:57,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:10:00,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:10:00,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:10:02,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:10:04,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 05:10:05,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:10:07,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:10:07,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:10:08,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:10:08,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1149120.0, ans=0.125 2023-10-03 05:10:11,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:10:14,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 05:10:14,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:10:16,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:10:16,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:10:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 05:10:18,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:10:22,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 05:10:22,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:10:27,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-10-03 05:10:31,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 05:10:31,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:10:31,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:10:31,338 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 05:10:31,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 05:10:32,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.74 vs. limit=10.0 2023-10-03 05:10:34,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 05:10:35,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1149186.6666666667, ans=0.2 2023-10-03 05:10:37,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:10:39,598 INFO [train.py:1046] (1/4) Epoch 33, batch 2400, loss[loss=0.1399, simple_loss=0.2191, pruned_loss=0.03032, over 24565.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2419, pruned_loss=0.0422, over 4705928.02 frames. ], batch size: 60, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:10:41,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 05:10:45,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:10:47,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:10:49,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 05:10:49,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 05:10:55,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:10:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:10:58,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 05:10:59,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:10:59,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:01,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 05:11:04,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:05,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-03 05:11:05,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1149320.0, ans=0.0 2023-10-03 05:11:09,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:11:14,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 05:11:18,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:11:18,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:11:24,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 05:11:24,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1149453.3333333333, ans=0.125 2023-10-03 05:11:25,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:11:27,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1149453.3333333333, ans=0.07 2023-10-03 05:11:29,435 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:11:30,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:11:33,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:11:36,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:11:36,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.92 vs. limit=22.5 2023-10-03 05:11:37,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:11:37,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:11:37,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:11:37,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:38,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:11:38,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:11:42,628 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:11:43,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:11:43,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 05:11:43,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 05:11:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 05:11:48,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:11:49,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=1149520.0, ans=12.0 2023-10-03 05:11:49,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:49,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 05:11:49,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 05:11:51,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 05:11:51,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 05:11:52,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 05:11:52,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:11:54,465 INFO [train.py:1046] (1/4) Epoch 33, batch 2450, loss[loss=0.1685, simple_loss=0.2395, pruned_loss=0.04872, over 23675.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2407, pruned_loss=0.04196, over 4707704.96 frames. 
], batch size: 232, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:11:54,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:54,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:11:54,634 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 05:11:56,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:57,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:11:59,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:11:59,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:12:01,673 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.804e+02 1.981e+02 2.295e+02 3.038e+02, threshold=3.963e+02, percent-clipped=0.0 2023-10-03 05:12:03,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:03,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:04,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 05:12:08,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:12:08,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:12:11,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:12:11,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:12:13,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 05:12:16,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:18,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:12:18,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:12:21,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:12:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:24,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 05:12:26,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 05:12:27,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:12:32,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1149720.0, ans=0.125 2023-10-03 05:12:33,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:36,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:36,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:12:36,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:12:37,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:38,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:12:38,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 05:12:42,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:42,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 05:12:45,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:12:46,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:12:52,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:12:52,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 05:12:54,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:12:55,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:12:55,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 05:12:55,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:12:55,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:12:59,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:13:01,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:13:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 05:13:07,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 05:13:07,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:13:08,717 INFO [train.py:1046] (1/4) Epoch 33, batch 2500, loss[loss=0.1477, simple_loss=0.2358, pruned_loss=0.02983, over 24468.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2393, pruned_loss=0.0414, over 4705572.01 frames. ], batch size: 63, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:13:10,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1149920.0, ans=0.0 2023-10-03 05:13:14,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:13:22,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:13:22,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:13:24,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:13:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 05:13:29,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:13:30,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:13:32,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:13:32,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:13:34,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 05:13:35,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:35,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:13:35,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 05:13:35,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 05:13:37,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:41,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:13:42,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:13:45,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 05:13:45,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 05:13:47,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:13:49,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:51,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:57,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:57,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1150120.0, ans=0.125 2023-10-03 05:13:58,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:14:02,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:14:07,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 05:14:07,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:14:07,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:14:08,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 05:14:08,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:14:09,923 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 05:14:09,924 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 05:14:09,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 05:14:10,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1150186.6666666667, ans=0.125 2023-10-03 05:14:12,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:14:14,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 05:14:14,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 05:14:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:14:16,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 05:14:19,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 05:14:22,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 05:14:23,575 INFO [train.py:1046] (1/4) Epoch 33, batch 2550, loss[loss=0.1631, simple_loss=0.2345, pruned_loss=0.04589, over 22764.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2398, pruned_loss=0.0411, over 4712016.86 frames. ], batch size: 322, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:14:25,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:14:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:14:28,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:14:29,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 05:14:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:14:31,059 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.976e+02 2.259e+02 2.608e+02 3.805e+02, threshold=4.518e+02, percent-clipped=0.0 2023-10-03 05:14:33,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 05:14:35,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:14:38,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:14:39,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:14:40,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 05:14:40,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:14:40,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:14:40,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:14:43,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:14:43,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 05:14:44,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:14:44,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:14:44,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 05:14:58,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:15:02,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:02,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:15:02,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:15:03,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:15:10,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:15:13,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:15:13,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:15:13,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:15:14,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:15:15,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:15:16,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:16,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:22,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:15:22,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-03 05:15:22,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:15:22,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:22,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:15:24,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:15:24,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1150520.0, ans=10.0 2023-10-03 05:15:24,479 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:15:25,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1150520.0, ans=0.2 2023-10-03 05:15:26,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:15:34,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:15:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:15:37,244 INFO [train.py:1046] (1/4) Epoch 33, batch 2600, loss[loss=0.1517, simple_loss=0.23, pruned_loss=0.03668, over 24642.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2404, pruned_loss=0.04089, over 4714032.89 frames. 
], batch size: 60, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:15:37,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 05:15:40,324 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 05:15:40,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:15:41,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 05:15:41,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 05:15:42,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 05:15:45,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:45,763 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 05:15:48,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 05:15:49,482 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.09 vs. limit=15.0 2023-10-03 05:15:50,373 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 05:15:51,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:15:53,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. 
Duration: 0.82 2023-10-03 05:15:55,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 05:15:56,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:15:58,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 05:15:58,353 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 05:16:00,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 05:16:00,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1150653.3333333333, ans=0.05 2023-10-03 05:16:06,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:06,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:06,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:16:06,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 05:16:08,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:16:14,760 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. 
Duration: 0.768 2023-10-03 05:16:19,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1150720.0, ans=0.0 2023-10-03 05:16:20,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:20,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:22,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 05:16:22,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:16:22,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:16:23,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 05:16:25,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:16:25,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:16:27,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:16:31,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 05:16:33,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:16:33,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:16:37,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:16:40,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:16:40,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 05:16:41,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:43,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:16:44,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:16:48,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 05:16:50,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:51,609 INFO [train.py:1046] (1/4) Epoch 33, batch 2650, loss[loss=0.1665, simple_loss=0.2478, pruned_loss=0.04261, over 23328.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2408, pruned_loss=0.04106, over 4717260.52 frames. ], batch size: 105, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:16:51,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 05:16:53,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1150920.0, ans=0.0 2023-10-03 05:16:55,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 05:16:55,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:16:58,440 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 05:16:58,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:16:59,709 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.879e+02 2.016e+02 2.278e+02 3.478e+02, threshold=4.033e+02, percent-clipped=0.0 2023-10-03 05:16:59,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:17:00,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1150920.0, ans=0.0 2023-10-03 05:17:02,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:17:04,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:17:06,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:17:07,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1150986.6666666667, ans=15.0 2023-10-03 05:17:07,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 05:17:07,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:17:08,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:17:11,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 05:17:11,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1150986.6666666667, ans=0.1 2023-10-03 05:17:14,359 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 05:17:15,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:17:17,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 05:17:17,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:18,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 05:17:21,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 05:17:21,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:17:21,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:28,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 05:17:28,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1151053.3333333333, ans=0.125 2023-10-03 05:17:29,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 05:17:31,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:17:36,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 05:17:36,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:36,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:38,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:17:38,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:17:38,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:17:40,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:17:41,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:17:42,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:17:44,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:17:45,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:17:46,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:47,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.20 vs. limit=15.0 2023-10-03 05:17:48,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:17:48,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:51,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 05:17:51,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:17:53,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:54,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-10-03 05:17:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:17:55,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:55,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 05:17:58,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:18:00,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:01,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:03,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:04,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 05:18:04,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:06,234 INFO [train.py:1046] (1/4) Epoch 33, batch 2700, loss[loss=0.1506, simple_loss=0.2347, pruned_loss=0.03327, over 24492.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2409, pruned_loss=0.04113, over 4721173.28 frames. ], batch size: 66, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:18:06,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:18:06,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 05:18:08,082 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:18:08,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1151253.3333333333, ans=0.1 2023-10-03 05:18:10,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:18:13,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 05:18:15,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:18:15,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:15,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:18:16,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:18:16,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:18:16,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:18:16,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:18:17,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 05:18:18,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:18:19,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:18:19,968 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-03 05:18:20,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:18:22,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:24,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:18:26,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. 
Duration: 0.98 2023-10-03 05:18:27,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=15.0 2023-10-03 05:18:27,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:18:30,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:18:30,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:18:36,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:18:36,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:18:38,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:18:38,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:18:41,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:18:42,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:18:42,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 05:18:43,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:18:46,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1151386.6666666667, ans=0.2 2023-10-03 05:18:48,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:48,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:18:52,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1151453.3333333333, ans=0.1 2023-10-03 05:18:55,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:18:55,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:18:59,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:18:59,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:01,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1151453.3333333333, ans=0.125 2023-10-03 05:19:01,863 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.76 vs. 
limit=15.0 2023-10-03 05:19:03,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:19:04,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:04,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:19:06,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:08,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:19:09,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:19:11,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:19:12,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:19:12,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:19:15,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 05:19:15,850 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.35 vs. 
limit=15.0 2023-10-03 05:19:16,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:17,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:19:19,328 INFO [train.py:1046] (1/4) Epoch 33, batch 2750, loss[loss=0.1693, simple_loss=0.2542, pruned_loss=0.04221, over 23978.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2417, pruned_loss=0.0421, over 4708332.82 frames. ], batch size: 86, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:19:19,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 05:19:19,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 05:19:20,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:23,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:23,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:26,307 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.013e+02 2.192e+02 2.661e+02 5.400e+02, threshold=4.383e+02, percent-clipped=1.0 2023-10-03 05:19:26,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:26,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 05:19:26,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:29,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:19:29,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:19:31,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:19:31,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:31,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 05:19:31,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:19:31,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:39,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 05:19:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:19:41,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:42,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 05:19:42,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:19:43,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:45,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:19:45,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:46,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:47,180 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.55 vs. limit=6.0 2023-10-03 05:19:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:19:50,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:19:50,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:19:50,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1151720.0, ans=0.0 2023-10-03 05:19:51,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:19:53,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:19:53,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1151720.0, ans=0.1 2023-10-03 05:19:58,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:20:00,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:20:00,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:05,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:20:05,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:20:05,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:20:09,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1151786.6666666667, ans=0.95 2023-10-03 05:20:11,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:20:11,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 05:20:11,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 05:20:15,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:18,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 05:20:22,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:20:24,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:20:25,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 05:20:25,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:20:28,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:20:28,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 05:20:28,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:20:30,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1151853.3333333333, ans=0.0 2023-10-03 05:20:31,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 05:20:31,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:33,189 INFO [train.py:1046] (1/4) Epoch 33, batch 2800, loss[loss=0.134, simple_loss=0.2149, pruned_loss=0.02657, over 24638.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2396, pruned_loss=0.04162, over 4713360.26 frames. ], batch size: 60, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:20:33,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:20:33,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 05:20:33,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:20:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:36,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:20:36,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 05:20:36,555 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 05:20:41,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:41,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 05:20:42,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:20:45,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:20:46,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 05:20:49,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 05:20:51,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 05:20:52,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:52,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:20:52,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:20:56,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:20:58,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:58,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:20:59,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:21:07,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1152053.3333333333, ans=0.125 2023-10-03 05:21:08,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:21:10,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=1152053.3333333333, ans=10.0 2023-10-03 05:21:11,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:21:11,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:13,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:21:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:19,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:21:19,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 05:21:19,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:20,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 05:21:20,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:21:22,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1152120.0, ans=0.1 2023-10-03 05:21:24,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:24,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:26,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:21:29,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:21:29,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:29,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:21:31,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:21:31,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:21:31,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:21:31,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. 
Duration: 0.92 2023-10-03 05:21:31,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:21:32,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:21:34,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:21:34,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 05:21:35,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:35,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:21:36,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:21:37,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1152186.6666666667, ans=0.125 2023-10-03 05:21:38,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 05:21:44,048 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.00 vs. limit=22.5 2023-10-03 05:21:44,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 05:21:44,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:21:46,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:21:47,512 INFO [train.py:1046] (1/4) Epoch 33, batch 2850, loss[loss=0.1628, simple_loss=0.2326, pruned_loss=0.04648, over 23515.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2395, pruned_loss=0.04165, over 4714696.18 frames. ], batch size: 256, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:21:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:21:49,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-10-03 05:21:51,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:21:51,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:21:51,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:54,430 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.804e+02 2.039e+02 2.498e+02 3.555e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 05:21:54,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:21:54,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:57,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:21:57,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 05:22:05,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 05:22:05,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:06,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 05:22:08,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:10,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 05:22:10,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 05:22:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:25,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:22:27,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 05:22:27,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:22:28,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:22:28,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:22:28,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:22:30,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:22:30,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 05:22:33,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:22:33,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:22:35,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:22:35,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:36,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:22:36,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:22:38,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:41,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:22:42,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:22:44,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:45,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:22:52,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:22:54,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 05:22:54,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 05:22:55,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:22:55,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:22:55,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. 
Duration: 0.56 2023-10-03 05:22:57,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:22:57,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:22:57,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:22:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:22:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 05:22:57,352 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 05:22:57,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:22:58,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:01,886 INFO [train.py:1046] (1/4) Epoch 33, batch 2900, loss[loss=0.157, simple_loss=0.2359, pruned_loss=0.03905, over 23677.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2401, pruned_loss=0.04161, over 4727723.96 frames. ], batch size: 149, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:23:04,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:23:04,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:23:04,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1152586.6666666667, ans=0.125 2023-10-03 05:23:04,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1152586.6666666667, ans=0.125 2023-10-03 05:23:06,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:23:06,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 05:23:08,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1152586.6666666667, ans=0.1 2023-10-03 05:23:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:23:10,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 05:23:12,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 05:23:14,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:23:14,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:23:15,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:23:17,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:23:19,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1152653.3333333333, ans=0.025 2023-10-03 05:23:21,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:23:21,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:23:24,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:23:24,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 05:23:24,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:23:27,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:28,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 05:23:29,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 05:23:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:23:32,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 05:23:32,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 05:23:34,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:23:34,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:23:37,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:23:39,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:42,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:23:43,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1152720.0, ans=0.125 2023-10-03 05:23:43,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1152720.0, ans=0.0 2023-10-03 05:23:46,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:23:48,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 05:23:48,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 05:23:48,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:23:52,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 05:23:54,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 05:23:55,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:23:59,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:24:04,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1152853.3333333333, ans=0.1 2023-10-03 05:24:07,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:24:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:24:08,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 05:24:11,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:11,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 05:24:11,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:24:13,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:24:15,900 INFO [train.py:1046] (1/4) Epoch 33, batch 2950, loss[loss=0.1572, simple_loss=0.248, pruned_loss=0.03316, over 24548.00 frames. 
], tot_loss[loss=0.1623, simple_loss=0.2412, pruned_loss=0.04175, over 4727505.92 frames. ], batch size: 71, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:24:18,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:24:20,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 05:24:20,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:24:20,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:22,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:24:23,321 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.788e+02 1.939e+02 2.108e+02 3.552e+02, threshold=3.878e+02, percent-clipped=0.0 2023-10-03 05:24:23,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:24:24,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 05:24:26,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 05:24:26,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:24:26,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:24:31,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:24:31,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1152986.6666666667, ans=0.125 2023-10-03 05:24:33,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:24:36,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:24:36,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:24:39,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:24:39,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:24:41,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:42,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:42,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:24:45,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. 
Duration: 0.86 2023-10-03 05:24:51,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 05:24:51,851 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 05:24:53,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:24:56,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 05:24:56,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 05:24:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:24:57,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:24:57,365 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 05:24:57,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:24:58,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 05:25:00,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 05:25:00,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1153120.0, ans=0.125 2023-10-03 05:25:01,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:25:03,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:25:04,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:25:04,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:04,353 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 05:25:04,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:25:05,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 05:25:09,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-10-03 05:25:11,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:13,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:25:13,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. 
Duration: 0.57 2023-10-03 05:25:13,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:25:14,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 05:25:17,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:25:20,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:25:20,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:25:22,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:22,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:25:25,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:25:25,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:25,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:25:26,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:25:26,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:25:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:25:29,295 INFO [train.py:1046] (1/4) Epoch 33, batch 3000, loss[loss=0.1686, simple_loss=0.2392, pruned_loss=0.04902, over 23638.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2419, pruned_loss=0.04204, over 4732918.83 frames. ], batch size: 149, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:25:29,295 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 05:25:41,088 INFO [train.py:1078] (1/4) Epoch 33, validation: loss=0.3581, simple_loss=0.2789, pruned_loss=0.2187, over 1125622.00 frames. 2023-10-03 05:25:41,089 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 05:25:41,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:41,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 05:25:43,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:45,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:25:46,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:25:49,009 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 05:25:49,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. 
Duration: 0.98 2023-10-03 05:25:52,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:25:53,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:25:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 05:25:53,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:25:56,736 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:25:59,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:26:04,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1153320.0, ans=15.0 2023-10-03 05:26:11,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:26:12,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1153386.6666666667, ans=0.0 2023-10-03 05:26:15,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1153386.6666666667, ans=0.0 2023-10-03 05:26:19,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 05:26:19,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 05:26:21,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:26:21,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:26:21,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:26:25,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:26:25,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 05:26:26,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 05:26:27,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:26:29,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:26:30,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:26:32,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:26:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:32,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 05:26:35,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1153453.3333333333, ans=0.1 2023-10-03 05:26:36,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:26:36,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:26:36,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:26:39,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:26:41,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 05:26:41,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:26:41,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:26:41,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1153520.0, ans=0.2 2023-10-03 05:26:42,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:26:45,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:45,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 05:26:46,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-10-03 05:26:47,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 05:26:47,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 05:26:48,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:26:48,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 05:26:48,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:26:50,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 05:26:53,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:26:53,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:26:55,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 05:26:55,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 05:26:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-03 05:26:56,575 INFO [train.py:1046] (1/4) Epoch 33, batch 3050, loss[loss=0.1761, simple_loss=0.2413, pruned_loss=0.05544, over 23446.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2426, pruned_loss=0.04213, over 4735743.42 frames. ], batch size: 285, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:26:56,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:26:58,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:58,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:26:58,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:26:58,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:27:00,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 05:27:02,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:27:03,730 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.901e+02 2.048e+02 2.289e+02 3.124e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 05:27:05,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:05,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 05:27:05,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1153586.6666666667, ans=0.1 2023-10-03 05:27:08,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:12,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 05:27:18,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1153653.3333333333, ans=0.125 2023-10-03 05:27:19,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 05:27:21,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 05:27:21,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:27:27,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:27,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:27,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:30,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 05:27:30,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:27:30,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:31,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:33,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:34,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:36,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1153720.0, ans=0.0 2023-10-03 05:27:40,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:40,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 05:27:40,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:40,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:27:43,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 05:27:44,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:27:44,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:27:44,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:27:48,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:50,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:27:54,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:55,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:27:55,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:57,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:27:57,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:27:58,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:28:00,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. 
Duration: 0.67 2023-10-03 05:28:01,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:28:01,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:01,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 05:28:03,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.82 vs. limit=15.0 2023-10-03 05:28:04,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:28:09,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:28:10,401 INFO [train.py:1046] (1/4) Epoch 33, batch 3100, loss[loss=0.1667, simple_loss=0.2445, pruned_loss=0.04444, over 23616.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2429, pruned_loss=0.04241, over 4728978.82 frames. ], batch size: 106, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:28:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:28:14,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:28:15,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 05:28:19,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-03 05:28:19,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 05:28:21,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:28:23,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:28:23,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:26,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 05:28:28,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1153986.6666666667, ans=0.125 2023-10-03 05:28:30,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:35,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 05:28:39,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1154053.3333333333, ans=0.1 2023-10-03 05:28:40,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:28:40,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:28:41,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:28:41,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:28:41,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 05:28:44,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:28:44,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 05:28:44,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:28:46,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:47,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 05:28:48,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1154053.3333333333, ans=0.2 2023-10-03 05:28:49,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:28:53,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:28:55,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. 
Duration: 0.89 2023-10-03 05:28:55,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 05:28:55,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1154120.0, ans=0.125 2023-10-03 05:28:56,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:57,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:59,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:28:59,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:59,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:29:00,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:29:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:29:03,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:29:03,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:03,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:29:03,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 05:29:07,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:29:08,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 05:29:11,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:29:11,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 05:29:12,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:12,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 05:29:23,092 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0 2023-10-03 05:29:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 05:29:24,786 INFO [train.py:1046] (1/4) Epoch 33, batch 3150, loss[loss=0.1563, simple_loss=0.2248, pruned_loss=0.04391, over 23897.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2409, pruned_loss=0.04191, over 4714200.25 frames. 
], batch size: 195, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:29:26,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:27,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:27,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1154253.3333333333, ans=0.125 2023-10-03 05:29:28,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-10-03 05:29:29,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:29:29,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:29:29,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 05:29:30,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:31,692 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.823e+02 1.977e+02 2.153e+02 2.836e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-03 05:29:31,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:29:33,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-10-03 05:29:34,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:35,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1154253.3333333333, ans=0.125 2023-10-03 05:29:38,066 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 05:29:39,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 05:29:39,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:29:40,882 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 05:29:42,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 05:29:42,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 05:29:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 05:29:43,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 05:29:43,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:43,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 05:29:43,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:45,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 05:29:46,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:46,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:48,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:49,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:29:54,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 05:29:55,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:29:57,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:29:58,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:58,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 05:30:01,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. 
Duration: 0.59 2023-10-03 05:30:01,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:30:03,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:30:03,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:30:04,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:30:04,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:30:06,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:30:06,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:30:08,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 05:30:09,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:30:09,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:09,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1154453.3333333333, ans=0.1 2023-10-03 05:30:10,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 05:30:10,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:30:12,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 05:30:12,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:14,959 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.74 vs. limit=12.0 2023-10-03 05:30:15,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 05:30:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:17,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 05:30:18,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 05:30:18,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:30:19,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:19,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1154453.3333333333, ans=0.125 2023-10-03 05:30:21,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. 
Duration: 0.91 2023-10-03 05:30:22,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 05:30:23,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:30:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:30:25,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:25,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:30:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:30:32,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.99 vs. limit=22.5 2023-10-03 05:30:32,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:33,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-10-03 05:30:35,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 05:30:39,160 INFO [train.py:1046] (1/4) Epoch 33, batch 3200, loss[loss=0.1513, simple_loss=0.2204, pruned_loss=0.04108, over 23960.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2398, pruned_loss=0.04136, over 4711501.83 frames. 
], batch size: 196, lr: 3.09e-03, grad_scale: 32.0 2023-10-03 05:30:40,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:30:40,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:30:43,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:45,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:30:45,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 05:30:46,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:50,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:30:55,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:31:03,608 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.85 vs. limit=15.0 2023-10-03 05:31:04,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:31:13,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. 
Duration: 0.88 2023-10-03 05:31:14,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:31:16,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 05:31:17,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:31:20,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:31:22,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:31:23,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:31:26,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 05:31:28,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 05:31:29,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 05:31:29,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1154786.6666666667, ans=0.125 2023-10-03 05:31:32,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 05:31:35,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 05:31:41,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:31:41,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:31:41,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:31:42,672 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 05:31:42,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:31:45,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:31:47,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 05:31:47,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 05:31:49,397 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-10-03 05:31:49,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 05:31:51,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 05:31:51,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 05:31:53,018 INFO [train.py:1046] (1/4) Epoch 33, batch 3250, loss[loss=0.1583, simple_loss=0.2282, pruned_loss=0.04416, over 23762.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2396, pruned_loss=0.04128, over 4722103.16 frames. ], batch size: 212, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:31:53,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1154920.0, ans=0.125 2023-10-03 05:31:54,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:31:54,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 05:31:55,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:31:55,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:31:57,375 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 05:32:00,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:32:01,521 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.860e+02 2.002e+02 2.246e+02 3.741e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-03 05:32:02,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:32:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:32:09,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 05:32:10,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:11,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:32:11,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:32:12,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:32:14,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:32:15,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:15,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:32:15,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:17,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:17,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:17,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:32:18,932 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.15 vs. limit=15.0 2023-10-03 05:32:19,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:21,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1155053.3333333333, ans=0.1 2023-10-03 05:32:21,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1155053.3333333333, ans=0.125 2023-10-03 05:32:22,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:32:24,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:24,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:27,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:27,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:32:27,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 05:32:30,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1155053.3333333333, ans=0.04949747468305833 2023-10-03 05:32:31,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 05:32:31,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:32:31,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:32:31,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1155053.3333333333, ans=0.1 2023-10-03 05:32:33,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1155053.3333333333, ans=0.1 2023-10-03 05:32:34,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:34,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:32:40,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:32:49,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:32:49,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:32:49,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 05:32:49,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:32:49,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:32:50,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:52,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 05:32:53,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 05:32:53,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:32:55,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:56,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:32:57,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 05:32:57,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:33:01,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:33:01,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:33:03,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 05:33:03,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:06,447 INFO [train.py:1046] (1/4) Epoch 33, batch 3300, loss[loss=0.1518, simple_loss=0.2272, pruned_loss=0.03822, over 23300.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2404, pruned_loss=0.04134, over 4729694.73 frames. ], batch size: 51, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:33:06,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:33:06,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 05:33:09,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:33:09,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 05:33:12,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 05:33:12,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 05:33:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 05:33:15,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:33:17,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:33:17,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:20,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:33:20,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:33:22,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:22,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:33:27,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 05:33:27,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.85 vs. limit=15.0 2023-10-03 05:33:28,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:33:28,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 05:33:28,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1155320.0, ans=0.0 2023-10-03 05:33:29,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:31,334 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 05:33:31,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:33:32,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:33:32,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:33:32,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:33:34,146 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 05:33:36,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:36,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:33:37,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1155386.6666666667, ans=0.025 2023-10-03 05:33:37,652 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. 
limit=12.0 2023-10-03 05:33:38,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:38,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 05:33:40,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.91 vs. limit=15.0 2023-10-03 05:33:41,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 05:33:41,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:42,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:33:44,216 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 05:33:45,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 05:33:47,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:33:47,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1155386.6666666667, ans=0.125 2023-10-03 05:33:49,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 05:33:52,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 05:33:53,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:33:53,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:33:57,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:33:58,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:58,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:58,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:33:59,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.71 vs. limit=8.0 2023-10-03 05:34:01,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:34:01,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:34:02,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:34:02,552 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 05:34:03,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. 
Duration: 0.55 2023-10-03 05:34:05,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:34:06,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:34:06,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:07,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:34:07,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:09,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:34:09,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:09,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:34:11,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:34:12,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:34:14,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 05:34:15,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 05:34:15,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:18,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:34:18,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:34:18,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:21,349 INFO [train.py:1046] (1/4) Epoch 33, batch 3350, loss[loss=0.1728, simple_loss=0.2407, pruned_loss=0.05245, over 23654.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2417, pruned_loss=0.04193, over 4716152.02 frames. ], batch size: 232, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:34:21,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:21,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:24,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:34:27,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:28,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 05:34:30,134 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.952e+02 2.092e+02 2.361e+02 3.355e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-03 05:34:30,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:31,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:34:33,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:34,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:34:35,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 05:34:37,141 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 05:34:38,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:41,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 05:34:41,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 05:34:43,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:34:43,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 05:34:43,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:34:44,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 05:34:44,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:44,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:34:46,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:48,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:48,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:49,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:34:53,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:34:57,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:57,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:34:59,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1155720.0, ans=0.125 2023-10-03 05:35:01,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:35:02,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:35:04,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:35:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:06,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:09,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 05:35:09,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:35:09,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 05:35:09,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:35:11,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 05:35:12,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:35:14,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:35:20,844 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.60 vs. limit=22.5 2023-10-03 05:35:21,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:22,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 05:35:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:35:22,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:35:24,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:35:26,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1155853.3333333333, ans=0.1 2023-10-03 05:35:29,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:35:31,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 05:35:31,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 05:35:32,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:35:32,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1155853.3333333333, ans=0.0 2023-10-03 05:35:33,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:35:33,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 05:35:33,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:33,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 05:35:35,248 INFO [train.py:1046] (1/4) Epoch 33, batch 3400, loss[loss=0.1706, simple_loss=0.2465, pruned_loss=0.0474, over 23490.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2421, pruned_loss=0.042, over 4722212.26 frames. ], batch size: 106, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:35:36,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:35:36,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:35:36,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:35:37,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 05:35:39,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 05:35:44,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 05:35:44,068 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 05:35:44,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:35:48,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:35:48,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:35:48,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:35:49,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:35:56,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:35:58,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 05:36:03,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1155986.6666666667, ans=0.125 2023-10-03 05:36:04,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 05:36:04,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:36:04,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1156053.3333333333, ans=0.09899494936611666 2023-10-03 05:36:05,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:36:07,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:36:12,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:36:15,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 05:36:15,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1156053.3333333333, ans=0.0 2023-10-03 05:36:17,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1156053.3333333333, ans=0.125 2023-10-03 05:36:21,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:36:22,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:36:22,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 05:36:22,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:36:22,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:36:24,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:36:25,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:36:27,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:36:32,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:36:32,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:36:32,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1156120.0, ans=0.125 2023-10-03 05:36:36,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:36:37,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 05:36:43,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:36:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. 
Duration: 0.68 2023-10-03 05:36:47,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1156186.6666666667, ans=0.0 2023-10-03 05:36:48,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 05:36:49,749 INFO [train.py:1046] (1/4) Epoch 33, batch 3450, loss[loss=0.1606, simple_loss=0.2581, pruned_loss=0.03157, over 24673.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2423, pruned_loss=0.04215, over 4708955.58 frames. ], batch size: 73, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:36:49,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:36:52,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:36:52,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 05:36:52,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:36:57,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:36:57,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1156253.3333333333, ans=0.2 2023-10-03 05:36:59,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.812e+02 1.985e+02 2.188e+02 3.320e+02, threshold=3.971e+02, percent-clipped=0.0 2023-10-03 05:37:00,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 05:37:02,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:03,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:37:03,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:05,854 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-10-03 05:37:06,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:08,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1156320.0, ans=0.0 2023-10-03 05:37:13,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 05:37:16,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.14 vs. limit=15.0 2023-10-03 05:37:17,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 05:37:18,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:37:19,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:37:20,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:37:26,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 05:37:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:37:28,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1156386.6666666667, ans=0.2 2023-10-03 05:37:31,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:37:31,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:37:31,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1156386.6666666667, ans=0.1 2023-10-03 05:37:32,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:37:33,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:37:35,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 05:37:35,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:37:36,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:37:36,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1156453.3333333333, ans=0.2 2023-10-03 05:37:39,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:37:41,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 05:37:43,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1156453.3333333333, ans=0.0 2023-10-03 05:37:44,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:37:49,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:37:52,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:53,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:37:56,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1156520.0, ans=0.1 2023-10-03 05:38:00,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:00,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:38:00,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 05:38:00,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:38:03,160 INFO [train.py:1046] (1/4) Epoch 33, batch 3500, loss[loss=0.1663, simple_loss=0.2414, pruned_loss=0.04558, over 18082.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2406, pruned_loss=0.04158, over 4708025.98 frames. ], batch size: 39, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:38:05,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:08,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:38:08,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 05:38:11,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:38:14,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:38:16,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:16,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 05:38:20,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:38:20,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:38:21,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:38:21,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:38:23,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:38:23,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:23,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:38:24,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 05:38:27,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:27,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:38:30,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:38:34,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:35,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 05:38:35,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 05:38:38,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:38:38,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:38:39,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:40,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:38:40,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:38:41,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1156720.0, ans=0.0 2023-10-03 05:38:42,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 05:38:45,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 05:38:45,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 05:38:45,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:38:47,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:48,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 05:38:48,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:38:49,495 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.14 vs. limit=15.0 2023-10-03 05:38:52,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:38:52,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:38:58,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:39:00,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 05:39:00,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 05:39:02,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:02,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1156853.3333333333, ans=0.125 2023-10-03 05:39:04,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:39:04,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 05:39:07,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:10,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 05:39:11,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:39:11,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:39:13,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 05:39:14,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 05:39:16,001 INFO [train.py:1046] (1/4) Epoch 33, batch 3550, loss[loss=0.167, simple_loss=0.2349, pruned_loss=0.04959, over 23831.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2393, pruned_loss=0.04137, over 4708107.03 frames. ], batch size: 212, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:39:18,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:19,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:39:19,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:19,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:39:23,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:39:24,904 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.866e+02 2.004e+02 2.192e+02 2.799e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-03 05:39:27,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.74 vs. limit=10.0 2023-10-03 05:39:28,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1156920.0, ans=0.035 2023-10-03 05:39:31,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:31,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 05:39:36,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:39:36,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:39:38,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:39,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:39:39,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 05:39:39,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1156986.6666666667, ans=0.125 2023-10-03 05:39:42,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:42,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:39:43,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:43,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:39:44,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:39:48,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:39:48,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:48,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1157053.3333333333, ans=0.125 2023-10-03 05:39:49,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:39:49,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:39:49,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:39:49,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 05:39:51,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:52,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:52,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:39:55,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1157053.3333333333, ans=0.0 2023-10-03 05:39:58,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:59,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:40:01,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:01,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1157120.0, ans=0.1 2023-10-03 05:40:02,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 05:40:02,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 05:40:04,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 05:40:04,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:40:06,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1157120.0, ans=0.0 2023-10-03 05:40:07,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:40:07,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:40:10,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 05:40:10,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:40:16,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:40:16,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 05:40:16,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:16,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1157186.6666666667, ans=0.125 2023-10-03 05:40:20,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:40:22,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 05:40:23,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1157186.6666666667, ans=0.1 2023-10-03 05:40:26,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 05:40:27,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:40:28,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:40:30,817 INFO [train.py:1046] (1/4) Epoch 33, batch 3600, loss[loss=0.1599, simple_loss=0.2359, pruned_loss=0.04202, over 23635.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.239, pruned_loss=0.04119, over 4708972.67 frames. ], batch size: 232, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:40:30,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:30,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:32,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:40:36,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:40:38,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:40:39,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:40:41,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:40:41,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1157253.3333333333, ans=0.125 2023-10-03 05:40:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:42,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 05:40:45,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:40:45,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:50,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:40:52,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:40:52,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:40:54,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:40:54,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 05:40:55,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:40:57,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:57,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:40:57,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1157320.0, ans=0.125 2023-10-03 05:41:00,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:03,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:41:03,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:41:04,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 05:41:10,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:41:13,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:41:13,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. 
Duration: 0.95 2023-10-03 05:41:15,364 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:41:17,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:41:23,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:26,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:34,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:41:34,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:41:34,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 05:41:36,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 05:41:36,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1157520.0, ans=0.125 2023-10-03 05:41:37,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 05:41:39,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:41:40,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 05:41:42,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 05:41:42,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:41:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:41:42,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:41:44,537 INFO [train.py:1046] (1/4) Epoch 33, batch 3650, loss[loss=0.1614, simple_loss=0.2385, pruned_loss=0.04217, over 23546.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2395, pruned_loss=0.04108, over 4715701.25 frames. ], batch size: 256, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:41:44,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 05:41:45,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 05:41:48,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:50,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. 
Duration: 0.69 2023-10-03 05:41:53,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1157586.6666666667, ans=0.2 2023-10-03 05:41:54,152 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.881e+02 2.096e+02 2.323e+02 3.707e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-03 05:41:55,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 05:41:55,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:41:59,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 05:42:01,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 05:42:06,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:42:06,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:42:06,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:42:10,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:42:11,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:42:11,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-10-03 05:42:13,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:42:13,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:42:13,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 05:42:15,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:42:16,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:42:16,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:18,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:42:19,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 05:42:20,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 05:42:22,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:42:24,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 05:42:26,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:42:26,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:42:28,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1157786.6666666667, ans=0.0 2023-10-03 05:42:29,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:42:32,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:32,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:42:35,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:42:35,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:42:38,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:42:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:42:41,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:42:41,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:42:42,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.55 vs. limit=15.0 2023-10-03 05:42:43,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:42:44,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.89 vs. limit=15.0 2023-10-03 05:42:45,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:45,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:42:48,889 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.43 vs. limit=22.5 2023-10-03 05:42:51,927 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 05:42:54,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:42:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:42:54,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1157853.3333333333, ans=0.125 2023-10-03 05:42:56,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 05:42:56,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:42:57,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:42:58,797 INFO [train.py:1046] (1/4) Epoch 33, batch 3700, loss[loss=0.164, simple_loss=0.2385, pruned_loss=0.04473, over 23773.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2401, pruned_loss=0.04092, over 4729543.51 frames. ], batch size: 232, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:42:58,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:01,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 05:43:01,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:43:03,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:43:06,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:43:06,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:43:09,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:09,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. 
Duration: 0.94 2023-10-03 05:43:09,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:43:10,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1157920.0, ans=0.125 2023-10-03 05:43:11,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:43:11,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:43:13,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1157986.6666666667, ans=0.0 2023-10-03 05:43:14,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:43:14,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1157986.6666666667, ans=0.125 2023-10-03 05:43:17,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:43:17,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:19,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:43:19,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:43:20,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:43:22,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:24,824 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 05:43:30,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:43:32,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:43:32,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:43:32,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 05:43:33,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:43:35,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:36,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 05:43:37,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:40,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:43:42,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:42,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:43:44,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:43:48,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.07 vs. limit=22.5 2023-10-03 05:43:49,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:43:49,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 05:43:50,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:50,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 05:43:54,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:43:54,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:43:54,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1158120.0, ans=10.0 2023-10-03 05:43:57,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 05:43:57,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 05:43:58,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:43:58,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:44:00,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:44:00,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:44:04,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:44:06,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 05:44:06,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 05:44:07,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:44:07,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:08,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:44:09,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 05:44:11,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:44:13,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:44:13,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:44:15,091 INFO [train.py:1046] (1/4) Epoch 33, batch 3750, loss[loss=0.2307, simple_loss=0.293, pruned_loss=0.08416, over 19622.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2419, pruned_loss=0.0419, over 4709562.34 frames. ], batch size: 388, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:44:16,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 05:44:17,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 05:44:18,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.65 vs. limit=15.0 2023-10-03 05:44:20,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:44:22,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 05:44:22,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:44:23,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 05:44:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:24,744 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.972e+02 2.162e+02 2.375e+02 3.648e+02, threshold=4.325e+02, percent-clipped=0.0 2023-10-03 05:44:26,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:44:26,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1158253.3333333333, ans=0.0 2023-10-03 05:44:28,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:44:33,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:44:35,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1158320.0, ans=0.125 2023-10-03 05:44:36,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:44:37,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:44:41,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:44:41,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. 
Duration: 0.96 2023-10-03 05:44:42,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:44:44,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:44:44,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:44:46,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1158386.6666666667, ans=0.125 2023-10-03 05:44:47,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 05:44:52,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 05:44:53,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:44:53,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:44:55,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1158386.6666666667, ans=0.1 2023-10-03 05:44:56,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:44:59,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 05:45:01,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:45:02,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 05:45:05,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:09,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:45:11,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:45:14,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:45:17,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:45:20,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:45:21,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1158520.0, ans=0.125 2023-10-03 05:45:23,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:45:23,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:45:25,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 05:45:29,740 INFO [train.py:1046] (1/4) Epoch 33, batch 3800, loss[loss=0.1685, simple_loss=0.2569, pruned_loss=0.04003, over 24666.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2404, pruned_loss=0.04111, over 4716304.67 frames. ], batch size: 73, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:45:33,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:45:36,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:45:36,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:45:36,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1158586.6666666667, ans=0.125 2023-10-03 05:45:38,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 05:45:38,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:41,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:45:41,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:45:44,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 05:45:44,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:45:44,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.21 vs. limit=15.0 2023-10-03 05:45:46,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:45:47,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:47,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:45:47,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:45:51,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 05:45:55,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 05:45:55,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:45:56,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:45:59,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:45:59,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 05:46:02,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:46:02,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:46:03,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:04,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:46:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:46:09,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 05:46:09,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1158720.0, ans=0.0 2023-10-03 05:46:11,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:46:12,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1158786.6666666667, ans=0.0 2023-10-03 05:46:17,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:46:22,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 05:46:22,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1158786.6666666667, ans=0.0 2023-10-03 05:46:25,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 05:46:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 05:46:27,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:46:29,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:46:29,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:31,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 05:46:35,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 05:46:35,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 05:46:35,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:36,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:46:40,642 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.31 vs. 
limit=15.0 2023-10-03 05:46:42,948 INFO [train.py:1046] (1/4) Epoch 33, batch 3850, loss[loss=0.1457, simple_loss=0.2291, pruned_loss=0.03113, over 24606.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2399, pruned_loss=0.04116, over 4711071.35 frames. ], batch size: 60, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:46:43,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:46:44,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:46:49,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:46:49,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 05:46:51,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:46:51,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:54,123 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.876e+02 2.130e+02 2.387e+02 3.615e+02, threshold=4.261e+02, percent-clipped=0.0 2023-10-03 05:46:54,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:46:57,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:00,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 05:47:01,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 05:47:07,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:08,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:47:11,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:11,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:47:12,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff3.min_abs, batch_count=1159053.3333333333, ans=0.2 2023-10-03 05:47:13,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:13,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:47:14,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:47:15,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:17,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:47:19,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:19,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:47:19,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 05:47:20,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 05:47:22,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:25,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:25,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:25,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 05:47:26,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 05:47:27,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:29,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. 
Duration: 0.95 2023-10-03 05:47:32,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:47:34,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1159120.0, ans=0.0 2023-10-03 05:47:36,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:38,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:42,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:42,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 05:47:45,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 05:47:46,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:46,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:51,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:47:51,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:47:51,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 05:47:52,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:52,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:47:52,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 05:47:52,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:55,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 05:47:55,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:55,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:56,818 INFO [train.py:1046] (1/4) Epoch 33, batch 3900, loss[loss=0.1598, simple_loss=0.2325, pruned_loss=0.04357, over 23674.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2393, pruned_loss=0.04093, over 4707735.17 frames. ], batch size: 256, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:47:56,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:47:58,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:00,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 05:48:00,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:48:00,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:48:00,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:48:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 05:48:01,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:04,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:48:05,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:48:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:48:07,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:48:08,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:48:08,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:10,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 05:48:12,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 05:48:12,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:48:14,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 05:48:14,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:16,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 05:48:19,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 05:48:24,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:48:25,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:48:25,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:48:25,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:48:30,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:48:31,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 05:48:31,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1159386.6666666667, ans=0.125 2023-10-03 05:48:35,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:48:35,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:48:35,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:48:41,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:48:42,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:48:46,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1159453.3333333333, ans=0.125 2023-10-03 05:48:48,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:48:50,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:48:59,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1159520.0, ans=0.125 2023-10-03 05:49:00,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:49:02,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:49:02,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 05:49:04,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 05:49:04,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:49:05,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 05:49:06,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:49:07,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 05:49:11,133 INFO [train.py:1046] (1/4) Epoch 33, batch 3950, loss[loss=0.16, simple_loss=0.2341, pruned_loss=0.04291, over 23789.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04081, over 4707940.75 frames. ], batch size: 212, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:49:14,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:49:15,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 05:49:17,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 05:49:17,768 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=15.0 2023-10-03 05:49:19,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:49:19,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:49:21,203 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.827e+02 2.038e+02 2.266e+02 3.676e+02, threshold=4.077e+02, percent-clipped=0.0 2023-10-03 05:49:26,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1159653.3333333333, ans=0.125 2023-10-03 05:49:27,654 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 05:49:28,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:49:28,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 05:49:30,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 05:49:30,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:49:32,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1159653.3333333333, ans=0.2 2023-10-03 05:49:33,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:49:33,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:49:33,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:49:35,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 05:49:36,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:49:38,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:49:38,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:49:39,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:49:39,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:49:44,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1159720.0, ans=0.125 2023-10-03 05:49:48,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:49:48,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:49:52,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-10-03 05:49:59,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 05:49:59,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 05:50:00,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:50:00,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:50:00,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1159786.6666666667, ans=0.0 2023-10-03 05:50:03,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1159786.6666666667, ans=0.0 2023-10-03 05:50:08,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:50:08,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:50:08,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:50:08,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:50:08,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 05:50:12,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 05:50:14,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:50:18,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 05:50:25,605 INFO [train.py:1046] (1/4) Epoch 33, batch 4000, loss[loss=0.1533, simple_loss=0.2479, pruned_loss=0.02934, over 24679.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2393, pruned_loss=0.04137, over 4682437.05 frames. ], batch size: 68, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:50:25,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1159920.0, ans=0.125 2023-10-03 05:50:28,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:31,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1159920.0, ans=0.125 2023-10-03 05:50:35,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:41,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1159986.6666666667, ans=0.125 2023-10-03 05:50:42,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:50:42,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:50:42,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 05:50:44,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 05:50:44,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:50:46,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 05:50:46,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:50:46,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 05:50:47,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:50:50,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:50:50,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:50:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:50:51,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:50:51,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:50:53,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 05:50:53,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.84 vs. limit=15.0 2023-10-03 05:50:55,749 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 05:50:55,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:50:57,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:50:59,223 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 05:50:59,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:50:59,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:51:02,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1160053.3333333333, ans=0.0 2023-10-03 05:51:04,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 05:51:05,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:51:07,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:51:09,249 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. 
Duration: 0.768 2023-10-03 05:51:10,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:51:11,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 05:51:11,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:51:12,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:51:13,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:51:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:51:15,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:51:16,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:51:19,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 05:51:19,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:51:20,675 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 05:51:23,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 05:51:26,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 05:51:28,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:51:30,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:51:30,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:51:32,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:51:34,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1160186.6666666667, ans=0.0 2023-10-03 05:51:37,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:51:38,828 INFO [train.py:1046] (1/4) Epoch 33, batch 4050, loss[loss=0.1448, simple_loss=0.2287, pruned_loss=0.03047, over 24344.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2396, pruned_loss=0.04085, over 4703539.54 frames. ], batch size: 61, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:51:38,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:51:40,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 05:51:41,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 05:51:42,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:51:42,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:51:44,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:51:45,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:51:49,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:51:50,390 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.813e+02 1.958e+02 2.207e+02 3.335e+02, threshold=3.917e+02, percent-clipped=0.0 2023-10-03 05:51:53,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:51:53,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:51:54,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:51:54,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:51:58,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 05:52:00,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:52:05,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 05:52:07,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 05:52:07,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 05:52:10,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:52:17,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 05:52:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:52:22,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:52:24,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1160453.3333333333, ans=0.125 2023-10-03 05:52:25,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:52:25,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:52:25,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 05:52:29,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:52:30,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.88 vs. limit=15.0 2023-10-03 05:52:32,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 05:52:33,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:52:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:52:35,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 05:52:35,535 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:52:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:52:46,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 05:52:46,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:52:46,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:52:49,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-10-03 05:52:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 05:52:49,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:52:51,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:52:52,502 INFO [train.py:1046] (1/4) Epoch 33, batch 4100, loss[loss=0.2119, simple_loss=0.2742, pruned_loss=0.07475, over 19381.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.241, pruned_loss=0.04139, over 4704410.39 frames. ], batch size: 389, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:52:52,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:52:52,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:52:58,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 05:52:59,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 05:53:02,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 05:53:02,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 05:53:02,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:53:04,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:05,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-10-03 05:53:05,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:05,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:53:07,072 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 05:53:09,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:53:11,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:53:11,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:12,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:53:15,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:53:15,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.03 vs. 
limit=6.0 2023-10-03 05:53:16,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:53:17,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:53:17,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 05:53:19,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:19,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:53:19,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:53:19,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:53:20,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 05:53:23,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:53:24,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1160720.0, ans=10.0 2023-10-03 05:53:25,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 05:53:26,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:53:29,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:53:29,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 05:53:30,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:53:32,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:53:32,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:53:34,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 05:53:36,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:53:36,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:53:40,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 05:53:40,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:41,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:53:44,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 05:53:48,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:53:50,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:53:51,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:57,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:53:57,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:54:02,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:54:04,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:54:06,926 INFO [train.py:1046] (1/4) Epoch 33, batch 4150, loss[loss=0.156, simple_loss=0.2424, pruned_loss=0.03477, over 24440.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2411, pruned_loss=0.0411, over 4712539.80 frames. ], batch size: 69, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:54:07,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:54:07,616 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.79 vs. limit=15.0 2023-10-03 05:54:09,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 05:54:11,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:54:11,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:54:14,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 05:54:14,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:54:15,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 05:54:16,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 05:54:16,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 05:54:17,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:54:20,399 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.882e+02 2.112e+02 2.535e+02 4.235e+02, threshold=4.223e+02, percent-clipped=3.0 2023-10-03 05:54:21,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:54:21,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:54:25,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:54:25,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:54:26,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:54:29,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:54:29,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:54:30,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:54:34,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:54:38,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:54:38,328 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:54:39,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 05:54:41,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 05:54:41,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:54:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. 
Duration: 0.9 2023-10-03 05:54:43,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:54:43,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:54:46,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:54:47,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:54:50,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 05:54:51,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1161120.0, ans=0.125 2023-10-03 05:54:51,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1161120.0, ans=0.125 2023-10-03 05:54:53,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:54:54,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:54:56,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 05:54:57,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:54:57,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-10-03 05:55:00,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:55:00,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:55:01,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:03,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 05:55:03,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:03,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:55:04,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:55:05,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.81 vs. limit=15.0 2023-10-03 05:55:09,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 05:55:09,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:09,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:55:11,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 05:55:11,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 05:55:11,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:55:11,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:55:13,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:55:14,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:14,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 05:55:14,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:55:19,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:55:21,535 INFO [train.py:1046] (1/4) Epoch 33, batch 4200, loss[loss=0.1554, simple_loss=0.2094, pruned_loss=0.05074, over 19463.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2396, pruned_loss=0.04151, over 4686770.23 frames. ], batch size: 388, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:55:21,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. 
Duration: 0.94 2023-10-03 05:55:22,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1161253.3333333333, ans=0.0 2023-10-03 05:55:23,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:55:24,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:55:26,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:55:26,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:55:26,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:55:29,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 05:55:32,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 05:55:32,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.32 vs. limit=6.0 2023-10-03 05:55:33,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:33,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1161253.3333333333, ans=0.0 2023-10-03 05:55:34,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 05:55:37,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:55:39,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:55:43,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:55:43,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:43,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 05:55:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:55:43,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1161320.0, ans=0.1 2023-10-03 05:55:44,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:46,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:55:46,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:55:47,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:55:50,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. 
Duration: 0.78 2023-10-03 05:55:50,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:56,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:55:56,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:55:59,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:56:00,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:56:03,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:56:03,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 05:56:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:56:04,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:56:09,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:56:10,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 05:56:14,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1161453.3333333333, ans=0.125 2023-10-03 05:56:17,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:56:19,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1161453.3333333333, ans=0.125 2023-10-03 05:56:20,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 05:56:20,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1161520.0, ans=0.125 2023-10-03 05:56:21,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:56:24,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1161520.0, ans=0.125 2023-10-03 05:56:26,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:56:27,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:29,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 05:56:33,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 05:56:35,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1161586.6666666667, ans=0.125 2023-10-03 05:56:36,462 INFO [train.py:1046] (1/4) Epoch 33, batch 4250, loss[loss=0.168, simple_loss=0.2472, pruned_loss=0.04446, over 23376.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2384, pruned_loss=0.04081, over 4699185.64 frames. ], batch size: 105, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:56:37,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:56:37,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:56:40,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:44,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:56:45,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 05:56:45,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:56:48,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:49,885 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.839e+02 2.000e+02 2.260e+02 3.065e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-03 05:56:51,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:56:56,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:56:56,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:56:57,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:56:57,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:56:58,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:00,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:01,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:04,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:57:04,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:06,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 05:57:08,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 05:57:10,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 05:57:12,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:12,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:13,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:57:13,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:13,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:16,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1161720.0, ans=0.125 2023-10-03 05:57:17,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.66 vs. limit=15.0 2023-10-03 05:57:17,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:57:17,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:57:23,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:57:24,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:25,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. 
Duration: 0.65 2023-10-03 05:57:25,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:57:26,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 05:57:28,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:57:29,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:57:31,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:31,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:57:33,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 05:57:34,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:57:35,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:57:38,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:41,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:42,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 05:57:44,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:57:46,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:57:46,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1161853.3333333333, ans=0.1 2023-10-03 05:57:47,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:57:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:57:47,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 05:57:49,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:50,781 INFO [train.py:1046] (1/4) Epoch 33, batch 4300, loss[loss=0.1589, simple_loss=0.22, pruned_loss=0.04893, over 19534.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2378, pruned_loss=0.04077, over 4697429.17 frames. ], batch size: 388, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:57:54,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:57:54,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:57:55,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 05:58:01,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1161920.0, ans=0.125 2023-10-03 05:58:04,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:58:04,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 05:58:05,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:58:06,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:58:06,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:58:06,931 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 05:58:10,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:58:10,892 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.51 vs. limit=10.0 2023-10-03 05:58:11,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:58:14,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 05:58:14,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 05:58:14,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 05:58:17,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 05:58:19,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:58:22,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:58:22,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:58:23,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:58:24,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1162053.3333333333, ans=0.2 2023-10-03 05:58:25,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:58:26,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:58:26,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 05:58:28,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 05:58:30,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 05:58:33,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:33,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:58:33,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:33,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:58:33,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 05:58:33,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 05:58:33,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 05:58:35,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:58:35,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 05:58:36,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 05:58:36,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1162120.0, ans=0.125 2023-10-03 05:58:40,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 05:58:42,279 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 05:58:43,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:58:45,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:58:45,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:58:47,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 05:58:48,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:58:48,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:50,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:58:50,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:58:50,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:58:53,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 05:58:55,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1162186.6666666667, ans=0.2 2023-10-03 05:58:56,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:58:56,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:56,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:59:02,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 05:59:03,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:59:04,718 INFO [train.py:1046] (1/4) Epoch 33, batch 4350, loss[loss=0.1585, simple_loss=0.2322, pruned_loss=0.04243, over 23757.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2393, pruned_loss=0.04114, over 4700783.47 frames. ], batch size: 195, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:59:06,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:07,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:59:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:59:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 05:59:11,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1162253.3333333333, ans=0.0 2023-10-03 05:59:15,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:59:18,255 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.918e+02 2.252e+02 2.552e+02 4.017e+02, threshold=4.505e+02, percent-clipped=1.0 2023-10-03 05:59:19,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:59:23,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:59:23,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:59:25,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:59:27,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:59:27,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1162320.0, ans=0.125 2023-10-03 05:59:28,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:59:34,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-10-03 05:59:34,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:35,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:59:38,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:59:43,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 05:59:46,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:59:47,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:59:52,614 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 05:59:52,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:59:52,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:59:52,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1162453.3333333333, ans=0.0 2023-10-03 05:59:54,036 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 05:59:54,096 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-10-03 05:59:54,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:59:55,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:55,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:59:55,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1162453.3333333333, ans=15.0 2023-10-03 05:59:56,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:59:58,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:59:58,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:00:02,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 06:00:02,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:00:02,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:00:02,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 06:00:04,983 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 06:00:04,987 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 06:00:05,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 06:00:07,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:00:07,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:00:07,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:08,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1162520.0, ans=0.125 2023-10-03 06:00:09,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:00:10,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 06:00:13,355 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 06:00:13,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:00:17,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:00:17,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:18,436 INFO [train.py:1046] (1/4) Epoch 33, batch 4400, loss[loss=0.1486, simple_loss=0.2354, pruned_loss=0.03091, over 24523.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.24, pruned_loss=0.04161, over 4691294.86 frames. ], batch size: 63, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 06:00:19,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:00:21,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 06:00:22,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1162586.6666666667, ans=0.05 2023-10-03 06:00:23,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 06:00:23,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 06:00:23,211 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 06:00:24,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:00:24,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:00:27,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 06:00:29,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:30,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:30,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 06:00:34,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:34,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 06:00:36,121 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 06:00:39,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 06:00:40,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 06:00:41,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 06:00:41,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:41,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:00:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:00:43,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:00:46,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 06:00:46,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 06:00:46,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:48,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:00:48,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:50,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:50,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:50,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 06:00:50,869 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 06:00:57,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 06:01:02,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:01:04,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 06:01:07,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:01:08,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:01:11,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:01:11,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 06:01:11,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:01:11,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:01:11,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:01:13,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:01:18,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 06:01:20,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-10-03 06:01:21,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 06:01:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:01:21,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 06:01:21,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:01:27,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:01:28,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 06:01:28,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1162853.3333333333, ans=0.125 2023-10-03 06:01:30,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:01:30,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1162853.3333333333, ans=0.125 2023-10-03 06:01:33,221 INFO [train.py:1046] (1/4) Epoch 33, batch 4450, loss[loss=0.1532, simple_loss=0.2437, pruned_loss=0.03128, over 24286.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2406, pruned_loss=0.04158, over 4693284.59 frames. ], batch size: 74, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 06:01:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 06:01:33,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:01:40,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:01:41,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:01:43,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:45,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:01:47,590 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.835e+02 1.974e+02 2.226e+02 3.952e+02, threshold=3.948e+02, percent-clipped=0.0 2023-10-03 06:01:49,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:01:51,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:01:52,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 06:01:52,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:01:53,014 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.36 vs. 
limit=15.0 2023-10-03 06:01:53,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:53,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:01:53,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:01:55,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:01:59,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:01:59,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:01,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:02:01,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:02:01,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1163053.3333333333, ans=0.125 2023-10-03 06:02:02,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:02:07,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-03 06:02:08,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 06:02:08,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 06:02:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:02:11,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:02:11,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1163053.3333333333, ans=0.0 2023-10-03 06:02:12,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 06:02:17,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:02:22,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:22,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 06:02:22,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:02:22,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 06:02:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:02:25,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:28,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:02:28,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1163120.0, ans=0.125 2023-10-03 06:02:29,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 06:02:31,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:02:31,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1163186.6666666667, ans=0.1 2023-10-03 06:02:32,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:02:35,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:02:36,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:36,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-03 06:02:40,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:02:43,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 06:02:43,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1163186.6666666667, ans=0.0 2023-10-03 06:02:43,570 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.42 vs. limit=10.0 2023-10-03 06:02:44,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:02:47,076 INFO [train.py:1046] (1/4) Epoch 33, batch 4500, loss[loss=0.1571, simple_loss=0.2452, pruned_loss=0.03448, over 24308.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2416, pruned_loss=0.04227, over 4687282.27 frames. ], batch size: 74, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:02:48,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:02:50,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 06:02:50,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 06:02:53,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:02:57,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1163253.3333333333, ans=0.125 2023-10-03 06:02:58,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:03:00,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:03:00,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:03:01,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:01,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:01,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1163320.0, ans=0.125 2023-10-03 06:03:07,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1163320.0, ans=0.0 2023-10-03 06:03:10,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1163320.0, ans=0.125 2023-10-03 06:03:13,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:03:13,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:03:15,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:03:16,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:03:18,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:03:23,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:03:28,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:03:28,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1163386.6666666667, ans=0.125 2023-10-03 06:03:32,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:03:36,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:03:36,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 06:03:38,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:03:39,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:03:39,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:03:41,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:03:42,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:42,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 06:03:42,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:03:42,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:42,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1163453.3333333333, ans=0.05 2023-10-03 06:03:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:03:50,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:03:51,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:03:51,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1163520.0, ans=0.125 2023-10-03 06:03:54,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:03:54,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:03:54,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 06:03:57,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 06:03:57,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 06:04:00,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 06:04:02,236 INFO [train.py:1046] (1/4) Epoch 33, batch 4550, loss[loss=0.1529, simple_loss=0.21, pruned_loss=0.04786, over 19299.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2411, pruned_loss=0.04165, over 4699069.33 frames. ], batch size: 388, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:04:05,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 06:04:05,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:04:09,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:04:09,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:04:11,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:14,625 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.59 vs. limit=15.0 2023-10-03 06:04:15,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:04:16,479 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.864e+02 2.150e+02 2.511e+02 3.546e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-03 06:04:16,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:04:18,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:18,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:04:18,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:21,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:21,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 06:04:24,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:04:27,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 06:04:27,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 06:04:29,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:04:30,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 06:04:33,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 06:04:33,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:04:34,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1163720.0, ans=0.125 2023-10-03 06:04:37,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 06:04:39,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:04:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:43,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:04:43,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:04:44,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 06:04:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:04:49,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:50,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:04:50,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:52,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 06:04:52,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 06:04:53,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:04:53,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 06:04:55,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 06:04:55,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 06:04:57,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:57,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:04:58,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:58,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:05:00,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:05:01,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 06:05:03,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:05:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:05:03,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1163853.3333333333, ans=0.1 2023-10-03 06:05:04,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 06:05:04,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 06:05:04,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 06:05:07,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:05:08,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:05:10,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:05:10,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:05:10,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:05:11,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:05:13,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:05:15,549 INFO [train.py:1046] (1/4) Epoch 33, batch 4600, loss[loss=0.1305, simple_loss=0.1824, pruned_loss=0.03929, over 19011.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2392, pruned_loss=0.04123, over 4681305.07 frames. ], batch size: 388, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:05:16,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:17,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 06:05:19,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:05:19,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:05:20,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:21,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 06:05:23,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:05:29,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:05:29,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:32,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:32,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1163986.6666666667, ans=0.125 2023-10-03 06:05:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 06:05:40,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:42,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 06:05:45,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:05:45,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:51,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 06:05:51,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:05:51,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:05:57,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:57,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:05:57,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1164053.3333333333, ans=0.125 2023-10-03 06:05:59,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:06:03,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 06:06:05,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:06:07,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:06:08,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1164120.0, ans=0.1 2023-10-03 06:06:09,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:10,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:10,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 06:06:12,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:12,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 06:06:12,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:13,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:15,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:15,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:06:16,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:16,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. 
Duration: 0.87 2023-10-03 06:06:17,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 06:06:18,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 06:06:18,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:20,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:06:21,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:23,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:25,017 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:06:30,861 INFO [train.py:1046] (1/4) Epoch 33, batch 4650, loss[loss=0.1547, simple_loss=0.2445, pruned_loss=0.03247, over 24440.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.238, pruned_loss=0.04056, over 4684877.56 frames. ], batch size: 69, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:06:34,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:06:35,119 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=12.0 2023-10-03 06:06:36,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:06:36,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:37,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:06:38,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:38,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:06:39,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:39,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1164253.3333333333, ans=0.0 2023-10-03 06:06:42,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 06:06:45,057 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.854e+02 2.077e+02 2.382e+02 3.489e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-03 06:06:45,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:06:46,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 06:06:47,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:06:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 06:06:49,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:06:50,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 06:06:50,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 06:06:50,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:50,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:06:55,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:06:57,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:57,304 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 06:06:58,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:00,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 06:07:03,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:07:03,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:07:04,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 06:07:04,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:07:09,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:07:12,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:15,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:16,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:16,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1164453.3333333333, ans=0.025 2023-10-03 06:07:18,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:18,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:07:21,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 06:07:21,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. 
Duration: 0.83 2023-10-03 06:07:22,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 06:07:22,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 06:07:23,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1164453.3333333333, ans=0.2 2023-10-03 06:07:24,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:24,653 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. limit=6.0 2023-10-03 06:07:30,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:07:30,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:07:31,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 06:07:31,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:33,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:07:33,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 06:07:35,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1164520.0, ans=0.0 2023-10-03 06:07:36,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:07:37,033 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.02 vs. limit=22.5 2023-10-03 06:07:37,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:07:37,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:07:39,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:42,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:42,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:07:43,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:07:43,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 06:07:44,799 INFO [train.py:1046] (1/4) Epoch 33, batch 4700, loss[loss=0.1599, simple_loss=0.2417, pruned_loss=0.03899, over 24283.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2398, pruned_loss=0.04096, over 4700027.10 frames. 
], batch size: 61, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:07:44,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:07:46,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 06:07:52,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.51 vs. limit=22.5 2023-10-03 06:07:55,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:55,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:55,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1164586.6666666667, ans=0.125 2023-10-03 06:07:56,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:07:57,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:07:59,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:08:05,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 06:08:05,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-10-03 06:08:09,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:09,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:08:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:08:13,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1164720.0, ans=0.2 2023-10-03 06:08:13,742 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:08:14,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:17,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1164720.0, ans=0.95 2023-10-03 06:08:18,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:08:20,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 06:08:23,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:08:29,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 06:08:30,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 06:08:32,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:36,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 06:08:37,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:08:43,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:08:43,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 06:08:43,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:44,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:08:46,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:46,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1164853.3333333333, ans=0.2 2023-10-03 06:08:47,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:08:47,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 06:08:47,604 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. 
Duration: 0.768 2023-10-03 06:08:49,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:08:50,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:50,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:50,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 06:08:51,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:56,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 06:08:58,780 INFO [train.py:1046] (1/4) Epoch 33, batch 4750, loss[loss=0.1873, simple_loss=0.2699, pruned_loss=0.05236, over 24065.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2401, pruned_loss=0.04099, over 4701547.01 frames. ], batch size: 80, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:08:58,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:09:00,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:05,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:05,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 06:09:06,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 06:09:08,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 06:09:11,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1164920.0, ans=0.2 2023-10-03 06:09:12,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:09:12,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:09:13,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1164986.6666666667, ans=0.125 2023-10-03 06:09:14,314 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.898e+02 2.045e+02 2.268e+02 3.285e+02, threshold=4.090e+02, percent-clipped=0.0 2023-10-03 06:09:14,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:09:19,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 06:09:25,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:09:26,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. 
Duration: 0.96 2023-10-03 06:09:28,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:09:31,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:09:31,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:09:31,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1165053.3333333333, ans=0.125 2023-10-03 06:09:32,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:34,292 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 06:09:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 06:09:38,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 06:09:38,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.03 vs. limit=22.5 2023-10-03 06:09:40,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:42,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:09:45,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 06:09:45,603 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 06:09:45,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:09:48,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:09:51,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:09:51,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.88 vs. limit=15.0 2023-10-03 06:09:52,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 06:09:52,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 06:09:52,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:52,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:09:54,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:55,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 06:09:55,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. 
Duration: 0.71 2023-10-03 06:09:58,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 06:10:00,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:02,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:10:02,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 06:10:02,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:10:04,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:07,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:10:07,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:07,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:10:11,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:10:11,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 06:10:13,201 INFO [train.py:1046] (1/4) Epoch 33, batch 4800, loss[loss=0.182, simple_loss=0.2702, pruned_loss=0.04687, over 24058.00 frames. 
], tot_loss[loss=0.1617, simple_loss=0.2411, pruned_loss=0.04118, over 4706825.02 frames. ], batch size: 80, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:10:13,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 06:10:14,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 06:10:17,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:10:17,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:10:19,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 06:10:24,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:24,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:29,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:10:29,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:10:29,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. 
Duration: 0.86 2023-10-03 06:10:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:10:32,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:10:33,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:10:37,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:10:38,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:38,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:10:41,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:41,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:10:41,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:43,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:10:44,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:47,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:10:49,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:10:49,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 06:10:50,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:52,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 06:10:52,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 06:10:54,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:54,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:10:54,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:10:55,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:10:55,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:10:56,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 06:10:58,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:11:02,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:11:05,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:05,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:09,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 06:11:09,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:11:09,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:11:11,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:11:15,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:11:17,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:11:17,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 06:11:18,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:11:18,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:11:20,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:11:24,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:24,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:11:25,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 06:11:27,682 INFO [train.py:1046] (1/4) Epoch 33, batch 4850, loss[loss=0.1478, simple_loss=0.2326, pruned_loss=0.03154, over 24453.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2409, pruned_loss=0.04091, over 4716002.81 frames. ], batch size: 66, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:11:29,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 06:11:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:11:29,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:11:29,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:11:29,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:32,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:11:38,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 06:11:38,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:42,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.890e+02 2.112e+02 2.331e+02 3.669e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 06:11:44,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:11:45,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:11:45,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:45,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1165653.3333333333, ans=0.125 2023-10-03 06:11:50,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 06:11:51,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:11:51,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 06:11:54,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:11:57,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:11:59,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:12:00,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:12:00,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 06:12:01,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:12:01,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:04,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1165720.0, ans=0.1 2023-10-03 06:12:06,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:06,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. 
Duration: 0.8 2023-10-03 06:12:06,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 06:12:07,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:12:15,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:12:15,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 06:12:17,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:12:17,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:12:18,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:12:18,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 06:12:18,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:20,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 06:12:20,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:12:21,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:12:24,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 06:12:30,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:34,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:12:34,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:12:40,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 06:12:40,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:12:42,283 INFO [train.py:1046] (1/4) Epoch 33, batch 4900, loss[loss=0.1422, simple_loss=0.2052, pruned_loss=0.03959, over 23491.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.239, pruned_loss=0.04108, over 4701744.78 frames. ], batch size: 285, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:12:47,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:12:48,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:12:48,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:12:51,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.12 vs. 
limit=22.5 2023-10-03 06:12:52,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 06:12:56,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 06:13:01,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 06:13:01,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 06:13:01,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:13:01,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:13:02,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:13:02,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:13:02,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:13:02,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 06:13:04,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1165986.6666666667, ans=0.125 2023-10-03 06:13:05,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-10-03 06:13:06,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:13:08,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:13:10,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:13:11,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:13:12,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:13:14,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:14,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 06:13:18,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:13:18,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:13:18,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 06:13:18,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 06:13:21,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. 
Duration: 0.93 2023-10-03 06:13:23,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:13:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:13:23,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:13:25,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:13:25,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 06:13:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:13:25,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 06:13:28,586 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.73 vs. limit=22.5 2023-10-03 06:13:29,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:31,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1166120.0, ans=0.125 2023-10-03 06:13:32,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 06:13:35,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:13:38,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 06:13:38,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:13:39,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 06:13:39,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 06:13:47,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:13:47,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:13:48,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 06:13:48,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:13:49,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:13:50,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1166186.6666666667, ans=0.125 2023-10-03 06:13:51,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:13:55,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:13:55,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:13:55,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:13:55,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 06:13:56,760 INFO [train.py:1046] (1/4) Epoch 33, batch 4950, loss[loss=0.1502, simple_loss=0.219, pruned_loss=0.04068, over 23833.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2377, pruned_loss=0.04055, over 4703566.13 frames. ], batch size: 212, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:13:56,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:13:59,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:13:59,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:14:01,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 06:14:03,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 06:14:03,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 06:14:04,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 06:14:04,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:14:04,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:14:04,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:07,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:14:10,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:14:10,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:14:11,695 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.885e+02 2.044e+02 2.302e+02 2.905e+02, threshold=4.089e+02, percent-clipped=0.0 2023-10-03 06:14:11,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:14:13,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:14:15,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:14:19,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:21,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:14:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:24,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:25,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:14:27,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 06:14:27,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 06:14:29,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:31,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:14:31,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 06:14:32,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:14:32,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:14:34,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:14:35,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:37,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:14:37,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1166386.6666666667, ans=0.0 2023-10-03 06:14:39,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:14:39,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1166386.6666666667, ans=0.035 2023-10-03 06:14:40,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:40,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:41,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 06:14:41,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 06:14:43,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:14:49,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:14:51,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:14:51,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:14:51,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:14:52,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:14:53,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:14:55,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:14:55,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:56,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 06:15:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:15:04,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 06:15:05,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 06:15:10,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:15:11,571 INFO [train.py:1046] (1/4) Epoch 33, batch 5000, loss[loss=0.1634, simple_loss=0.2494, pruned_loss=0.03869, over 23699.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2384, pruned_loss=0.04044, over 4706662.16 frames. ], batch size: 85, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:15:11,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:15:13,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 06:15:14,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 06:15:16,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:15:17,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 06:15:17,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:15:17,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 06:15:19,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 06:15:19,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:19,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:15:19,724 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:15:20,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 06:15:20,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:20,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:15:21,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.92 vs. limit=15.0 2023-10-03 06:15:23,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 06:15:23,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 06:15:23,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:15:25,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-03 06:15:25,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:15:25,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:25,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:15:26,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 06:15:26,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 06:15:26,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1166653.3333333333, ans=0.1 2023-10-03 06:15:27,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 06:15:29,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:30,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:32,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 06:15:34,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:15:35,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:15:35,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:37,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 06:15:38,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 06:15:38,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:15:40,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:15:43,885 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-10-03 06:15:45,914 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 06:15:49,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:15:50,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:50,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:15:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 06:15:55,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:15:56,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:15:56,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:15:57,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1166786.6666666667, ans=0.95 2023-10-03 06:15:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 06:15:58,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:16:01,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:16:01,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:07,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 06:16:11,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:12,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.02 vs. limit=22.5 2023-10-03 06:16:21,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 06:16:23,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:23,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:16:23,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:16:23,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:16:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:16:25,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:26,415 INFO [train.py:1046] (1/4) Epoch 33, batch 5050, loss[loss=0.1496, simple_loss=0.2402, pruned_loss=0.02947, over 24486.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2388, pruned_loss=0.04055, over 4714554.55 frames. ], batch size: 69, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:16:27,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:27,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 06:16:29,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 06:16:29,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1166920.0, ans=0.05 2023-10-03 06:16:30,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:16:32,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:16:32,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 06:16:35,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:35,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:16:37,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:16:38,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:16:39,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:16:40,922 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.953e+02 2.133e+02 2.408e+02 3.808e+02, threshold=4.267e+02, percent-clipped=0.0 2023-10-03 06:16:47,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. 
Duration: 0.91 2023-10-03 06:16:48,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:16:48,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:16:49,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 06:16:49,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:16:51,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:16:51,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:52,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:16:52,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 06:16:54,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 06:16:55,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:16:58,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 06:17:00,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1167053.3333333333, ans=0.125 2023-10-03 06:17:01,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:17:01,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 06:17:02,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:17:06,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 06:17:07,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:17:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:17:08,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:08,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:17:11,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:17:13,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1167120.0, ans=0.2 2023-10-03 06:17:14,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:17:14,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:14,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:17:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:17:16,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 06:17:17,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:17:18,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:17:21,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:17:21,774 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 06:17:21,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:17:23,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:17:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:24,544 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-10-03 06:17:29,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:29,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 06:17:29,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:30,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=1167186.6666666667, ans=0.2 2023-10-03 06:17:30,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.22 vs. limit=15.0 2023-10-03 06:17:31,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:33,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:33,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 06:17:34,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 06:17:37,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:17:37,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:17:37,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 06:17:39,171 INFO [train.py:1046] (1/4) Epoch 33, batch 5100, loss[loss=0.1424, simple_loss=0.2205, pruned_loss=0.03218, over 24284.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.24, pruned_loss=0.04068, over 4726311.73 frames. ], batch size: 56, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:17:40,651 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 06:17:42,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:43,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1167253.3333333333, ans=0.1 2023-10-03 06:17:45,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 06:17:45,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 06:17:46,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:17:47,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:17:49,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:17:49,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 06:17:50,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. 
Duration: 0.76 2023-10-03 06:17:54,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:55,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:17:56,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1167320.0, ans=0.125 2023-10-03 06:18:00,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:18:01,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 06:18:02,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:18:04,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:18:04,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 06:18:06,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:08,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:08,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 06:18:12,009 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. 
Duration: 0.896 2023-10-03 06:18:13,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 06:18:14,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 06:18:18,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:18:20,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1167386.6666666667, ans=0.125 2023-10-03 06:18:26,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:18:28,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 06:18:28,800 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 06:18:28,807 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 06:18:32,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 06:18:32,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:33,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. 
Duration: 0.89 2023-10-03 06:18:34,179 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.33 vs. limit=15.0 2023-10-03 06:18:37,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 06:18:39,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 06:18:41,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:18:41,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1167520.0, ans=0.125 2023-10-03 06:18:43,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 06:18:43,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:18:45,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 06:18:50,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:18:50,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:18:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 06:18:51,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:18:51,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:18:52,745 INFO [train.py:1046] (1/4) Epoch 33, batch 5150, loss[loss=0.1714, simple_loss=0.2525, pruned_loss=0.04521, over 23945.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2411, pruned_loss=0.04133, over 4710852.15 frames. ], batch size: 80, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:18:52,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:18:54,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 06:18:54,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 06:18:55,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 06:18:55,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:18:55,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 06:18:58,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:18:58,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-03 06:19:00,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:01,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:06,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:19:06,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 06:19:07,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:08,861 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.881e+02 2.011e+02 2.161e+02 3.119e+02, threshold=4.022e+02, percent-clipped=0.0 2023-10-03 06:19:08,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:19:09,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:19:09,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:19:09,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:19:09,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1167653.3333333333, ans=0.125 2023-10-03 06:19:10,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:19:10,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:19:10,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 06:19:10,673 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:19:13,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:19:13,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:19:15,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:19:16,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 06:19:17,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:19:23,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:19:24,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-10-03 06:19:27,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:19:35,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:19:36,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:39,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:19:41,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:19:43,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 06:19:43,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1167786.6666666667, ans=0.125 2023-10-03 06:19:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:49,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:19:49,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:19:53,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:19:53,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:19:54,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 06:19:58,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:58,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:20:00,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:20:01,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:20:01,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:20:03,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:20:03,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:20:03,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:20:06,128 INFO [train.py:1046] (1/4) Epoch 33, batch 5200, loss[loss=0.1633, simple_loss=0.2534, pruned_loss=0.03664, over 24338.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2417, pruned_loss=0.04149, over 4718348.53 frames. 
], batch size: 74, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:20:07,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:20:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:20:12,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:12,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1167920.0, ans=0.1 2023-10-03 06:20:16,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 06:20:16,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:20:17,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1167920.0, ans=0.0 2023-10-03 06:20:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:21,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:21,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1167986.6666666667, ans=0.125 2023-10-03 06:20:22,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 06:20:22,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:23,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 06:20:25,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:20:25,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:20:28,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 06:20:29,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.09 vs. limit=12.0 2023-10-03 06:20:29,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:20:32,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:20:32,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 06:20:34,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 06:20:36,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 06:20:38,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:20:38,078 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 06:20:38,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:38,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:20:39,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:20:39,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 06:20:41,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:20:44,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:46,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 06:20:46,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 06:20:46,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 06:20:50,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 06:20:52,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-03 06:20:56,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:20:57,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:20:58,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 06:20:58,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:58,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:20:58,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:20:58,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:21:03,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:21:04,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:21:09,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:21:10,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:10,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:21:12,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1168186.6666666667, ans=0.0 2023-10-03 06:21:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:21:15,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 06:21:16,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1168186.6666666667, ans=0.1 2023-10-03 06:21:17,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:21:17,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:21:18,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:19,726 INFO [train.py:1046] (1/4) Epoch 33, batch 5250, loss[loss=0.174, simple_loss=0.2637, pruned_loss=0.04208, over 24329.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2415, pruned_loss=0.04131, over 4723166.29 frames. ], batch size: 74, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:21:19,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:21:19,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:21:22,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:21:25,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:21:26,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1168253.3333333333, ans=0.0 2023-10-03 06:21:28,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:21:31,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1168253.3333333333, ans=0.0 2023-10-03 06:21:32,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:21:33,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:21:35,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:21:36,477 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.883e+02 2.112e+02 2.383e+02 3.529e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 06:21:36,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:21:40,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. 
Duration: 0.95 2023-10-03 06:21:40,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:58,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1168386.6666666667, ans=0.125 2023-10-03 06:22:00,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.80 vs. limit=10.0 2023-10-03 06:22:12,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1168453.3333333333, ans=0.125 2023-10-03 06:22:24,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=1168520.0, ans=0.1 2023-10-03 06:22:24,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1168520.0, ans=0.125 2023-10-03 06:22:27,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.14 vs. limit=15.0 2023-10-03 06:22:27,875 INFO [train.py:1046] (1/4) Epoch 33, batch 5300, loss[loss=0.1484, simple_loss=0.2285, pruned_loss=0.03419, over 24287.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2389, pruned_loss=0.04076, over 4698646.51 frames. ], batch size: 56, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:22:37,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.94 vs. 
limit=15.0 2023-10-03 06:22:42,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:22:42,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 06:22:42,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 06:22:42,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:42,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:42,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:42,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:42,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:42,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:22:42,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:43,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:22:43,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 06:22:43,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 06:22:43,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 06:22:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 06:22:43,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:22:43,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 06:22:43,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 06:22:43,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:44,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:44,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:22:44,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:22:44,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:22:44,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 06:22:44,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:45,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:45,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:22:45,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:45,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:22:45,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:45,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:22:45,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 06:22:45,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:22:46,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:46,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 06:22:46,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. 
Duration: 0.63 2023-10-03 06:22:46,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:22:46,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:22:46,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 06:22:46,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 06:22:46,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:22:47,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:22:47,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:22:47,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 06:22:47,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 06:22:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:22:47,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:47,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. 
Duration: 0.83 2023-10-03 06:22:47,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 06:22:47,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 06:22:48,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:22:54,402 INFO [train.py:1046] (1/4) Epoch 34, batch 0, loss[loss=0.1547, simple_loss=0.2301, pruned_loss=0.03965, over 23812.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2301, pruned_loss=0.03965, over 23812.00 frames. ], batch size: 212, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:22:54,403 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 06:23:07,174 INFO [train.py:1078] (1/4) Epoch 34, validation: loss=0.3345, simple_loss=0.2716, pruned_loss=0.1987, over 1125622.00 frames. 2023-10-03 06:23:07,175 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 06:23:11,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 06:23:11,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1168666.6666666667, ans=0.125 2023-10-03 06:23:13,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:23:13,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1168666.6666666667, ans=0.125 2023-10-03 06:23:14,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 06:23:19,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:19,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:23:19,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:20,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 06:23:22,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 06:23:24,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:24,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:27,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:27,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:27,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:23:27,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:23:29,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. 
Duration: 0.79 2023-10-03 06:23:29,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1168733.3333333333, ans=0.0 2023-10-03 06:23:30,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:23:37,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:23:37,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:41,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 06:23:45,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:23:45,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:23:48,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:23:52,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:23:56,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1168866.6666666667, ans=0.0 2023-10-03 06:23:57,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:01,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-03 06:24:04,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 06:24:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:24:06,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:06,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:24:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:24:07,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 06:24:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:11,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:12,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.43 vs. limit=12.0 2023-10-03 06:24:14,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:24:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 06:24:18,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 06:24:21,497 INFO [train.py:1046] (1/4) Epoch 34, batch 50, loss[loss=0.1559, simple_loss=0.2369, pruned_loss=0.03741, over 23279.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2414, pruned_loss=0.04187, over 1054242.71 frames. ], batch size: 105, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:24:22,871 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.913e+02 2.173e+02 2.528e+02 6.265e+02, threshold=4.345e+02, percent-clipped=6.0 2023-10-03 06:24:22,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:24:24,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:24:24,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 06:24:25,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:24:25,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:24:27,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:24:28,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:24:31,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:24:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. 
Duration: 0.68 2023-10-03 06:24:32,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:38,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:24:40,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 06:24:41,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 06:24:43,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:24:44,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:24:44,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:46,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:24:46,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:24:47,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:24:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:55,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 06:24:57,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1169133.3333333333, ans=0.2 2023-10-03 06:24:58,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:24:58,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:24:59,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 06:25:01,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:25:01,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:25:01,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 06:25:02,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:25:04,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 06:25:10,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:12,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:25:13,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:25:14,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:25:14,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:25:15,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1169200.0, ans=0.125 2023-10-03 06:25:17,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 06:25:17,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 06:25:18,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:19,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:25:20,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:25:22,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:25:22,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 06:25:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 06:25:25,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 06:25:25,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:26,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:25:27,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 06:25:27,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 06:25:28,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:30,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:25:30,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:25:31,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:25:33,330 INFO [train.py:1046] (1/4) Epoch 34, batch 100, loss[loss=0.157, simple_loss=0.2416, pruned_loss=0.03621, over 24510.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2433, pruned_loss=0.04171, over 1878190.78 frames. ], batch size: 66, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:25:33,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:25:36,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:25:37,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.28 vs. limit=15.0 2023-10-03 06:25:38,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:25:40,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 06:25:40,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:45,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:25:46,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:25:46,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:25:46,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:25:46,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:25:48,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 06:25:49,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:25:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 06:25:51,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:51,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:25:52,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1169400.0, ans=0.2 2023-10-03 06:25:53,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.76 vs. limit=22.5 2023-10-03 06:25:55,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 06:25:57,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:58,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:59,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:26:01,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:26:05,440 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 06:26:05,454 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 06:26:06,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:26:06,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:26:09,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:26:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:26:14,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:17,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:18,374 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 06:26:21,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 06:26:23,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:26:23,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:26:27,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:29,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:32,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 06:26:34,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:26:36,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:36,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:26:38,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:38,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:26:38,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:39,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 06:26:39,513 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 06:26:39,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:40,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:26:42,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:42,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:26:42,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 06:26:44,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:26:44,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:26:44,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:44,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:26:45,800 INFO [train.py:1046] (1/4) Epoch 34, batch 150, loss[loss=0.1452, simple_loss=0.2378, pruned_loss=0.02628, over 24309.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2424, pruned_loss=0.0416, over 2520530.12 frames. ], batch size: 74, lr: 3.02e-03, grad_scale: 4.0 2023-10-03 06:26:45,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:45,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:26:45,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:26:48,605 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.899e+02 2.057e+02 2.467e+02 3.842e+02, threshold=4.114e+02, percent-clipped=0.0 2023-10-03 06:26:50,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:26:52,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:26:52,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:26:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:56,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:56,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:00,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:27:01,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:05,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 06:27:05,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 06:27:05,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 06:27:08,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:27:08,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 06:27:09,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:27:09,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:27:09,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:11,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:11,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:12,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 06:27:15,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:20,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:27:22,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff2.min_abs, batch_count=1169800.0, ans=0.1 2023-10-03 06:27:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:27:24,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 06:27:26,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 06:27:26,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:27:26,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:27:27,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:27:29,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:27:30,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:27:32,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:32,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 06:27:37,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:39,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:27:39,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:27:39,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 06:27:39,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.04 vs. limit=15.0 2023-10-03 06:27:42,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:45,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 06:27:47,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:27:47,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:27:49,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:27:50,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1169933.3333333333, ans=0.125 2023-10-03 06:27:51,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:27:51,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 06:27:51,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:27:53,225 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 06:27:54,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:27:58,630 INFO [train.py:1046] (1/4) Epoch 34, batch 200, loss[loss=0.1638, simple_loss=0.2517, pruned_loss=0.03797, over 24084.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2438, pruned_loss=0.04284, over 2996571.75 frames. ], batch size: 80, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:27:58,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:58,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:28:00,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 06:28:02,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:02,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1170000.0, ans=0.125 2023-10-03 06:28:03,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:04,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 06:28:06,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:28:06,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1170000.0, ans=0.0 2023-10-03 06:28:07,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:28:08,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:10,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1170000.0, ans=0.0 2023-10-03 06:28:12,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:28:13,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:28:13,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:15,187 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:28:18,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1170066.6666666667, ans=0.125 2023-10-03 06:28:24,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1170066.6666666667, ans=0.125 2023-10-03 06:28:31,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:28:31,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1170133.3333333333, ans=0.2 2023-10-03 06:28:33,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:28:33,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 06:28:34,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:28:34,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 06:28:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:28:35,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:37,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:28:38,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:38,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:28:40,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 06:28:40,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:28:40,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:28:50,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 06:28:57,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:57,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:29:02,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:04,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 06:29:05,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:29:07,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:29:07,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:29:08,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:29:08,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 06:29:10,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1170266.6666666667, ans=0.125 2023-10-03 06:29:11,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:29:11,381 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-03 06:29:12,625 INFO [train.py:1046] (1/4) Epoch 34, batch 250, loss[loss=0.1594, simple_loss=0.2501, pruned_loss=0.03439, over 24662.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2438, pruned_loss=0.04283, over 3379194.48 frames. ], batch size: 73, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:29:14,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:15,952 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.885e+02 2.083e+02 2.421e+02 4.173e+02, threshold=4.166e+02, percent-clipped=1.0 2023-10-03 06:29:17,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:29:17,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1170333.3333333333, ans=0.1 2023-10-03 06:29:19,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:19,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:29:22,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:29:22,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:23,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:29:26,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 06:29:28,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1170400.0, ans=0.125 2023-10-03 06:29:36,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:29:38,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:29:38,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:29:43,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-10-03 06:29:45,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:29:45,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1170466.6666666667, ans=10.0 2023-10-03 06:29:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:29:47,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:29:48,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:29:48,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 06:29:48,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:29:51,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:29:52,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:29:55,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 06:29:56,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:29:59,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:29:59,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:29:59,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:30:00,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:30:00,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:30:00,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 06:30:01,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1170533.3333333333, ans=0.0 2023-10-03 06:30:02,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:04,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:30:05,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:09,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:30:11,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:14,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1170600.0, ans=0.0 2023-10-03 06:30:15,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:30:19,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.21 vs. limit=22.5 2023-10-03 06:30:21,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:23,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:30:23,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1170600.0, ans=0.1 2023-10-03 06:30:28,050 INFO [train.py:1046] (1/4) Epoch 34, batch 300, loss[loss=0.1477, simple_loss=0.2147, pruned_loss=0.04033, over 23670.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2409, pruned_loss=0.04174, over 3664439.28 frames. ], batch size: 232, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:30:28,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 06:30:29,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:30:29,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:30:30,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 06:30:30,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:30:31,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:30:31,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 06:30:35,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:35,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 06:30:38,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:30:39,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 06:30:41,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:41,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:30:41,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 06:30:41,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:30:45,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:30:50,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:30:50,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 06:30:56,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 06:30:56,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:30:58,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:31:00,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:00,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 06:31:00,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:31:02,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:31:03,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:31:03,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:31:05,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1170800.0, ans=0.1 2023-10-03 06:31:07,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 06:31:07,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 06:31:07,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:31:10,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:10,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. 
Duration: 0.55 2023-10-03 06:31:12,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:18,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:31:20,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:31:20,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 06:31:23,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:23,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:31:26,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:26,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:31:26,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 06:31:27,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:31:27,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:31:30,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-10-03 06:31:31,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:31,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:33,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:34,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:34,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:41,420 INFO [train.py:1046] (1/4) Epoch 34, batch 350, loss[loss=0.1951, simple_loss=0.2783, pruned_loss=0.05593, over 24039.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2398, pruned_loss=0.04124, over 3899680.22 frames. ], batch size: 80, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:31:41,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:31:41,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 06:31:44,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:46,057 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.903e+02 2.085e+02 2.376e+02 3.254e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-03 06:31:49,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:31:49,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1171000.0, ans=0.2 2023-10-03 06:31:51,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:53,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:56,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 06:31:58,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:31:58,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 06:31:59,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-10-03 06:32:00,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:01,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 06:32:01,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:32:03,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 06:32:05,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 06:32:07,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:32:08,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:32:10,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:10,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1171133.3333333333, ans=0.1 2023-10-03 06:32:11,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:11,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:32:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:12,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:32:14,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:32:14,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:32:19,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1171133.3333333333, ans=0.125 2023-10-03 06:32:22,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:32:22,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:32:23,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:32:23,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:27,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 06:32:27,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:32,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:32,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:32,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:32:33,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 06:32:36,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:32:36,775 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 06:32:39,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 06:32:39,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:43,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:32:43,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 06:32:46,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:48,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:32:48,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:48,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1171266.6666666667, ans=0.125 2023-10-03 06:32:49,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:49,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:51,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:32:54,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:32:56,023 INFO [train.py:1046] (1/4) Epoch 34, batch 400, loss[loss=0.1573, simple_loss=0.2493, pruned_loss=0.03268, over 24248.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2393, pruned_loss=0.04098, over 4076681.69 frames. ], batch size: 74, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:32:56,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:32:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 06:32:58,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:59,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:00,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:33:02,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:04,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:04,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 06:33:05,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1171333.3333333333, ans=0.2 2023-10-03 06:33:06,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 06:33:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 06:33:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:10,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 06:33:11,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:12,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1171400.0, ans=0.1 2023-10-03 06:33:14,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:33:14,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:14,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 06:33:14,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:33:16,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 06:33:16,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:16,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:33:19,851 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 06:33:20,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1171400.0, ans=0.0 2023-10-03 06:33:21,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 06:33:25,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:25,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:27,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 06:33:29,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 06:33:31,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:33:33,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:33:33,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1171466.6666666667, ans=0.125 2023-10-03 06:33:36,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1171466.6666666667, ans=0.0 2023-10-03 06:33:36,740 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-10-03 06:33:37,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 06:33:39,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1171533.3333333333, ans=0.09899494936611666 2023-10-03 06:33:40,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:33:41,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 06:33:42,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.62 vs. limit=15.0 2023-10-03 06:33:45,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:45,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:33:45,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. 
Duration: 0.83 2023-10-03 06:33:49,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:33:52,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:33:54,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:56,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:33:56,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 06:34:00,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:34:00,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 06:34:01,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:34:01,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:34:04,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 06:34:07,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:34:08,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:34:08,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:34:09,942 INFO [train.py:1046] (1/4) Epoch 34, batch 450, loss[loss=0.1511, simple_loss=0.2342, pruned_loss=0.03404, over 24473.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2395, pruned_loss=0.0409, over 4221824.38 frames. ], batch size: 63, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:34:10,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 06:34:10,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:34:11,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:34:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:34:12,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 06:34:12,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:34:14,036 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.869e+02 1.964e+02 2.234e+02 2.686e+02, threshold=3.928e+02, percent-clipped=0.0 2023-10-03 06:34:14,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:34:15,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 06:34:16,213 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.60 vs. limit=10.0 2023-10-03 06:34:26,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:26,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:34:27,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 06:34:29,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 06:34:32,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:34:33,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.50 vs. limit=22.5 2023-10-03 06:34:35,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:36,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:34:37,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1171733.3333333333, ans=0.0 2023-10-03 06:34:39,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:34:40,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:34:43,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 06:34:43,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 06:34:46,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 06:34:46,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:34:47,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:34:47,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:34:49,259 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:34:50,924 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 06:34:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 06:34:50,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:52,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 06:34:53,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 06:34:56,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:34:58,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:34:58,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 06:34:58,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 06:35:01,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:35:03,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:35:03,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:35:07,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 06:35:11,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:35:11,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1171933.3333333333, ans=0.2 2023-10-03 06:35:12,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-10-03 06:35:12,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 06:35:14,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:35:18,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:35:19,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:35:22,465 INFO [train.py:1046] (1/4) Epoch 34, batch 500, loss[loss=0.1761, simple_loss=0.2477, pruned_loss=0.05222, over 23628.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2403, pruned_loss=0.04117, over 4334254.99 frames. ], batch size: 256, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:35:22,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:35:22,544 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 06:35:27,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:35:29,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:35:29,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:35:29,137 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. 
Duration: 0.896 2023-10-03 06:35:30,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 06:35:30,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:35:33,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:35:36,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:35:38,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:35:39,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:35:39,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:35:39,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1172066.6666666667, ans=0.125 2023-10-03 06:35:40,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:35:47,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.28 vs. limit=10.0 2023-10-03 06:35:49,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:35:49,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:35:50,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:35:50,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:51,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 06:35:51,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:35:55,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:35:56,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:35:56,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:35:56,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:58,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 06:35:58,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1172133.3333333333, ans=0.0 2023-10-03 06:36:02,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. 
Duration: 0.64 2023-10-03 06:36:03,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:05,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:06,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:06,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:08,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:36:09,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 06:36:11,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1172200.0, ans=0.09899494936611666 2023-10-03 06:36:12,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:36:14,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:18,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:20,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:27,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:36:30,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1172266.6666666667, ans=0.2 2023-10-03 06:36:31,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 06:36:31,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:31,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:34,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 06:36:34,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.29 vs. limit=15.0 2023-10-03 06:36:35,495 INFO [train.py:1046] (1/4) Epoch 34, batch 550, loss[loss=0.1418, simple_loss=0.2211, pruned_loss=0.0312, over 24570.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2412, pruned_loss=0.04162, over 4417275.24 frames. ], batch size: 60, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:36:35,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:36:37,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:39,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.022e+02 2.267e+02 3.367e+02, threshold=4.045e+02, percent-clipped=0.0 2023-10-03 06:36:41,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. 
Duration: 0.95 2023-10-03 06:36:42,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 06:36:42,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:42,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 06:36:42,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:36:44,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:44,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:46,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:46,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:36:47,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:36:48,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:49,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 06:36:50,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 06:36:54,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:36:54,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:57,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:36:57,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:37:02,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1172400.0, ans=0.2 2023-10-03 06:37:03,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 06:37:03,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 06:37:03,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:37:07,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:37:09,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:37:10,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 06:37:10,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1172466.6666666667, ans=0.125 2023-10-03 06:37:13,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:13,474 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 06:37:15,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:37:15,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 06:37:18,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:37:19,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:37:19,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:37:19,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:22,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 06:37:23,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 06:37:23,789 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.66 vs. 
limit=22.5 2023-10-03 06:37:24,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:24,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:37:24,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:37:24,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:37:28,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:37:28,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:37:31,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1172533.3333333333, ans=0.5 2023-10-03 06:37:32,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:37:32,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:33,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 06:37:33,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 06:37:34,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1172600.0, ans=0.1 2023-10-03 06:37:36,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:36,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1172600.0, ans=0.125 2023-10-03 06:37:38,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:37:38,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:39,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:37:39,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 06:37:40,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1172600.0, ans=0.1 2023-10-03 06:37:45,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 06:37:47,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1172600.0, ans=0.2 2023-10-03 06:37:48,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 06:37:49,533 INFO [train.py:1046] (1/4) Epoch 34, batch 600, loss[loss=0.1644, simple_loss=0.2461, pruned_loss=0.04133, over 23919.00 frames. 
], tot_loss[loss=0.1628, simple_loss=0.2421, pruned_loss=0.04176, over 4483382.72 frames. ], batch size: 86, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:37:49,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:37:50,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:37:50,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:38:00,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:38:02,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 06:38:06,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:38:06,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:38:08,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:10,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 06:38:10,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:38:10,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1172733.3333333333, ans=0.0 2023-10-03 06:38:17,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 06:38:20,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:38:21,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:21,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:38:27,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:38:28,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:38:28,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:38:34,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:38:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:38:38,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:38:38,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:38:45,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 06:38:49,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:38:49,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:38:52,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1172933.3333333333, ans=0.125 2023-10-03 06:38:53,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 06:38:55,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:38:58,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 06:38:58,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:38:58,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:39:03,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 06:39:04,476 INFO [train.py:1046] (1/4) Epoch 34, batch 650, loss[loss=0.1386, simple_loss=0.2164, pruned_loss=0.03036, over 24311.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2414, pruned_loss=0.04134, over 4533899.55 frames. 
], batch size: 56, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:39:04,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:39:07,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:39:07,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:39:08,686 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.849e+02 2.038e+02 2.277e+02 3.904e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-03 06:39:08,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:09,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1173000.0, ans=0.1 2023-10-03 06:39:11,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 06:39:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:39:17,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:39:17,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:39:20,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 06:39:25,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 06:39:26,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:39:27,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:39:30,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:39:32,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 06:39:35,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:35,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:35,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:39:36,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:39,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:39:42,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:39:42,187 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-10-03 06:39:42,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:42,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:39:44,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:46,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:39:46,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:39:46,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:39:47,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 06:39:51,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:39:51,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:39:52,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:39:53,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:39:53,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 06:39:54,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1173200.0, ans=0.04949747468305833 2023-10-03 06:39:55,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 06:39:57,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 06:39:57,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:57,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:39:57,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:39:57,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:39:58,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:40:02,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:02,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:40:04,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:40:07,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:40:07,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 06:40:07,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:40:15,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:40:15,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:40:19,755 INFO [train.py:1046] (1/4) Epoch 34, batch 700, loss[loss=0.1604, simple_loss=0.2304, pruned_loss=0.04518, over 23336.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2406, pruned_loss=0.04104, over 4571889.46 frames. ], batch size: 119, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:40:19,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:40:19,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:40:26,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 06:40:27,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 06:40:27,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 06:40:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:40:30,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:40:31,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 06:40:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:40:37,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:40:40,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:40,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:40:41,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:40:42,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1173400.0, ans=0.125 2023-10-03 06:40:43,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:46,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 06:40:46,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 06:40:46,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.31 vs. limit=15.0 2023-10-03 06:40:47,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 06:40:50,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 06:40:52,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1173466.6666666667, ans=0.1 2023-10-03 06:40:55,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:40:55,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:40:57,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:40:59,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:41:01,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 06:41:02,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1173533.3333333333, ans=0.0 2023-10-03 06:41:05,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:41:06,084 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.48 vs. limit=15.0 2023-10-03 06:41:06,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:41:06,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 06:41:09,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:41:11,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:14,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:41:14,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1173533.3333333333, ans=0.125 2023-10-03 06:41:19,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:41:20,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 06:41:21,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1173600.0, ans=0.125 2023-10-03 06:41:24,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. 
Duration: 0.89 2023-10-03 06:41:24,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 06:41:28,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:30,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:41:32,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:41:33,738 INFO [train.py:1046] (1/4) Epoch 34, batch 750, loss[loss=0.1738, simple_loss=0.2581, pruned_loss=0.04479, over 24408.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2396, pruned_loss=0.04097, over 4605272.33 frames. ], batch size: 77, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:41:33,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:33,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 06:41:36,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 06:41:38,361 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.818e+02 1.954e+02 2.110e+02 2.895e+02, threshold=3.908e+02, percent-clipped=0.0 2023-10-03 06:41:38,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 06:41:38,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. 
Duration: 0.85 2023-10-03 06:41:39,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 06:41:39,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 06:41:39,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:41:41,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 06:41:42,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:42,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:41:45,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:41:46,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:46,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:41:46,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:41:49,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 06:41:51,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1173733.3333333333, ans=0.2 2023-10-03 06:41:52,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:41:53,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:41:55,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:41:56,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:56,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 06:41:58,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:41:59,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:42:02,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:42:03,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:42:04,224 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.59 vs. limit=22.5 2023-10-03 06:42:04,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. 
Duration: 0.65 2023-10-03 06:42:04,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:06,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 06:42:06,533 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 06:42:07,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1173800.0, ans=0.1 2023-10-03 06:42:07,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1173800.0, ans=0.125 2023-10-03 06:42:08,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 06:42:08,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:42:09,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:42:10,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:42:16,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:42:16,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:16,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 06:42:19,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:42:20,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 06:42:21,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:42:23,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 06:42:24,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:42:28,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:42:29,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 06:42:29,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:35,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:42:35,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:42:37,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:42:39,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:42:41,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1173933.3333333333, ans=0.0 2023-10-03 06:42:43,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 06:42:43,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:42:43,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:42:44,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:42:46,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:47,367 INFO [train.py:1046] (1/4) Epoch 34, batch 800, loss[loss=0.1774, simple_loss=0.2479, pruned_loss=0.05343, over 23409.00 frames. ], tot_loss[loss=0.161, simple_loss=0.24, pruned_loss=0.04097, over 4630961.18 frames. ], batch size: 285, lr: 3.01e-03, grad_scale: 32.0 2023-10-03 06:42:47,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1174000.0, ans=0.125 2023-10-03 06:42:48,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:48,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 06:42:54,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:54,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:56,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:42:56,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:57,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:59,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:01,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:04,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1174066.6666666667, ans=0.125 2023-10-03 06:43:05,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:05,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:43:09,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. 
Duration: 0.76 2023-10-03 06:43:09,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:11,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:43:11,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:43:11,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:43:11,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 06:43:13,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:13,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 06:43:14,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1174066.6666666667, ans=0.1 2023-10-03 06:43:14,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1174066.6666666667, ans=0.05 2023-10-03 06:43:16,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:17,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:43:18,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:43:18,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:43:21,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:21,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:25,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:43:25,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:43:25,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 06:43:29,186 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 06:43:29,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 06:43:29,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:43:29,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:43:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 06:43:31,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:43:35,609 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:43:36,663 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 06:43:36,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 06:43:38,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:43:41,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:43:44,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:43:44,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1174200.0, ans=0.0 2023-10-03 06:43:48,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:49,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 06:43:49,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:43:52,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-10-03 06:43:54,367 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.32 vs. limit=12.0 2023-10-03 06:43:59,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:44:01,113 INFO [train.py:1046] (1/4) Epoch 34, batch 850, loss[loss=0.1448, simple_loss=0.2308, pruned_loss=0.0294, over 24445.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2409, pruned_loss=0.04073, over 4666412.62 frames. ], batch size: 66, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:44:01,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:44:02,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 06:44:02,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:44:02,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:44:04,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 06:44:04,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:07,095 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.856e+02 2.060e+02 2.258e+02 3.332e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-03 06:44:07,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:44:08,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=15.0 2023-10-03 06:44:08,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:09,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:44:09,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:44:11,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 06:44:12,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 06:44:12,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 06:44:14,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:44:14,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:44:16,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:17,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:44:17,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 06:44:17,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1174400.0, ans=0.125 2023-10-03 06:44:22,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:22,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:24,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 06:44:25,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 06:44:29,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:29,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 06:44:33,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 06:44:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 06:44:37,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 06:44:37,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:44:37,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:44:38,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 06:44:41,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:43,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:43,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 06:44:45,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:44:47,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:47,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:44:47,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:44:48,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:44:50,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:44:50,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 06:44:54,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 06:44:54,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:44:54,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:44:54,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:44:55,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:58,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:45:02,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:45:03,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:45:03,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:04,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1174600.0, ans=0.0 2023-10-03 06:45:05,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:45:14,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:45:14,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:45:15,810 INFO [train.py:1046] (1/4) Epoch 34, batch 900, loss[loss=0.1414, simple_loss=0.2216, pruned_loss=0.03054, over 24330.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2417, pruned_loss=0.04133, over 4673393.52 frames. ], batch size: 61, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:45:15,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 06:45:15,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:45:15,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:45:18,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 06:45:22,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:45:25,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 06:45:29,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:45:29,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 06:45:31,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 06:45:31,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:45:31,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:45:33,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:45:33,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1174733.3333333333, ans=0.0 2023-10-03 06:45:34,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:45:42,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:45:42,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:44,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:45:46,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:45:49,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 06:45:51,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1174800.0, ans=0.0 2023-10-03 06:45:52,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 06:45:55,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:45:55,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:45:56,851 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 06:45:56,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 06:45:57,075 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:46:03,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:46:03,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:46:05,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:46:09,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:09,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 06:46:11,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:46:13,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1174866.6666666667, ans=0.125 2023-10-03 06:46:13,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1174866.6666666667, ans=0.5 2023-10-03 06:46:14,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 06:46:15,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:46:15,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:17,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:46:17,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:21,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 06:46:22,643 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 06:46:22,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 06:46:22,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 06:46:26,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:46:30,136 INFO [train.py:1046] (1/4) Epoch 34, batch 950, loss[loss=0.1701, simple_loss=0.2576, pruned_loss=0.04129, over 24018.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.242, pruned_loss=0.04158, over 4677725.64 frames. ], batch size: 86, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:46:30,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 06:46:31,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1175000.0, ans=0.125 2023-10-03 06:46:33,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1175000.0, ans=0.125 2023-10-03 06:46:35,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:46:38,421 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.873e+02 2.075e+02 2.442e+02 3.584e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 06:46:38,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:38,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:39,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:46:40,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1175000.0, ans=0.0 2023-10-03 06:46:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-10-03 06:46:42,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1175000.0, ans=0.0 2023-10-03 06:46:44,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.52 vs. limit=15.0 2023-10-03 06:46:45,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:47,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:46:48,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:46:48,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:46:48,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 06:46:50,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:46:51,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:52,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 06:46:52,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:57,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:46:57,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:58,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:59,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 06:47:01,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:47:03,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:47:04,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:47:10,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:47:10,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:47:14,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 06:47:16,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 06:47:16,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:47:17,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:47:19,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:19,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:47:23,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 06:47:23,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:47:25,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:47:25,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.81 vs. limit=15.0 2023-10-03 06:47:26,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:26,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 06:47:26,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:47:26,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:47:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-03 06:47:28,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.63 vs. limit=6.0 2023-10-03 06:47:30,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:47:34,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:47:36,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1175266.6666666667, ans=0.05 2023-10-03 06:47:38,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:47:38,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 06:47:38,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 06:47:42,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:44,857 INFO [train.py:1046] (1/4) Epoch 34, batch 1000, loss[loss=0.1606, simple_loss=0.2322, pruned_loss=0.04449, over 23732.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2414, pruned_loss=0.04109, over 4693300.83 frames. ], batch size: 179, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:47:48,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 06:47:48,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 06:47:52,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:47:52,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1175333.3333333333, ans=0.125 2023-10-03 06:47:55,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 06:47:55,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 06:47:59,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:47:59,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:48:02,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:03,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1175400.0, ans=0.0 2023-10-03 06:48:03,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1175400.0, ans=0.05 2023-10-03 06:48:05,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 06:48:06,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 06:48:09,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-10-03 06:48:09,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:11,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 06:48:12,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 06:48:12,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 06:48:13,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:14,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:21,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:21,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1175466.6666666667, ans=0.05 2023-10-03 06:48:22,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:48:22,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:22,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:22,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. 
Duration: 0.8 2023-10-03 06:48:24,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:24,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:48:24,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1175466.6666666667, ans=0.2 2023-10-03 06:48:25,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:26,918 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 06:48:28,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 06:48:29,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 06:48:31,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 06:48:34,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:48:37,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1175533.3333333333, ans=0.2 2023-10-03 06:48:40,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:40,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 06:48:41,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:42,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:48:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 06:48:44,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1175600.0, ans=0.0 2023-10-03 06:48:44,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1175600.0, ans=0.0 2023-10-03 06:48:45,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:48:45,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 06:48:47,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 06:48:48,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:48:48,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:52,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:48:54,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 06:48:55,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:57,059 INFO [train.py:1046] (1/4) Epoch 34, batch 1050, loss[loss=0.1511, simple_loss=0.2301, pruned_loss=0.03601, over 24586.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.04073, over 4701833.66 frames. ], batch size: 60, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:48:58,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:48:58,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:49:00,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:49:02,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:49:04,580 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.921e+02 2.098e+02 2.393e+02 3.925e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 06:49:04,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:49:07,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:49:08,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:49:10,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:49:11,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:49:11,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:49:11,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:49:13,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 06:49:14,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:49:14,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 06:49:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:49:17,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 06:49:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 06:49:26,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:49:27,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:49:27,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 06:49:29,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 06:49:29,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 06:49:29,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:49:34,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 06:49:37,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 06:49:37,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:49:41,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 06:49:41,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1175866.6666666667, ans=0.0 2023-10-03 06:49:43,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 06:49:43,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:49:43,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:49:46,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 06:49:51,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 06:49:53,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 06:49:54,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 06:49:54,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:49:54,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:49:56,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 06:49:58,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:50:02,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:50:02,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:50:03,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:50:03,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:06,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 06:50:06,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 06:50:07,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:50:07,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 06:50:08,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 06:50:08,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:50:10,774 INFO [train.py:1046] (1/4) Epoch 34, batch 1100, loss[loss=0.1678, simple_loss=0.2513, pruned_loss=0.04213, over 23414.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2394, pruned_loss=0.04065, over 4707661.74 frames. ], batch size: 93, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:50:12,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:50:16,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:50:23,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:50:25,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:50:25,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 06:50:25,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 06:50:28,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:50:28,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:50:31,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:50:34,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:50:34,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 06:50:36,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 06:50:38,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:50:38,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:50:40,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:50:42,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:50:44,298 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. 
limit=15.0 2023-10-03 06:50:45,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1176133.3333333333, ans=0.05 2023-10-03 06:50:46,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:50:50,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 06:50:51,485 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 06:50:51,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:54,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:54,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:50:56,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:50:57,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 06:50:57,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1176200.0, ans=0.125 2023-10-03 06:50:58,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:50:58,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 06:50:58,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:51:00,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:00,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 06:51:03,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1176200.0, ans=0.1 2023-10-03 06:51:04,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:51:04,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 06:51:05,440 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.35 vs. limit=22.5 2023-10-03 06:51:06,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:51:11,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:51:13,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1176266.6666666667, ans=0.0 2023-10-03 06:51:15,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. 
Duration: 0.99 2023-10-03 06:51:15,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:51:16,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:51:19,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:51:19,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 06:51:20,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:51:22,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:51:23,807 INFO [train.py:1046] (1/4) Epoch 34, batch 1150, loss[loss=0.1847, simple_loss=0.251, pruned_loss=0.05924, over 22856.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2402, pruned_loss=0.04094, over 4707533.81 frames. ], batch size: 322, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:51:23,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 06:51:23,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:51:23,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. 
Duration: 0.86 2023-10-03 06:51:25,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:51:25,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:51:25,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:51:31,203 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.910e+02 2.113e+02 2.416e+02 3.603e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 06:51:31,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:32,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:51:34,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:51:35,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:51:35,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 06:51:36,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:51:37,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 06:51:40,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 06:51:40,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:51:46,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 06:51:48,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:50,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:52,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:51:52,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 06:51:52,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:51:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:51:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 06:51:58,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:59,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:52:05,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.69 vs. 
limit=15.0 2023-10-03 06:52:05,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:52:07,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1176533.3333333333, ans=0.125 2023-10-03 06:52:12,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:52:13,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 06:52:13,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:13,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:19,801 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 06:52:19,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1176533.3333333333, ans=0.0 2023-10-03 06:52:20,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=15.0 2023-10-03 06:52:22,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:25,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1176600.0, ans=0.125 2023-10-03 06:52:27,800 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-10-03 06:52:31,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:52:33,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:52:33,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:52:34,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:52:37,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:52:38,901 INFO [train.py:1046] (1/4) Epoch 34, batch 1200, loss[loss=0.1619, simple_loss=0.2422, pruned_loss=0.04082, over 23710.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2409, pruned_loss=0.04104, over 4697221.53 frames. ], batch size: 85, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:52:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:52:41,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:52:43,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:52:43,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:52:44,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 06:52:46,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:52:47,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:52:50,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:52:50,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:50,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1176666.6666666667, ans=0.0 2023-10-03 06:52:53,361 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 06:52:56,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 06:52:58,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:53:00,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:53:03,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:53:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:53:05,346 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. 
Duration: 0.896 2023-10-03 06:53:05,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:53:14,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:53:14,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:53:14,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 06:53:15,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:53:15,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1176800.0, ans=0.0 2023-10-03 06:53:18,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 06:53:22,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 06:53:22,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:53:24,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:53:25,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:53:27,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 06:53:28,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:53:28,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:53:30,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:53:31,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 06:53:31,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:53:33,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:53:33,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 06:53:34,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:53:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:53:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:53:39,487 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:53:41,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 06:53:43,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 06:53:48,272 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 06:53:49,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:53:52,410 INFO [train.py:1046] (1/4) Epoch 34, batch 1250, loss[loss=0.1447, simple_loss=0.2239, pruned_loss=0.03271, over 22339.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2408, pruned_loss=0.04059, over 4715426.41 frames. ], batch size: 49, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:53:52,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:53:53,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:53:55,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:53:58,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 06:53:59,821 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.873e+02 2.047e+02 2.320e+02 3.578e+02, threshold=4.093e+02, percent-clipped=0.0 2023-10-03 06:54:00,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1177000.0, ans=0.125 2023-10-03 06:54:01,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 06:54:01,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1177000.0, ans=0.0 2023-10-03 06:54:03,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:03,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 06:54:04,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:54:05,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:54:10,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:54:10,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:12,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:54:12,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:54:12,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1177066.6666666667, ans=0.1 2023-10-03 06:54:14,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:54:19,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 06:54:19,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:54:19,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:54:21,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:54:21,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:24,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:24,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1177133.3333333333, ans=0.2 2023-10-03 06:54:26,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 06:54:26,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1177133.3333333333, ans=0.0 2023-10-03 06:54:26,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1177133.3333333333, ans=0.125 2023-10-03 06:54:29,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 06:54:30,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.13 vs. 
limit=15.0 2023-10-03 06:54:30,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:54:31,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1177133.3333333333, ans=0.125 2023-10-03 06:54:32,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:54:34,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 06:54:34,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:34,270 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 06:54:34,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:37,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1177200.0, ans=0.125 2023-10-03 06:54:40,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:41,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:41,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 06:54:43,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 06:54:43,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 06:54:43,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 06:54:46,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:54:47,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 06:54:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:51,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 06:54:51,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:54:53,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 06:54:54,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:54:54,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:54:54,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 06:54:54,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:54:57,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 06:55:00,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:55:00,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:55:02,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:55:07,040 INFO [train.py:1046] (1/4) Epoch 34, batch 1300, loss[loss=0.161, simple_loss=0.249, pruned_loss=0.03647, over 24380.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.241, pruned_loss=0.04055, over 4725731.09 frames. ], batch size: 77, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:55:07,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:55:09,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:55:09,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 06:55:12,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:55:15,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 06:55:15,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:55:17,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:55:18,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:55:19,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1177333.3333333333, ans=0.125 2023-10-03 06:55:20,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 06:55:24,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:55:25,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:55:27,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 06:55:31,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:55:34,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:55:36,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:55:38,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:55:39,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:55:39,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1177466.6666666667, ans=0.0 2023-10-03 06:55:39,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1177466.6666666667, ans=0.0 2023-10-03 06:55:40,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:55:40,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:55:40,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 06:55:47,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:55:47,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:55:50,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 06:55:50,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:55:51,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 06:55:53,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:55:53,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 06:55:53,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1177533.3333333333, ans=0.125 2023-10-03 06:55:54,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:55:54,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 06:55:56,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:55:59,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:55:59,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:56:03,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 06:56:05,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 06:56:07,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 06:56:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 06:56:13,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 06:56:14,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:56:14,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1177600.0, ans=10.0 2023-10-03 06:56:20,034 INFO [train.py:1046] (1/4) Epoch 34, batch 1350, loss[loss=0.158, simple_loss=0.2253, pruned_loss=0.04539, over 23602.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.04073, over 4708636.84 frames. ], batch size: 256, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:56:20,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 06:56:23,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:56:25,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:56:27,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:56:27,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:56:29,621 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.836e+02 2.071e+02 2.360e+02 3.214e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 06:56:31,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 06:56:31,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:56:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:56:38,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 06:56:40,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:56:40,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:56:43,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 06:56:43,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:56:44,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:56:44,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 06:56:45,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 06:56:46,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1177733.3333333333, ans=0.125 2023-10-03 06:56:47,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-10-03 06:56:48,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:56:48,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 06:56:52,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1177800.0, ans=0.1 2023-10-03 06:57:00,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:57:08,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:57:09,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:09,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 06:57:12,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:13,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 06:57:13,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:57:15,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:57:16,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:57:17,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 06:57:20,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:57:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 06:57:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 06:57:31,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1177933.3333333333, ans=0.1 2023-10-03 06:57:34,245 INFO [train.py:1046] (1/4) Epoch 34, batch 1400, loss[loss=0.1466, simple_loss=0.2259, pruned_loss=0.03363, over 17220.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.239, pruned_loss=0.04026, over 4708516.15 frames. ], batch size: 37, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:57:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 06:57:36,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:38,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:57:39,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 06:57:40,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1178000.0, ans=0.2 2023-10-03 06:57:43,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 06:57:44,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 06:57:48,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.98 vs. limit=22.5 2023-10-03 06:57:54,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:57:57,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:57:58,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1178066.6666666667, ans=0.125 2023-10-03 06:57:59,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:57:59,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:58:04,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:58:04,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 06:58:13,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:58:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:17,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 06:58:17,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1178200.0, ans=0.5 2023-10-03 06:58:18,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:58:20,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:58:20,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:58:21,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:58:23,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:58:23,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:58:24,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:58:24,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 06:58:24,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 06:58:30,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:31,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:58:40,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 06:58:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:58:43,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:58:44,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 06:58:44,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:58:47,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:58:48,819 INFO [train.py:1046] (1/4) Epoch 34, batch 1450, loss[loss=0.1631, simple_loss=0.2581, pruned_loss=0.03409, over 24456.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2389, pruned_loss=0.03986, over 4710585.44 frames. ], batch size: 69, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:58:50,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:58:53,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 06:58:53,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:53,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 06:58:57,698 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.834e+02 2.006e+02 2.217e+02 3.059e+02, threshold=4.013e+02, percent-clipped=0.0 2023-10-03 06:58:59,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:58:59,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:58:59,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-03 06:59:00,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:59:00,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 06:59:01,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:59:01,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 06:59:03,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 06:59:03,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:03,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 06:59:04,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:59:05,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:59:05,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:59:05,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:08,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:59:09,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:12,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:15,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:59:15,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:59:16,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 06:59:16,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:21,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:21,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:59:21,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:22,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:25,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 06:59:27,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.90 vs. limit=15.0 2023-10-03 06:59:28,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:59:30,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 06:59:30,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1178466.6666666667, ans=0.2 2023-10-03 06:59:32,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:59:33,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 06:59:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:59:37,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 06:59:42,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:43,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 06:59:44,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1178533.3333333333, ans=0.125 2023-10-03 06:59:45,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 06:59:45,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:59:49,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:59:49,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:59:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 06:59:53,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 06:59:53,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. 
Duration: 0.55 2023-10-03 06:59:54,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:56,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:00:02,345 INFO [train.py:1046] (1/4) Epoch 34, batch 1500, loss[loss=0.1688, simple_loss=0.2492, pruned_loss=0.04421, over 23463.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2397, pruned_loss=0.04055, over 4702416.14 frames. ], batch size: 106, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:00:05,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 07:00:06,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:00:06,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:00:08,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:00:09,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:00:09,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:00:11,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. 
Duration: 0.65 2023-10-03 07:00:11,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1178666.6666666667, ans=0.125 2023-10-03 07:00:13,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:00:14,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:00:14,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:00:14,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1178666.6666666667, ans=0.125 2023-10-03 07:00:15,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.56 vs. limit=15.0 2023-10-03 07:00:16,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:00:17,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:00:18,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:00:23,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:00:24,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. 
Duration: 0.87 2023-10-03 07:00:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:00:25,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:00:26,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:00:30,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 07:00:32,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1178800.0, ans=0.0 2023-10-03 07:00:33,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 07:00:33,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:00:34,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 07:00:37,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:00:39,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:00:40,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:00:40,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 07:00:42,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 07:00:42,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1178800.0, ans=0.125 2023-10-03 07:00:43,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:00:43,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:00:45,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 07:00:45,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:00:45,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1178866.6666666667, ans=0.125 2023-10-03 07:00:49,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:00:49,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 07:00:55,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:00:55,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1178866.6666666667, ans=0.09899494936611666 2023-10-03 07:00:56,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 07:00:58,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 07:00:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:00:59,886 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 07:01:02,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:03,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:01:03,912 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 07:01:05,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:01:08,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 07:01:09,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:14,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:01:14,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:15,349 INFO [train.py:1046] (1/4) Epoch 34, batch 1550, loss[loss=0.1492, simple_loss=0.2388, pruned_loss=0.02975, over 24467.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2402, pruned_loss=0.04081, over 4707924.27 frames. 
], batch size: 69, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:01:15,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:01:15,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:17,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:01:17,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 07:01:18,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 07:01:18,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:01:20,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 07:01:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 07:01:22,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:01:23,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:01:24,227 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.899e+02 2.097e+02 2.316e+02 2.849e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 07:01:24,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:01:24,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:01:24,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:25,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:28,693 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 07:01:28,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:28,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:01:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:01:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:01:32,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 07:01:34,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 07:01:34,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 07:01:36,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 07:01:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 07:01:38,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:38,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:01:41,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1179066.6666666667, ans=0.2 2023-10-03 07:01:42,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:01:45,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 07:01:45,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 07:01:52,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:01:52,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1179133.3333333333, ans=0.0 2023-10-03 07:01:55,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:01:56,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:01:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:01:56,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 07:01:57,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1179133.3333333333, ans=10.0 2023-10-03 07:01:59,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1179200.0, ans=0.1 2023-10-03 07:02:04,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:02:04,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1179200.0, ans=0.125 2023-10-03 07:02:05,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:07,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.59 vs. limit=15.0 2023-10-03 07:02:08,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:02:10,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 07:02:10,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:02:10,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 07:02:10,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:02:14,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:02:14,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:14,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 07:02:14,462 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 07:02:17,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:22,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 07:02:28,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:02:29,301 INFO [train.py:1046] (1/4) Epoch 34, batch 1600, loss[loss=0.1897, simple_loss=0.2587, pruned_loss=0.06034, over 22766.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2405, pruned_loss=0.04104, over 4700012.62 frames. 
], batch size: 322, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 07:02:29,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:30,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 07:02:32,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:02:33,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:02:33,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:02:33,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:02:34,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:02:38,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:38,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 07:02:38,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 07:02:41,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 07:02:43,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:02:44,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 07:02:44,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:02:47,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:02:48,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1179400.0, ans=0.0 2023-10-03 07:02:48,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1179400.0, ans=0.125 2023-10-03 07:02:49,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.76 vs. limit=22.5 2023-10-03 07:02:52,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:02:55,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 07:02:58,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:02:59,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 07:02:59,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:59,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. 
Duration: 0.96 2023-10-03 07:03:01,538 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.34 vs. limit=12.0 2023-10-03 07:03:03,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 07:03:12,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:03:14,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 07:03:14,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:03:14,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:03:14,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:03:17,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 07:03:22,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:03:25,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:03:25,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:26,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:03:26,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:03:28,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:03:28,947 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.05 vs. limit=15.0 2023-10-03 07:03:29,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:03:30,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:03:34,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.53 vs. limit=12.0 2023-10-03 07:03:36,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:37,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:03:37,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1179600.0, ans=0.0 2023-10-03 07:03:39,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 07:03:39,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 07:03:39,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 07:03:44,174 INFO [train.py:1046] (1/4) Epoch 34, batch 1650, loss[loss=0.1673, simple_loss=0.248, pruned_loss=0.04328, over 23713.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2415, pruned_loss=0.04141, over 4707413.04 frames. ], batch size: 149, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 07:03:45,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:03:45,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:03:46,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:03:46,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 07:03:46,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 07:03:46,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 07:03:47,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 07:03:51,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:52,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:03:52,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:03:52,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1179666.6666666667, ans=0.1 2023-10-03 07:03:54,538 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.876e+02 2.077e+02 2.336e+02 3.555e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-03 07:03:54,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:03:57,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:03:58,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 07:04:00,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:04:00,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:04:00,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:04:00,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:04:01,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. 
Duration: 0.73 2023-10-03 07:04:01,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 07:04:07,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:04:09,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:04:13,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1179800.0, ans=0.125 2023-10-03 07:04:15,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1179800.0, ans=0.0 2023-10-03 07:04:18,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 07:04:19,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 07:04:23,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:25,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:04:27,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:04:27,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:04:28,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:04:28,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:30,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1179866.6666666667, ans=0.0 2023-10-03 07:04:31,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:04:32,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:32,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:04:32,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:04:34,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:04:35,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:04:39,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:04:41,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 07:04:43,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 07:04:43,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 07:04:44,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 07:04:44,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 07:04:44,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:04:46,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:04:46,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:46,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:46,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 07:04:50,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:52,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:04:52,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:55,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. 
Duration: 0.96 2023-10-03 07:04:58,425 INFO [train.py:1046] (1/4) Epoch 34, batch 1700, loss[loss=0.166, simple_loss=0.235, pruned_loss=0.04846, over 23774.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2402, pruned_loss=0.04138, over 4691717.20 frames. ], batch size: 164, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:04:59,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:59,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:04:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 07:05:00,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1180000.0, ans=0.95 2023-10-03 07:05:01,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:05:01,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:05:01,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:05:04,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:05:04,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:05:04,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. 
Duration: 0.92 2023-10-03 07:05:06,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:05:11,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1180066.6666666667, ans=0.0 2023-10-03 07:05:13,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:05:14,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.53 vs. limit=10.0 2023-10-03 07:05:17,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:05:21,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1180066.6666666667, ans=0.1 2023-10-03 07:05:22,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:05:22,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:05:23,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:05:23,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:05:26,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. 
Duration: 0.94 2023-10-03 07:05:28,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:05:28,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:29,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:05:29,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:05:32,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 07:05:32,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 07:05:32,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:35,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 07:05:35,764 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.07 vs. limit=22.5 2023-10-03 07:05:36,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:05:43,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1180200.0, ans=0.5 2023-10-03 07:05:46,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:05:46,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:05:47,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:05:49,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:05:49,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 07:05:49,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:05:52,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:52,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 07:05:52,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:05:52,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:05:52,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 07:05:52,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1180200.0, ans=0.0 2023-10-03 07:05:55,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:05:55,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:05:55,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:05:57,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:05:57,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:05:57,300 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:06:01,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:02,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 07:06:02,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1180266.6666666667, ans=0.0 2023-10-03 07:06:04,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:06,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:06:08,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 07:06:12,499 INFO [train.py:1046] (1/4) Epoch 34, batch 1750, loss[loss=0.1472, simple_loss=0.2243, pruned_loss=0.03509, over 23737.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2385, pruned_loss=0.04109, over 4699072.05 frames. ], batch size: 232, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:06:15,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:16,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1180333.3333333333, ans=0.04949747468305833 2023-10-03 07:06:17,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:06:17,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:06:17,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 07:06:19,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:06:20,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:06:20,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:06:23,234 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.852e+02 2.086e+02 2.272e+02 3.301e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 07:06:24,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 07:06:25,073 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:06:26,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:06:26,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1180400.0, ans=0.2 2023-10-03 07:06:28,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=12.0 2023-10-03 07:06:29,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 07:06:29,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:06:30,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:06:34,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:06:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 07:06:38,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:06:39,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 07:06:48,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:06:51,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:06:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:55,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:55,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:57,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:58,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:59,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1180533.3333333333, ans=0.04949747468305833 2023-10-03 07:07:01,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:07:01,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:07:03,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 07:07:04,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:07:06,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 07:07:07,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:07:10,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:07:10,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:07:14,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:07:15,876 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.19 vs. limit=22.5 2023-10-03 07:07:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 07:07:16,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:18,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:07:20,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 07:07:23,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:07:24,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:07:25,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 07:07:25,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:07:26,717 INFO [train.py:1046] (1/4) Epoch 34, batch 1800, loss[loss=0.172, simple_loss=0.2531, pruned_loss=0.04545, over 24032.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2386, pruned_loss=0.04073, over 4711426.18 frames. ], batch size: 86, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:07:26,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:07:26,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:26,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:07:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:07:28,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:07:31,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 07:07:32,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:34,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:07:37,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:07:38,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:07:39,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:07:43,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:07:44,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=22.5 2023-10-03 07:07:45,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:45,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:46,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1180733.3333333333, ans=0.0 2023-10-03 07:07:47,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 07:07:47,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:07:47,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 07:07:49,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:53,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:56,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 07:07:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 07:07:59,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 07:08:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:08:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:08:02,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:08:08,300 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. 
Duration: 0.896 2023-10-03 07:08:09,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:08:11,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:12,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 07:08:12,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 07:08:12,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:08:14,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:08:16,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:08:21,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 07:08:22,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1180866.6666666667, ans=0.125 2023-10-03 07:08:24,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1180866.6666666667, ans=0.1 2023-10-03 07:08:25,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:08:25,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. 
Duration: 0.5 2023-10-03 07:08:26,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:08:26,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:26,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:08:28,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 07:08:31,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:08:31,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:08:35,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 07:08:35,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:36,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:08:36,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:08:36,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:39,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:08:39,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:08:40,835 INFO [train.py:1046] (1/4) Epoch 34, batch 1850, loss[loss=0.1595, simple_loss=0.2456, pruned_loss=0.03666, over 24291.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2392, pruned_loss=0.04072, over 4708454.55 frames. ], batch size: 61, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:08:42,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:08:42,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:08:44,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:08:45,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:08:47,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1181000.0, ans=0.0 2023-10-03 07:08:51,418 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.930e+02 2.238e+02 2.527e+02 4.034e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-03 07:08:51,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1181000.0, ans=0.125 2023-10-03 07:08:54,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:08:54,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. 
Duration: 0.89 2023-10-03 07:08:57,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 07:09:00,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 07:09:03,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:09:03,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 07:09:03,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 07:09:08,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1181066.6666666667, ans=0.04949747468305833 2023-10-03 07:09:13,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:09:15,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 07:09:17,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:09:18,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:09:20,588 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.23 vs. limit=12.0 2023-10-03 07:09:21,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-10-03 07:09:22,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:22,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:09:24,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:09:25,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1181200.0, ans=0.2 2023-10-03 07:09:26,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:09:29,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:09:31,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:09:32,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:32,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:09:32,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:09:33,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:09:35,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.10 vs. limit=15.0 2023-10-03 07:09:36,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:09:36,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1181200.0, ans=0.125 2023-10-03 07:09:38,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 07:09:38,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1181200.0, ans=0.125 2023-10-03 07:09:39,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:09:42,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:09:43,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:09:43,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 07:09:43,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 07:09:47,422 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 07:09:48,589 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-03 07:09:48,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:09:48,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:09:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:09:50,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:50,123 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 07:09:50,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:09:51,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:52,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:09:54,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:09:54,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1181333.3333333333, ans=0.125 2023-10-03 07:09:56,154 INFO [train.py:1046] (1/4) Epoch 34, batch 1900, loss[loss=0.1508, simple_loss=0.2209, pruned_loss=0.04034, over 23548.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04154, over 4684566.38 frames. 
], batch size: 149, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:09:57,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:09:57,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 07:10:00,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:10:00,396 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 07:10:00,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:10:01,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:10:05,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:10:07,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:10:09,256 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 07:10:10,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 07:10:10,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:10:12,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:10:12,042 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 07:10:12,072 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 07:10:16,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 07:10:16,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:10:19,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 07:10:21,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 07:10:22,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1181400.0, ans=0.125 2023-10-03 07:10:31,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 07:10:34,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 07:10:34,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:10:35,648 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 07:10:35,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 07:10:35,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. 
Duration: 0.98 2023-10-03 07:10:35,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 07:10:35,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:10:40,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 07:10:43,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:10:47,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:10:47,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 07:10:49,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:10:53,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 07:10:53,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:10:58,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:10:58,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:11:00,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 07:11:00,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:11:01,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:11:01,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:11:02,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:11:06,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:11:06,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:11:09,533 INFO [train.py:1046] (1/4) Epoch 34, batch 1950, loss[loss=0.1676, simple_loss=0.2593, pruned_loss=0.03793, over 24419.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2418, pruned_loss=0.04175, over 4686466.79 frames. ], batch size: 69, lr: 3.00e-03, grad_scale: 8.0 2023-10-03 07:11:09,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:11:09,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:11:09,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:11:11,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:11:14,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:11:16,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:11:16,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:16,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:11:16,675 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.07 vs. limit=10.0 2023-10-03 07:11:19,957 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.878e+02 2.003e+02 2.217e+02 3.435e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-03 07:11:20,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 07:11:20,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 07:11:20,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:22,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:25,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 07:11:25,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:11:25,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:28,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:11:31,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:11:31,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:11:32,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:11:32,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:35,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:39,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:11:39,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:11:39,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:11:39,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-10-03 07:11:41,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:11:41,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:11:42,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:47,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:48,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:11:52,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:11:53,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1181866.6666666667, ans=0.1 2023-10-03 07:11:54,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:11:56,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:11:56,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 07:11:56,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:11:59,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:12:00,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:12:02,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:12:08,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:08,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:10,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:12,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:12:13,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:12:15,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:12:15,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 07:12:15,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:12:17,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:12:17,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. 
Duration: 0.75 2023-10-03 07:12:19,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:12:24,235 INFO [train.py:1046] (1/4) Epoch 34, batch 2000, loss[loss=0.1563, simple_loss=0.2482, pruned_loss=0.03224, over 24524.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2421, pruned_loss=0.0416, over 4699498.36 frames. ], batch size: 71, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:12:24,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:12:25,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:12:26,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:12:27,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1182000.0, ans=0.125 2023-10-03 07:12:29,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:12:29,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:33,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 07:12:34,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:12:36,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:12:37,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 07:12:37,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:12:37,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:12:40,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff3.min_abs, batch_count=1182066.6666666667, ans=0.2 2023-10-03 07:12:41,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:12:41,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 07:12:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:44,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:46,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 07:12:46,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:12:48,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. 
Duration: 0.93 2023-10-03 07:12:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:12:53,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:12:53,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 07:12:53,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:55,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:12:55,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:12:56,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 07:13:00,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 07:13:00,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:13:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:00,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.68 vs. limit=15.0 2023-10-03 07:13:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:13:06,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:13:06,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:13:06,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:13:09,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:13:10,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:12,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:13:12,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:13,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:16,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:13:16,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 07:13:19,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:13:21,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:13:26,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:26,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:13:29,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:31,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:13:31,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:33,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:13:33,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:13:37,624 INFO [train.py:1046] (1/4) Epoch 34, batch 2050, loss[loss=0.1531, simple_loss=0.2305, pruned_loss=0.03782, over 23565.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2403, pruned_loss=0.04133, over 4692933.96 frames. ], batch size: 134, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:13:37,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:37,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:13:37,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1182333.3333333333, ans=0.1 2023-10-03 07:13:40,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:13:42,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:44,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:13:46,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:13:47,557 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.844e+02 2.070e+02 2.334e+02 4.430e+02, threshold=4.140e+02, percent-clipped=1.0 2023-10-03 07:13:47,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:47,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:13:50,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 07:13:50,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:13:52,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:13:52,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:13:57,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1182400.0, ans=0.125 2023-10-03 07:13:59,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1182400.0, ans=0.0 2023-10-03 07:14:02,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:14:03,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:14:04,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1182400.0, ans=0.0 2023-10-03 07:14:05,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 07:14:07,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:14:09,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 07:14:10,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:14:13,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:14:14,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:14:16,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:14:16,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:14:17,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:14:17,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:14:19,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:14:22,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:25,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:14:27,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:14:28,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:14:34,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 07:14:35,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1182533.3333333333, ans=0.125 2023-10-03 07:14:37,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:14:38,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1182600.0, ans=0.125 2023-10-03 07:14:39,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 07:14:44,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:14:44,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:14:46,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1182600.0, ans=0.1 2023-10-03 07:14:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:14:49,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 07:14:50,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1182666.6666666667, ans=0.2 2023-10-03 07:14:52,172 INFO [train.py:1046] (1/4) Epoch 34, batch 2100, loss[loss=0.1375, simple_loss=0.1876, pruned_loss=0.04373, over 19401.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2386, pruned_loss=0.04086, over 4699725.77 frames. 
], batch size: 390, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:14:54,097 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 07:14:54,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:14:54,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:54,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:14:54,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:14:54,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 07:14:56,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 07:14:57,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:15:00,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:15:02,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:15:04,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:04,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:15:05,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 07:15:06,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:15:06,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 07:15:06,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 07:15:08,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:09,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:15:09,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 07:15:09,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 07:15:11,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1182733.3333333333, ans=0.125 2023-10-03 07:15:14,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 07:15:14,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 07:15:16,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1182733.3333333333, ans=0.125 2023-10-03 07:15:17,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:15:19,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:15:23,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:15:23,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 07:15:24,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:24,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 07:15:25,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 07:15:25,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:25,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 07:15:25,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 07:15:25,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.81 vs. 
limit=22.5 2023-10-03 07:15:27,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 07:15:27,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-10-03 07:15:28,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:15:30,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:15:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:15:34,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:15:35,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:36,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:36,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 07:15:36,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 07:15:37,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1182866.6666666667, ans=0.125 2023-10-03 07:15:38,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:38,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:38,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 07:15:39,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 07:15:40,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 07:15:43,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:15:47,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:15:47,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 07:15:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:55,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:15:55,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:15:55,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:15:55,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 07:15:55,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:15:57,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:57,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:15:59,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:15:59,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:00,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 07:16:02,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 07:16:02,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:05,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:16:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 07:16:06,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:16:06,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:16:08,083 INFO [train.py:1046] (1/4) Epoch 34, batch 2150, loss[loss=0.1649, simple_loss=0.2334, pruned_loss=0.04817, over 23716.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2389, pruned_loss=0.04091, over 4711266.16 frames. ], batch size: 232, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:16:13,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 07:16:14,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.09 vs. limit=15.0 2023-10-03 07:16:14,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:16,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:17,650 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.865e+02 1.979e+02 2.220e+02 3.230e+02, threshold=3.959e+02, percent-clipped=0.0 2023-10-03 07:16:17,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:16:17,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:19,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 07:16:21,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:22,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:16:22,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:16:26,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:26,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 07:16:29,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:31,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:16:32,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:33,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:33,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:35,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:16:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:16:36,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:16:37,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:16:37,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 07:16:40,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:16:41,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:43,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:43,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:16:44,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:16:47,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:47,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:16:49,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:49,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. 
Duration: 0.89 2023-10-03 07:16:49,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:16:52,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:52,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:53,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:55,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:16:56,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:16:56,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:56,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 07:16:59,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 07:16:59,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:17:00,684 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 07:17:00,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:17:00,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:17:02,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 07:17:02,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:17:02,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 07:17:02,712 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 07:17:02,712 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 07:17:04,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 07:17:06,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:06,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:17:06,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:17:07,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:07,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-03 07:17:10,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:17,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:17:18,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 07:17:21,361 INFO [train.py:1046] (1/4) Epoch 34, batch 2200, loss[loss=0.1835, simple_loss=0.2534, pruned_loss=0.05675, over 23763.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04089, over 4700573.21 frames. ], batch size: 164, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:17:23,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:17:28,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:28,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:17:30,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:17:30,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:17:31,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:17:31,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:17:32,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 07:17:35,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1183400.0, ans=0.1 2023-10-03 07:17:38,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 07:17:39,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:17:40,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1183400.0, ans=0.0 2023-10-03 07:17:45,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 07:17:46,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:46,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:17:48,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:17:51,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:17:51,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. 
Duration: 0.94 2023-10-03 07:17:55,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:17:55,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:55,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1183466.6666666667, ans=0.125 2023-10-03 07:17:57,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 07:18:00,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:18:02,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:03,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:18:05,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:08,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 07:18:08,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:11,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 07:18:13,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 07:18:13,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:18:13,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:14,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1183533.3333333333, ans=0.1 2023-10-03 07:18:16,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:18:16,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:16,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:18,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:19,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:18:19,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:18:22,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:18:23,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-03 07:18:24,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:18:25,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1183600.0, ans=0.1 2023-10-03 07:18:26,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:18:28,242 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 07:18:29,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:18:29,754 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 07:18:31,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:18:31,162 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 07:18:34,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:18:35,753 INFO [train.py:1046] (1/4) Epoch 34, batch 2250, loss[loss=0.164, simple_loss=0.2374, pruned_loss=0.04537, over 23481.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2388, pruned_loss=0.04047, over 4716625.91 frames. ], batch size: 134, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:18:35,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:18:35,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:18:39,649 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 07:18:41,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:18:42,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:18:45,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:18:46,636 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.861e+02 2.051e+02 2.381e+02 3.141e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-03 07:18:47,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:18:50,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:18:51,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:18:53,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:18:54,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 07:18:54,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:54,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 07:18:57,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 07:18:57,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:57,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:18:59,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:19:02,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1183733.3333333333, ans=0.125 2023-10-03 07:19:07,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:19:09,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:19:09,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:19:11,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 07:19:12,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:19:13,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 07:19:15,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1183800.0, ans=0.1 2023-10-03 07:19:18,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:19:19,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:19:20,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:19:20,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:19:22,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:19:25,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:19:29,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:19:31,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:19:35,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 07:19:35,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1183933.3333333333, ans=0.2 2023-10-03 07:19:37,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:19:37,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:19:42,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:19:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:19:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 07:19:45,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:45,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:19:46,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1183933.3333333333, ans=0.2 2023-10-03 07:19:47,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 07:19:51,021 INFO [train.py:1046] (1/4) Epoch 34, batch 2300, loss[loss=0.1617, simple_loss=0.2323, pruned_loss=0.0456, over 23698.00 frames. ], tot_loss[loss=0.161, simple_loss=0.24, pruned_loss=0.04097, over 4717569.50 frames. 
], batch size: 232, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:19:51,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:19:51,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:56,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:19:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 07:20:00,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:09,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:20:09,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:20:09,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:10,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:10,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 07:20:10,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 07:20:14,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:20:14,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:20:17,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:20:18,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:20:22,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:20:24,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1184133.3333333333, ans=0.2 2023-10-03 07:20:25,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:20:26,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:29,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:20:32,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:20:35,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:20:37,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 07:20:37,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:20:37,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 07:20:41,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:20:41,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:41,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:20:43,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:20:43,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:20:43,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 07:20:43,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:20:45,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 07:20:45,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:20:45,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:20:45,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 07:20:49,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:20:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:20:55,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1184266.6666666667, ans=0.0 2023-10-03 07:20:56,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:20:57,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:20:57,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:21:00,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:21:00,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:21:00,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:21:01,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 07:21:05,071 INFO [train.py:1046] (1/4) Epoch 34, batch 2350, loss[loss=0.1487, simple_loss=0.2249, pruned_loss=0.03625, over 24315.00 frames. 
], tot_loss[loss=0.1616, simple_loss=0.2408, pruned_loss=0.04119, over 4720480.18 frames. ], batch size: 56, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:21:08,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:21:08,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 07:21:12,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1184333.3333333333, ans=0.0 2023-10-03 07:21:13,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 07:21:16,333 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.838e+02 1.985e+02 2.277e+02 3.152e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 07:21:17,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:21:20,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:20,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:21,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:21:21,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:21:23,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-10-03 07:21:26,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:21:30,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 07:21:31,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:21:34,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:21:34,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:21:36,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:21:39,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 07:21:39,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:21:42,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:21:42,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:21:42,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:21:46,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 07:21:46,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 07:21:47,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:21:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:50,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:21:51,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 07:21:53,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:21:55,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 07:21:55,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:22:00,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 07:22:04,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 07:22:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:22:05,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 07:22:06,932 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 07:22:06,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 07:22:08,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 07:22:11,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:22:16,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:22:19,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:22:20,817 INFO [train.py:1046] (1/4) Epoch 34, batch 2400, loss[loss=0.1504, simple_loss=0.2366, pruned_loss=0.0321, over 24663.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2395, pruned_loss=0.04082, over 4710691.97 frames. ], batch size: 65, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:22:22,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:22:23,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 07:22:23,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 07:22:29,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-03 07:22:29,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:22:29,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1184666.6666666667, ans=0.125 2023-10-03 07:22:31,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 07:22:31,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:22:33,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:33,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 07:22:40,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 07:22:46,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:22:52,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 07:22:54,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:22:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:22:55,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1184800.0, ans=0.2 2023-10-03 07:22:58,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:22:59,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 07:23:00,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:23:05,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1184866.6666666667, ans=0.125 2023-10-03 07:23:09,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:11,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:23:14,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:14,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:23:15,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:23:15,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:23:15,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:23:17,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:23:17,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:23:21,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:23:21,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:23:21,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 07:23:23,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 07:23:24,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.54 vs. limit=6.0 2023-10-03 07:23:26,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:23:26,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:26,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1184933.3333333333, ans=0.125 2023-10-03 07:23:27,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 07:23:27,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. 
Duration: 0.91 2023-10-03 07:23:27,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1184933.3333333333, ans=0.125 2023-10-03 07:23:28,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 07:23:28,735 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 07:23:28,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 07:23:30,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:23:31,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:23:31,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:23:32,982 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 07:23:34,290 INFO [train.py:1046] (1/4) Epoch 34, batch 2450, loss[loss=0.1561, simple_loss=0.2437, pruned_loss=0.03421, over 24313.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2384, pruned_loss=0.04059, over 4704305.82 frames. ], batch size: 74, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:23:34,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:23:34,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 07:23:35,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1185000.0, ans=0.125 2023-10-03 07:23:37,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:23:37,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:23:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:41,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:23:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 07:23:47,260 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.892e+02 2.133e+02 2.562e+02 4.061e+02, threshold=4.265e+02, percent-clipped=1.0 2023-10-03 07:23:47,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:23:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:47,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1185000.0, ans=0.125 2023-10-03 07:23:50,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 07:23:50,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:23:50,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:23:50,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 07:23:52,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1185066.6666666667, ans=0.2 2023-10-03 07:23:54,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:56,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:23:57,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:24:01,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:24:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:03,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:03,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:24:03,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1185133.3333333333, ans=10.0 2023-10-03 07:24:04,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 07:24:04,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:24:12,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:12,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:24:14,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:24:14,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:24:14,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:15,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:24:17,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 07:24:21,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 07:24:21,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1185200.0, ans=0.0 2023-10-03 07:24:22,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:24:25,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:24:25,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:24:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:24:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 07:24:31,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:24:32,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:24:32,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 07:24:32,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:24:34,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:24:37,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 07:24:40,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:41,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:24:43,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.16 vs. limit=12.0 2023-10-03 07:24:44,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 07:24:47,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:24:49,891 INFO [train.py:1046] (1/4) Epoch 34, batch 2500, loss[loss=0.1647, simple_loss=0.2514, pruned_loss=0.03898, over 24028.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.238, pruned_loss=0.04047, over 4710446.13 frames. ], batch size: 80, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:24:51,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:25:01,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:25:01,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:25:01,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:25:01,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-10-03 07:25:08,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:25:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:25:11,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:25:11,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:25:12,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 07:25:14,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:14,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:25:16,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 07:25:16,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:17,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 07:25:17,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:25:19,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1185466.6666666667, ans=0.125 2023-10-03 07:25:23,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:25:25,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:25:28,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:25:28,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 07:25:28,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:25:29,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:32,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:36,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:39,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 07:25:42,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1185533.3333333333, ans=0.1 2023-10-03 07:25:43,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:25:46,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 07:25:46,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:25:46,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:25:49,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:25:49,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:25:50,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 07:25:50,319 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 07:25:50,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 07:25:54,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:56,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. 
Duration: 0.97 2023-10-03 07:25:57,298 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.61 vs. limit=15.0 2023-10-03 07:25:57,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 07:25:57,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:25:59,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 07:26:03,118 INFO [train.py:1046] (1/4) Epoch 34, batch 2550, loss[loss=0.1638, simple_loss=0.2511, pruned_loss=0.0382, over 23944.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2391, pruned_loss=0.04048, over 4722578.01 frames. ], batch size: 80, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:26:03,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 07:26:03,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1185666.6666666667, ans=0.125 2023-10-03 07:26:04,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:26:04,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:26:06,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:26:07,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 07:26:09,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 07:26:09,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:26:13,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 07:26:13,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:26:14,468 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.97 vs. limit=15.0 2023-10-03 07:26:15,135 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.964e+02 2.224e+02 2.724e+02 4.382e+02, threshold=4.447e+02, percent-clipped=1.0 2023-10-03 07:26:15,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:15,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1185666.6666666667, ans=0.0 2023-10-03 07:26:18,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:26:18,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 07:26:18,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:26:18,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:26:19,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.32 vs. limit=15.0 2023-10-03 07:26:20,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:26:21,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:26:21,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 07:26:21,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:26:21,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:23,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 07:26:25,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1185733.3333333333, ans=0.125 2023-10-03 07:26:36,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:26:40,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:26:40,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:42,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:26:43,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:26:49,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:26:52,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:26:54,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:26:54,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:26:54,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:26:54,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:26:57,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:26:57,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:00,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1185866.6666666667, ans=0.0 2023-10-03 07:27:02,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:27:02,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-03 07:27:02,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:27:02,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1185933.3333333333, ans=0.2 2023-10-03 07:27:02,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1185933.3333333333, ans=0.125 2023-10-03 07:27:03,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:04,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:27:04,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:27:05,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:11,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:27:11,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1185933.3333333333, ans=0.125 2023-10-03 07:27:13,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:14,928 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 07:27:16,944 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. 
Duration: 0.896 2023-10-03 07:27:16,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:27:18,265 INFO [train.py:1046] (1/4) Epoch 34, batch 2600, loss[loss=0.1644, simple_loss=0.2364, pruned_loss=0.04615, over 23454.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2401, pruned_loss=0.0412, over 4713456.92 frames. ], batch size: 285, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:27:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 07:27:19,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 07:27:19,677 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 07:27:22,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:27:22,909 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 07:27:24,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 07:27:24,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1186000.0, ans=0.0 2023-10-03 07:27:26,172 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 07:27:27,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:27:29,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. 
Duration: 0.82 2023-10-03 07:27:30,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 07:27:31,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:27:31,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 07:27:33,303 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 07:27:33,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 07:27:33,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1186066.6666666667, ans=0.125 2023-10-03 07:27:38,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:27:38,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:38,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:27:38,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 07:27:40,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1186066.6666666667, ans=0.05 2023-10-03 07:27:41,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 07:27:42,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1186066.6666666667, ans=0.1 2023-10-03 07:27:46,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1186133.3333333333, ans=0.125 2023-10-03 07:27:47,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 07:27:54,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:54,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1186133.3333333333, ans=0.2 2023-10-03 07:27:55,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:27:57,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 07:27:57,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:57,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:27:58,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 07:27:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:28:00,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:28:01,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 07:28:05,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:05,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:28:08,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:28:09,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:28:09,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 07:28:13,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:28:15,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:28:16,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:28:21,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 07:28:22,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:28:23,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:28:28,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 07:28:28,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:28,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:28:29,650 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 07:28:29,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:28:32,269 INFO [train.py:1046] (1/4) Epoch 34, batch 2650, loss[loss=0.1932, simple_loss=0.2634, pruned_loss=0.06153, over 22850.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2412, pruned_loss=0.04173, over 4701674.37 frames. ], batch size: 322, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:28:32,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:34,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.03 vs. limit=22.5 2023-10-03 07:28:35,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:28:36,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 07:28:39,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:40,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 07:28:40,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:28:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:28:43,949 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.874e+02 2.143e+02 2.497e+02 3.678e+02, threshold=4.285e+02, percent-clipped=0.0 2023-10-03 07:28:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 07:28:47,246 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 07:28:49,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:28:51,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 07:28:52,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:28:53,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 07:28:57,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:28:57,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:28:57,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:28:57,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:00,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1186466.6666666667, ans=0.0 2023-10-03 07:29:03,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 07:29:03,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 07:29:05,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1186466.6666666667, ans=0.125 2023-10-03 07:29:06,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:29:09,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 07:29:09,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:29:09,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:09,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 07:29:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:29:10,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:29:11,096 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.63 vs. limit=6.0 2023-10-03 07:29:11,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:29:15,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:29:16,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:29:16,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:29:18,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.28 vs. limit=22.5 2023-10-03 07:29:19,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:29:20,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 07:29:22,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:25,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:29:25,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:29:29,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:30,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:29:30,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:30,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 07:29:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:29:37,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:38,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:38,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:39,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 07:29:39,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:29:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 07:29:45,423 INFO [train.py:1046] (1/4) Epoch 34, batch 2700, loss[loss=0.1792, simple_loss=0.2576, pruned_loss=0.05046, over 23895.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2419, pruned_loss=0.04188, over 4701248.01 frames. ], batch size: 86, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:29:45,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:29:48,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 07:29:51,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:29:51,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:51,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:51,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1186666.6666666667, ans=0.125 2023-10-03 07:29:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 07:29:53,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:53,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:29:53,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:29:53,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 07:29:55,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:29:56,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:29:57,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:29:57,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1186666.6666666667, ans=0.125 2023-10-03 07:29:59,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:30:02,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:30:03,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 07:30:03,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 07:30:05,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1186733.3333333333, ans=0.1 2023-10-03 07:30:08,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:30:08,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:09,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1186733.3333333333, ans=0.0 2023-10-03 07:30:13,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:30:13,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:30:13,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:30:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:30:17,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:30:19,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:30:19,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 07:30:19,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:30:26,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1186800.0, ans=0.125 2023-10-03 07:30:27,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:27,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:30:29,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.60 vs. limit=22.5 2023-10-03 07:30:33,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:30:33,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:30:34,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1186866.6666666667, ans=0.95 2023-10-03 07:30:37,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:30:37,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:40,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 07:30:42,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:30:43,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:30:44,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:30:46,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:46,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:30:47,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:30:49,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:49,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:52,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 07:30:53,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:55,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:30:55,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-10-03 07:30:56,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 07:30:56,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:58,703 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.51 vs. limit=15.0 2023-10-03 07:30:59,384 INFO [train.py:1046] (1/4) Epoch 34, batch 2750, loss[loss=0.1698, simple_loss=0.2432, pruned_loss=0.04824, over 23532.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2409, pruned_loss=0.04155, over 4700809.45 frames. ], batch size: 119, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:30:59,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:30:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:31:02,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:02,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:31:03,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:06,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:06,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-03 07:31:07,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:31:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 07:31:07,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:31:07,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:31:10,304 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.859e+02 2.069e+02 2.359e+02 3.471e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 07:31:13,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1187066.6666666667, ans=0.125 2023-10-03 07:31:17,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 07:31:19,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:31:19,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:31:20,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 07:31:21,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:31:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:31:23,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:23,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:28,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:31:28,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:31:29,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:31:30,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:31,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:31:38,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:39,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 07:31:40,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:43,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:43,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:31:43,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:31:49,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:31:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:31:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 07:31:53,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:55,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1187200.0, ans=0.09899494936611666 2023-10-03 07:31:56,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. 
Duration: 0.67 2023-10-03 07:31:58,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1187266.6666666667, ans=0.125 2023-10-03 07:32:00,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:32:01,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1187266.6666666667, ans=0.125 2023-10-03 07:32:03,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:32:03,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 07:32:04,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:32:06,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:32:06,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 07:32:06,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:32:07,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 07:32:07,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1187266.6666666667, ans=0.125 2023-10-03 07:32:09,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:09,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:09,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 07:32:09,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:11,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:12,369 INFO [train.py:1046] (1/4) Epoch 34, batch 2800, loss[loss=0.1769, simple_loss=0.2635, pruned_loss=0.04519, over 24021.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2395, pruned_loss=0.04088, over 4702624.66 frames. ], batch size: 86, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:32:12,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:13,821 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 07:32:13,822 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 07:32:16,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:32:20,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:32:20,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:32:23,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:32:25,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 07:32:26,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 07:32:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 07:32:28,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:29,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:32:29,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:32:33,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:32:33,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:33,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 07:32:35,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:32:41,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:32:42,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1187466.6666666667, ans=0.125 2023-10-03 07:32:44,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:46,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:46,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:32:48,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:32:48,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1187466.6666666667, ans=0.125 2023-10-03 07:32:52,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:32:52,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 07:32:54,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:32:54,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:32:54,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:32:58,297 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.75 vs. limit=15.0 2023-10-03 07:32:58,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:59,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:01,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:33:04,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:33:04,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:04,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:33:05,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:33:05,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 07:33:07,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:33:07,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 07:33:07,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:08,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:33:08,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:08,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1187533.3333333333, ans=0.0 2023-10-03 07:33:10,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 07:33:12,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:33:12,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:33:13,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:33:14,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 07:33:22,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 07:33:22,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:33:23,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:33:25,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:33:27,108 INFO [train.py:1046] (1/4) Epoch 34, batch 2850, loss[loss=0.1742, simple_loss=0.2515, pruned_loss=0.0484, over 24015.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2383, pruned_loss=0.0407, over 4701168.35 frames. ], batch size: 86, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:33:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:33:29,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:33:29,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:33:32,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:33:34,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:35,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:33:35,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. 
Duration: 0.5 2023-10-03 07:33:38,280 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.832e+02 1.949e+02 2.119e+02 3.126e+02, threshold=3.897e+02, percent-clipped=0.0 2023-10-03 07:33:41,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 07:33:41,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:33:42,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 07:33:44,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:47,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 07:33:47,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 07:33:48,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:49,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.74 vs. limit=10.0 2023-10-03 07:34:01,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:03,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:34:03,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 07:34:03,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:34:03,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:34:03,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:34:06,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:34:06,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 07:34:06,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1187800.0, ans=0.125 2023-10-03 07:34:07,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:34:08,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:34:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:08,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:12,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:12,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:34:13,251 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:34:14,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:14,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:34:16,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:34:17,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:17,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:20,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:34:22,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.79 vs. limit=6.0 2023-10-03 07:34:25,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:34:25,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1187933.3333333333, ans=0.07 2023-10-03 07:34:27,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 07:34:27,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. 
Duration: 0.84 2023-10-03 07:34:28,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:34:28,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:34:29,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 07:34:31,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:34:31,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:34:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:34:32,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:34:32,645 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 07:34:32,697 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 07:34:32,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:34:32,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:34:34,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1187933.3333333333, ans=0.0 2023-10-03 07:34:36,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:34:36,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:34:38,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:34:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 07:34:40,969 INFO [train.py:1046] (1/4) Epoch 34, batch 2900, loss[loss=0.1675, simple_loss=0.252, pruned_loss=0.04149, over 23682.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2389, pruned_loss=0.04095, over 4695833.31 frames. ], batch size: 85, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:34:41,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:41,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 07:34:42,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 07:34:45,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:34:45,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 07:34:48,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:48,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:53,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:34:53,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:56,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:34:57,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 07:34:57,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:34:59,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:01,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1188066.6666666667, ans=0.125 2023-10-03 07:35:02,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 07:35:02,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 07:35:05,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:35:05,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 07:35:05,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:35:08,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:35:08,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:35:11,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:35:13,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-10-03 07:35:13,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:15,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:35:18,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:21,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 07:35:21,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 07:35:21,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 07:35:26,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:35:27,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 07:35:28,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:35:33,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1188200.0, ans=0.125 2023-10-03 07:35:34,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:43,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:35:43,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:35:43,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 07:35:46,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:46,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 07:35:48,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:35:48,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 07:35:53,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:35:53,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1188266.6666666667, ans=0.125 2023-10-03 07:35:54,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 07:35:55,839 INFO [train.py:1046] (1/4) Epoch 34, batch 2950, loss[loss=0.1762, simple_loss=0.2496, pruned_loss=0.05134, over 23754.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2392, pruned_loss=0.04097, over 4696889.21 frames. ], batch size: 212, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:35:55,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:35:55,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:35:57,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1188333.3333333333, ans=0.125 2023-10-03 07:35:59,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:36:00,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 07:36:00,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. 
Duration: 0.84 2023-10-03 07:36:00,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:36:00,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:36:05,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:36:07,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:36:08,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1188333.3333333333, ans=0.2 2023-10-03 07:36:08,912 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.911e+02 2.066e+02 2.316e+02 3.128e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 07:36:10,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:36:10,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:36:13,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:36:13,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:36:14,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 07:36:16,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:36:16,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:36:19,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 07:36:25,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 07:36:25,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1188466.6666666667, ans=0.0 2023-10-03 07:36:26,309 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 07:36:26,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:36:29,635 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 07:36:31,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 07:36:31,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:36:31,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:36:31,256 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. 
Duration: 0.896 2023-10-03 07:36:31,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:36:33,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 07:36:35,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:36:35,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:36:35,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1188466.6666666667, ans=0.125 2023-10-03 07:36:35,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1188466.6666666667, ans=0.04949747468305833 2023-10-03 07:36:38,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:36:39,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:36:39,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:39,826 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 07:36:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:36:41,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-10-03 07:36:46,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:48,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:36:48,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 07:36:49,235 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.16 vs. limit=12.0 2023-10-03 07:36:49,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:36:51,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 07:36:52,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:36:54,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:36:56,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:36:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:57,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:36:58,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 07:36:58,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:36:58,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:37:00,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:37:00,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:37:02,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:37:03,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:37:03,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 07:37:04,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:37:06,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:37:06,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:37:09,471 INFO [train.py:1046] (1/4) Epoch 34, batch 3000, loss[loss=0.1694, simple_loss=0.2511, pruned_loss=0.04379, over 24608.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2399, pruned_loss=0.0411, over 4714421.61 frames. 
], batch size: 60, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:37:09,471 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 07:37:15,397 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.1268, 5.5188, 5.7307, 5.8982], device='cuda:1') 2023-10-03 07:37:21,180 INFO [train.py:1078] (1/4) Epoch 34, validation: loss=0.3506, simple_loss=0.2704, pruned_loss=0.2154, over 1125622.00 frames. 2023-10-03 07:37:21,181 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 07:37:21,300 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 07:37:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 07:37:23,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1188666.6666666667, ans=0.1 2023-10-03 07:37:24,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:37:24,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:37:25,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 07:37:25,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:37:32,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:37:40,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 07:37:43,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1188733.3333333333, ans=0.0 2023-10-03 07:37:46,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 07:37:48,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:37:49,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:37:49,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:37:51,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:37:53,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:37:53,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 07:37:54,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 07:37:55,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:37:56,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1188800.0, ans=0.125 2023-10-03 07:37:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 07:37:59,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:37:59,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:37:59,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:37:59,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:37:59,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1188800.0, ans=0.2 2023-10-03 07:38:03,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:38:03,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:38:03,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:38:05,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:38:07,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 07:38:09,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:38:10,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:38:10,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:38:14,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:14,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:16,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 07:38:16,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 07:38:16,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:38:17,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1188866.6666666667, ans=0.0 2023-10-03 07:38:18,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 07:38:18,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:38:19,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 07:38:21,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 07:38:21,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1188933.3333333333, ans=0.0 2023-10-03 07:38:24,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:38:24,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 07:38:25,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 07:38:25,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:38:25,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:38:26,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:26,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:38:26,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:28,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:38:32,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 07:38:34,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 07:38:34,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1189000.0, ans=0.0 2023-10-03 07:38:35,344 INFO [train.py:1046] (1/4) Epoch 34, batch 3050, loss[loss=0.1557, simple_loss=0.2325, pruned_loss=0.03947, over 24664.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2416, pruned_loss=0.042, over 4709190.22 frames. ], batch size: 65, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:38:36,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:38:36,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:38:40,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:43,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 07:38:47,568 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.885e+02 2.088e+02 2.309e+02 3.994e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 07:38:50,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 07:38:50,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 07:38:52,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:38:54,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1189066.6666666667, ans=0.0 2023-10-03 07:38:56,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:38:58,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:58,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1189066.6666666667, ans=0.1 2023-10-03 07:38:59,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:38:59,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:02,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:39:02,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:39:02,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:02,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1189133.3333333333, ans=0.2 2023-10-03 07:39:04,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:39:04,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:05,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:39:07,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:11,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 07:39:11,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:39:11,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:39:15,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:39:15,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:39:16,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:39:16,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:22,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:39:22,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:27,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.97 vs. limit=15.0 2023-10-03 07:39:28,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:28,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:39:28,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:31,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:39:31,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:39:31,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:39:32,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 07:39:34,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:39:34,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 07:39:35,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 07:39:37,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:43,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:44,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:39:47,527 INFO [train.py:1046] (1/4) Epoch 34, batch 3100, loss[loss=0.1665, simple_loss=0.2445, pruned_loss=0.04423, over 24068.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2404, pruned_loss=0.04139, over 4713409.59 frames. ], batch size: 86, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:39:47,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:39:47,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1189333.3333333333, ans=0.125 2023-10-03 07:39:47,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1189333.3333333333, ans=0.125 2023-10-03 07:39:49,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 07:39:51,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 07:39:51,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. 
Duration: 0.83 2023-10-03 07:39:54,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:39:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:39:58,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:59,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 07:40:03,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:08,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 07:40:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:40:13,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:13,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:40:14,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:40:15,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 07:40:17,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:40:17,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 07:40:17,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:40:18,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:18,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 07:40:20,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:40:20,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1189466.6666666667, ans=0.0 2023-10-03 07:40:21,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:40:23,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 07:40:24,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 07:40:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:27,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:29,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:40:29,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:29,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:40:30,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:40:30,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:40:34,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:40:34,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:40:34,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:34,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 07:40:38,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:40:40,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 07:40:41,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:40:41,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-10-03 07:40:41,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:40:41,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:43,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 07:40:53,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 07:40:56,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:40:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:00,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:41:00,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:41:00,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 07:41:02,590 INFO [train.py:1046] (1/4) Epoch 34, batch 3150, loss[loss=0.166, simple_loss=0.2472, pruned_loss=0.04241, over 24493.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2388, pruned_loss=0.04108, over 4707341.93 frames. ], batch size: 63, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:41:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 07:41:02,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 07:41:04,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 07:41:05,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:07,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1189666.6666666667, ans=0.0 2023-10-03 07:41:08,621 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 07:41:10,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 07:41:10,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:41:11,413 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 07:41:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 07:41:14,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 07:41:14,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 07:41:14,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. 
Duration: 0.81 2023-10-03 07:41:14,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:14,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:41:15,390 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.879e+02 2.066e+02 2.438e+02 3.109e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 07:41:16,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:17,718 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.56 vs. limit=15.0 2023-10-03 07:41:18,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 07:41:18,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:18,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:19,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:41:19,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:41:25,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. 
Duration: 0.96 2023-10-03 07:41:25,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:41:30,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:41:30,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:41:30,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 07:41:34,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 07:41:34,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:41:35,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 07:41:35,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:41:36,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:36,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:41:37,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:41:37,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 07:41:38,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 07:41:39,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:41:39,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:42,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:41:42,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:41:42,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 07:41:43,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:41:45,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 07:41:46,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:46,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 07:41:47,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 07:41:50,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 07:41:50,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:41:52,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 07:41:52,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 07:41:54,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:55,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:41:56,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:58,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:42:03,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:42:03,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:05,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-10-03 07:42:07,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1189933.3333333333, ans=0.125 2023-10-03 07:42:10,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.09 vs. limit=10.0 2023-10-03 07:42:11,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:42:11,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:42:13,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:15,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:42:15,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 07:42:16,677 INFO [train.py:1046] (1/4) Epoch 34, batch 3200, loss[loss=0.1658, simple_loss=0.2384, pruned_loss=0.0466, over 23163.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.238, pruned_loss=0.04086, over 4706441.53 frames. ], batch size: 105, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 07:42:16,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:42:20,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 07:42:23,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.88 vs. limit=15.0 2023-10-03 07:42:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:34,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:42:41,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1190066.6666666667, ans=0.0 2023-10-03 07:42:44,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 07:42:46,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:42:50,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 07:42:50,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:42:53,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:42:53,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:42:54,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:42:58,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-10-03 07:42:59,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 07:43:02,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 07:43:02,745 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-10-03 07:43:04,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1190200.0, ans=15.0 2023-10-03 07:43:05,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 07:43:08,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:43:12,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:12,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:43:14,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:14,317 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 07:43:14,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-03 07:43:17,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1190266.6666666667, ans=0.1 2023-10-03 07:43:18,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:43:18,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1190266.6666666667, ans=0.125 2023-10-03 07:43:19,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 07:43:19,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 07:43:21,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 07:43:21,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 07:43:22,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:43:24,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:43:24,746 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 07:43:25,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:43:25,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:43:26,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1190266.6666666667, ans=0.125 2023-10-03 07:43:27,253 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 07:43:30,667 INFO [train.py:1046] (1/4) Epoch 34, batch 3250, loss[loss=0.1603, simple_loss=0.2557, pruned_loss=0.0324, over 24372.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2386, pruned_loss=0.04066, over 4719510.12 frames. ], batch size: 74, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:43:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:43:33,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:43:44,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:43:44,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 07:43:45,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.836e+02 2.055e+02 2.276e+02 3.582e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 07:43:45,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:43:46,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:46,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 07:43:48,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:43:49,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=26.62 vs. limit=22.5 2023-10-03 07:43:49,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:43:49,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1190400.0, ans=0.0 2023-10-03 07:43:52,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:52,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:43:52,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:52,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:52,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:53,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:43:55,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:43:56,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:43:59,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:59,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:44:00,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:44:00,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:44:00,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:44:04,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 07:44:06,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:44:06,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:44:08,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:09,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:44:15,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 07:44:15,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1190533.3333333333, ans=0.2 2023-10-03 07:44:22,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:44:22,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:22,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 07:44:22,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:44:22,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:44:22,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:25,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 07:44:25,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1190533.3333333333, ans=0.125 2023-10-03 07:44:26,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 07:44:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 07:44:26,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1190533.3333333333, ans=0.125 2023-10-03 07:44:28,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1190600.0, ans=0.125 2023-10-03 07:44:29,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:29,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:44:29,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 07:44:30,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:44:31,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1190600.0, ans=0.2 2023-10-03 07:44:34,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:44:34,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:44:36,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 07:44:36,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:44:40,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:44:40,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 07:44:43,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:44:43,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 07:44:45,003 INFO [train.py:1046] (1/4) Epoch 34, batch 3300, loss[loss=0.1739, simple_loss=0.2626, pruned_loss=0.04259, over 24447.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2388, pruned_loss=0.04057, over 4712201.68 frames. ], batch size: 69, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:44:45,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 07:44:45,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 07:44:47,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:50,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:44:51,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:44:51,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:44:54,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:44:54,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:44:56,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:44:58,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:45:02,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 07:45:02,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:03,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:05,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:06,682 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 07:45:06,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:08,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:45:08,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 07:45:08,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:08,310 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 07:45:13,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:45:13,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:45:15,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:15,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 07:45:16,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 07:45:16,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:17,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1190800.0, ans=0.125 2023-10-03 07:45:18,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 07:45:18,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1190800.0, ans=0.125 2023-10-03 07:45:19,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1190800.0, ans=0.125 2023-10-03 07:45:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 07:45:21,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 07:45:21,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1190800.0, ans=0.09899494936611666 2023-10-03 07:45:21,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1190800.0, ans=0.125 2023-10-03 07:45:22,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:45:23,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 07:45:26,708 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:45:27,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:45:29,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:45:30,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 07:45:32,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:33,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:33,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:45:33,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:45:36,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:45:36,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:36,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:45:38,322 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 07:45:40,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 07:45:41,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:45:41,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:45:41,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 07:45:44,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:44,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:46,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:45:46,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:46,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:45:47,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:49,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:45:50,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 07:45:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:53,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:54,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:45:54,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 07:45:56,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:57,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:57,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:58,668 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.85 vs. limit=15.0 2023-10-03 07:45:59,062 INFO [train.py:1046] (1/4) Epoch 34, batch 3350, loss[loss=0.1558, simple_loss=0.2464, pruned_loss=0.0326, over 24665.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2399, pruned_loss=0.04088, over 4716514.99 frames. ], batch size: 73, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:45:59,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:46:00,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:00,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:46:03,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:06,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1191000.0, ans=0.125 2023-10-03 07:46:07,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 07:46:08,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:46:09,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:46:09,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 07:46:11,437 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 07:46:11,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:46:14,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 07:46:14,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 07:46:16,358 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.949e+02 2.135e+02 2.589e+02 3.898e+02, threshold=4.270e+02, percent-clipped=0.0 2023-10-03 07:46:16,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:46:16,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:46:16,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1191066.6666666667, ans=0.125 2023-10-03 07:46:17,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:46:17,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 07:46:17,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:17,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:46:19,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:20,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:20,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:22,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:46:24,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:25,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.48 vs. limit=22.5 2023-10-03 07:46:27,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:28,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:46:30,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:46:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:34,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:34,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:37,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:37,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1191133.3333333333, ans=0.125 2023-10-03 07:46:38,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.75 vs. limit=15.0 2023-10-03 07:46:39,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 07:46:39,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:46:39,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 07:46:40,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 07:46:41,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 07:46:43,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:44,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:44,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1191200.0, ans=0.0 2023-10-03 07:46:52,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:53,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 07:46:54,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:46:56,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:46:57,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:47:01,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:47:03,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 07:47:05,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 07:47:05,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:47:06,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:08,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 07:47:08,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:47:08,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 07:47:11,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:47:12,409 INFO [train.py:1046] (1/4) Epoch 34, batch 3400, loss[loss=0.1522, simple_loss=0.2437, pruned_loss=0.03032, over 24571.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.04079, over 4728516.36 frames. ], batch size: 71, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:47:12,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:47:13,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:47:13,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:47:15,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. 
Duration: 0.69 2023-10-03 07:47:18,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 07:47:18,609 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 07:47:18,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:23,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:47:23,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:47:24,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:47:24,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:47:28,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:47:30,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 07:47:33,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.16 vs. limit=22.5 2023-10-03 07:47:35,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:47:39,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:47:39,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:40,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:47:40,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=1191466.6666666667, ans=0.95 2023-10-03 07:47:42,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1191466.6666666667, ans=0.125 2023-10-03 07:47:44,099 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.66 vs. limit=22.5 2023-10-03 07:47:44,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:47:49,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 07:47:56,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:58,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:58,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 07:47:58,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 07:47:58,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:59,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:48:00,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:48:01,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1191533.3333333333, ans=0.125 2023-10-03 07:48:02,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:48:02,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1191533.3333333333, ans=0.125 2023-10-03 07:48:05,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:48:05,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:48:08,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1191533.3333333333, ans=0.0 2023-10-03 07:48:12,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:48:14,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 07:48:18,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 07:48:22,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 07:48:26,335 INFO [train.py:1046] (1/4) Epoch 34, batch 3450, loss[loss=0.1716, simple_loss=0.2254, pruned_loss=0.05888, over 19537.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2397, pruned_loss=0.04062, over 4724800.14 frames. ], batch size: 388, lr: 2.99e-03, grad_scale: 4.0 2023-10-03 07:48:26,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 07:48:26,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1191666.6666666667, ans=0.0 2023-10-03 07:48:27,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:48:29,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:48:29,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 07:48:29,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:48:33,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:48:33,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1191666.6666666667, ans=0.07 2023-10-03 07:48:37,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 07:48:39,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:48:39,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:48:39,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:48:42,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:48:44,088 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.945e+02 2.174e+02 2.507e+02 5.518e+02, threshold=4.348e+02, percent-clipped=2.0 2023-10-03 07:48:47,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1191733.3333333333, ans=0.1 2023-10-03 07:48:48,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 07:48:51,778 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.30 vs. limit=15.0 2023-10-03 07:48:54,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 07:48:55,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:48:55,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 07:48:57,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:01,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 07:49:02,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:49:02,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1191800.0, ans=0.0 2023-10-03 07:49:03,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1191800.0, ans=0.1 2023-10-03 07:49:06,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:49:06,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:49:08,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:49:10,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:49:11,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 07:49:11,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:49:11,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:49:13,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:49:15,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 07:49:20,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:49:23,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:49:24,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:27,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:33,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:33,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:49:34,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:49:35,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:49:38,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.87 vs. limit=8.0 2023-10-03 07:49:39,504 INFO [train.py:1046] (1/4) Epoch 34, batch 3500, loss[loss=0.1409, simple_loss=0.2201, pruned_loss=0.03084, over 24339.00 frames. 
], tot_loss[loss=0.1593, simple_loss=0.2379, pruned_loss=0.04033, over 4706034.80 frames. ], batch size: 56, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:49:41,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:42,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:49:44,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 07:49:45,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:49:48,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 07:49:51,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:51,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 07:49:56,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:49:58,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:49:58,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:49:58,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 07:49:59,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:49:59,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:00,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:50:00,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 07:50:02,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:03,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:50:04,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:50:10,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:12,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 07:50:12,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:50:14,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 07:50:14,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1192133.3333333333, ans=0.1 2023-10-03 07:50:15,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:50:15,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1192133.3333333333, ans=0.125 2023-10-03 07:50:16,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:18,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:50:19,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:50:20,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 07:50:22,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 07:50:24,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 07:50:24,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:50:25,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:27,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 07:50:27,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:50:30,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:50:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:50:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:50:38,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 07:50:38,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 07:50:38,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:50:39,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:50:39,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:50:41,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:43,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1192266.6666666667, ans=0.09899494936611666 2023-10-03 07:50:45,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. 
Duration: 0.96 2023-10-03 07:50:46,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:50:48,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:50:49,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 07:50:50,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 07:50:52,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:53,617 INFO [train.py:1046] (1/4) Epoch 34, batch 3550, loss[loss=0.1602, simple_loss=0.2495, pruned_loss=0.03549, over 24304.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2379, pruned_loss=0.04009, over 4713936.09 frames. ], batch size: 74, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:50:53,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:50:53,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:50:55,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:50:55,831 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:50:57,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 07:50:57,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1192333.3333333333, ans=0.125 2023-10-03 07:51:05,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:07,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 07:51:10,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1192400.0, ans=0.125 2023-10-03 07:51:11,509 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.871e+02 2.039e+02 2.227e+02 3.484e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 07:51:11,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:51:11,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:51:11,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1192400.0, ans=0.125 2023-10-03 07:51:13,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:13,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:51:13,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:51:17,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 07:51:17,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:51:17,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:19,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:51:20,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:51:25,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:51:25,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:51:27,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:51:27,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:28,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:51:28,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 07:51:28,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:30,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:51:31,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:51:35,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:51:35,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:51:37,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:51:38,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 07:51:40,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:51:40,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 07:51:40,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:51:41,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-03 07:51:42,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.00 vs. limit=12.0 2023-10-03 07:51:43,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 07:51:43,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:51:44,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1192533.3333333333, ans=15.0 2023-10-03 07:51:46,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 07:51:47,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:51:52,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:51:53,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 07:51:54,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:51:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 07:51:59,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-10-03 07:52:02,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1192600.0, ans=0.07 2023-10-03 07:52:04,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. 
Duration: 0.95 2023-10-03 07:52:05,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:52:06,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:52:08,112 INFO [train.py:1046] (1/4) Epoch 34, batch 3600, loss[loss=0.1602, simple_loss=0.2482, pruned_loss=0.03608, over 24468.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2372, pruned_loss=0.0401, over 4704587.71 frames. ], batch size: 66, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:52:08,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:52:08,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:52:09,624 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.54 vs. limit=10.0 2023-10-03 07:52:11,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:52:14,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:52:14,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1192666.6666666667, ans=0.125 2023-10-03 07:52:16,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1192666.6666666667, ans=0.125 2023-10-03 07:52:17,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:18,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:52:18,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:52:19,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:19,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 07:52:23,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:52:25,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:28,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:52:30,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 07:52:31,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:52:32,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:52:32,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 07:52:32,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:52:36,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:38,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:52:38,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-10-03 07:52:39,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:52:40,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:52:42,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:52:42,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-10-03 07:52:50,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:52:51,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:52:51,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 07:52:57,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:53:02,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:05,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:07,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1192933.3333333333, ans=0.125 2023-10-03 07:53:09,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:53:09,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:53:09,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 07:53:11,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 07:53:13,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. 
Duration: 0.97 2023-10-03 07:53:15,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:53:15,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:53:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 07:53:17,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:53:17,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:53:17,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:53:18,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 07:53:19,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1192933.3333333333, ans=0.125 2023-10-03 07:53:19,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1192933.3333333333, ans=0.1 2023-10-03 07:53:20,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 07:53:21,543 INFO [train.py:1046] (1/4) Epoch 34, batch 3650, loss[loss=0.1725, simple_loss=0.2602, pruned_loss=0.04237, over 23935.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2381, pruned_loss=0.04026, over 4706818.98 frames. 
], batch size: 80, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:53:22,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:23,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1193000.0, ans=0.025 2023-10-03 07:53:24,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 07:53:25,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.02 vs. limit=22.5 2023-10-03 07:53:29,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 07:53:30,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:53:30,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1193000.0, ans=0.1 2023-10-03 07:53:35,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 07:53:36,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 07:53:39,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.892e+02 2.047e+02 2.269e+02 3.053e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 07:53:39,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:53:39,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:53:39,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:53:42,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:53:42,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:53:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 07:53:44,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:53:44,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:53:45,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 07:53:46,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:53:47,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:53:47,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:53:51,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 07:53:53,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 07:53:54,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 07:53:55,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:53:58,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 07:53:59,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:53:59,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:54:06,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:54:06,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:54:06,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:54:08,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1193200.0, ans=0.0 2023-10-03 07:54:09,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:54:10,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 07:54:12,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:54:14,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:54:15,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:15,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:54:18,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:54:19,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:54:19,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:54:25,733 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 07:54:29,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:54:29,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:54:30,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.05 vs. limit=15.0 2023-10-03 07:54:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 07:54:31,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1193266.6666666667, ans=0.5 2023-10-03 07:54:32,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:34,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:54:35,822 INFO [train.py:1046] (1/4) Epoch 34, batch 3700, loss[loss=0.1542, simple_loss=0.2344, pruned_loss=0.037, over 24587.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2393, pruned_loss=0.04084, over 4712640.34 frames. ], batch size: 60, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:54:35,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:37,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 07:54:37,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:39,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:54:40,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:54:42,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:54:43,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:54:43,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 07:54:44,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 07:54:46,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:54:47,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1193333.3333333333, ans=0.1 2023-10-03 07:54:49,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:54:52,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:54:52,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:54:54,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:54:55,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:55,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 07:54:55,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1193400.0, ans=0.0 2023-10-03 07:54:56,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:54:57,166 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:54:58,304 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 07:55:05,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:55:05,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1193466.6666666667, ans=10.0 2023-10-03 07:55:07,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:55:08,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:55:10,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 07:55:10,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:55:13,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:14,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-10-03 07:55:15,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:17,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:55:18,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:19,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:55:21,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 07:55:26,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:55:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 07:55:27,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:55:27,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 07:55:32,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:55:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:55:35,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 07:55:35,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1193600.0, ans=0.0 2023-10-03 07:55:36,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 07:55:36,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1193600.0, ans=0.0 2023-10-03 07:55:38,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:55:38,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:55:39,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:55:39,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:55:43,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:55:43,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 07:55:45,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 07:55:45,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:55:45,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:55:47,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:55:48,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:55:50,548 INFO [train.py:1046] (1/4) Epoch 34, batch 3750, loss[loss=0.1369, simple_loss=0.2106, pruned_loss=0.03155, over 24442.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2406, pruned_loss=0.04132, over 4716925.53 frames. ], batch size: 58, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:55:50,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:52,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:55:52,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1193666.6666666667, ans=0.1 2023-10-03 07:55:53,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:55:55,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 07:55:56,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 07:55:58,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:55:58,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. 
Duration: 0.78 2023-10-03 07:55:58,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:55:59,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:56:00,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:56:02,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:56:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:56:05,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:56:07,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:56:07,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1193733.3333333333, ans=0.0 2023-10-03 07:56:08,416 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.948e+02 2.234e+02 2.708e+02 3.464e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-03 07:56:08,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1193733.3333333333, ans=0.2 2023-10-03 07:56:10,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:56:12,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 07:56:13,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 07:56:14,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:56:16,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:56:16,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:56:19,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 07:56:23,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 07:56:25,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:56:25,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:56:26,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:56:29,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:56:30,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:56:33,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. 
Duration: 0.98 2023-10-03 07:56:37,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:56:40,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:56:40,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1193866.6666666667, ans=0.125 2023-10-03 07:56:41,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:56:42,488 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-03 07:56:44,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:56:49,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:56:50,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:56:52,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:56:53,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:56:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 07:56:55,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-03 07:57:03,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:57:04,525 INFO [train.py:1046] (1/4) Epoch 34, batch 3800, loss[loss=0.179, simple_loss=0.2609, pruned_loss=0.0486, over 23959.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2414, pruned_loss=0.04187, over 4709914.85 frames. ], batch size: 86, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:57:06,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1194000.0, ans=0.125 2023-10-03 07:57:07,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:08,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:57:09,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 07:57:10,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:57:12,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:57:14,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:57:17,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-03 07:57:17,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:19,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:57:20,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:57:20,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:57:21,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:23,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 07:57:27,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 07:57:27,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:57:30,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:57:32,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:57:34,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 07:57:35,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 07:57:35,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:38,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:38,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1194133.3333333333, ans=0.1 2023-10-03 07:57:39,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:43,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1194133.3333333333, ans=0.125 2023-10-03 07:57:44,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:57:44,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 07:57:44,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1194133.3333333333, ans=0.2 2023-10-03 07:57:46,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:57:54,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:57:55,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1194200.0, ans=0.025 2023-10-03 07:57:59,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:58:02,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 07:58:03,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 07:58:05,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:06,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:58:06,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:08,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 07:58:12,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 07:58:12,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 07:58:12,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:13,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 07:58:17,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:58:19,192 INFO [train.py:1046] (1/4) Epoch 34, batch 3850, loss[loss=0.1522, simple_loss=0.2305, pruned_loss=0.03691, over 23591.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2392, pruned_loss=0.04158, over 4702698.10 frames. ], batch size: 149, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:58:19,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:58:24,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:58:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 07:58:27,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:58:27,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:31,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:58:33,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:33,278 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:58:35,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 07:58:36,898 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.877e+02 2.078e+02 2.275e+02 4.210e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 07:58:36,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 07:58:43,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:44,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1194400.0, ans=0.125 2023-10-03 07:58:45,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:48,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:58:48,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:58:51,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:51,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:58:53,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:53,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 07:58:53,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1194466.6666666667, ans=0.025 2023-10-03 07:58:53,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1194466.6666666667, ans=0.125 2023-10-03 07:58:53,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1194466.6666666667, ans=0.1 2023-10-03 07:58:55,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:58:57,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:58:57,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1194466.6666666667, ans=0.2 2023-10-03 07:58:57,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1194466.6666666667, ans=0.0 2023-10-03 07:58:58,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:58,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:58:59,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 07:58:59,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. 
Duration: 0.91 2023-10-03 07:59:01,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:59:01,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:04,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:04,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 07:59:04,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.68 vs. limit=12.0 2023-10-03 07:59:05,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1194533.3333333333, ans=0.125 2023-10-03 07:59:06,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 07:59:08,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:11,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 07:59:12,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:59:16,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 07:59:18,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:21,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:21,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 07:59:26,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 07:59:27,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:27,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:29,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:59:29,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:59:31,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:31,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:31,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:59:32,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. 
Duration: 0.97 2023-10-03 07:59:32,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:59:34,197 INFO [train.py:1046] (1/4) Epoch 34, batch 3900, loss[loss=0.148, simple_loss=0.2194, pruned_loss=0.0383, over 23616.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2381, pruned_loss=0.04163, over 4682434.95 frames. ], batch size: 256, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:59:34,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 07:59:35,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:35,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:37,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:59:37,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:39,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:59:39,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:39,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:41,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 07:59:41,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 07:59:41,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:45,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:59:45,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:59:46,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:59:48,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:59:50,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:59:51,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:51,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1194733.3333333333, ans=0.125 2023-10-03 07:59:53,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:59:54,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. 
Duration: 0.81 2023-10-03 07:59:54,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:59:56,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 07:59:56,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:57,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 07:59:59,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 08:00:02,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1194800.0, ans=0.125 2023-10-03 08:00:03,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:00:03,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:00:03,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:00:03,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:09,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:00:10,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 08:00:13,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:00:13,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:00:13,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:00:18,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:00:20,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:00:24,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1194866.6666666667, ans=0.0 2023-10-03 08:00:26,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.33 vs. limit=22.5 2023-10-03 08:00:27,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:00:28,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:00:31,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1194866.6666666667, ans=0.125 2023-10-03 08:00:37,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:00:39,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:40,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 08:00:40,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 08:00:40,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:42,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 08:00:42,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1194933.3333333333, ans=0.04949747468305833 2023-10-03 08:00:43,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:00:43,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 08:00:44,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1194933.3333333333, ans=0.125 2023-10-03 08:00:47,991 INFO [train.py:1046] (1/4) Epoch 34, batch 3950, loss[loss=0.1604, simple_loss=0.2388, pruned_loss=0.04105, over 23772.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.238, pruned_loss=0.04125, over 4687860.14 frames. ], batch size: 179, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:00:50,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:00:52,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 08:00:52,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:00:54,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:00:57,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:01:01,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1195000.0, ans=0.125 2023-10-03 08:01:02,405 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 08:01:03,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:01:03,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 08:01:05,104 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 08:01:05,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:01:06,449 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.876e+02 2.006e+02 2.250e+02 3.004e+02, threshold=4.013e+02, percent-clipped=0.0 2023-10-03 08:01:06,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 08:01:06,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:01:06,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:01:09,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 08:01:10,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:01:10,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:01:10,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:01:12,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:01:12,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:01:24,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:01:24,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:01:30,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 08:01:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. 
Duration: 0.99 2023-10-03 08:01:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 08:01:36,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:01:37,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:01:40,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1195200.0, ans=0.125 2023-10-03 08:01:42,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=15.0 2023-10-03 08:01:45,179 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.88 vs. limit=15.0 2023-10-03 08:01:45,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:01:45,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:01:46,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:01:47,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:01:47,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-10-03 08:01:52,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:01:53,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:01:57,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 08:02:02,688 INFO [train.py:1046] (1/4) Epoch 34, batch 4000, loss[loss=0.1588, simple_loss=0.2363, pruned_loss=0.04061, over 22934.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2388, pruned_loss=0.0414, over 4702417.71 frames. ], batch size: 322, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:02:06,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:12,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:17,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:02:18,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:02:18,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:20,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 08:02:20,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 08:02:20,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 08:02:20,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:02:20,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 08:02:22,664 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.47 vs. limit=15.0 2023-10-03 08:02:23,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:02:26,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:02:26,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:02:26,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:02:26,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:02:26,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:02:28,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 08:02:28,469 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:02:29,453 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 08:02:29,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:02:29,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:32,796 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 08:02:34,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:02:34,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:02:40,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 08:02:40,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:02:42,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:02:43,487 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 08:02:43,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:02:44,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-10-03 08:02:44,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:02:44,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:46,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:02:48,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:02:48,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:02:49,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:02:51,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 08:02:51,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:52,561 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 08:02:59,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:02:59,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1195533.3333333333, ans=0.125 2023-10-03 08:03:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-10-03 08:03:03,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:03:03,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:03:05,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:03:06,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:09,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.61 vs. limit=10.0 2023-10-03 08:03:12,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:03:13,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:03:14,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 08:03:16,220 INFO [train.py:1046] (1/4) Epoch 34, batch 4050, loss[loss=0.1636, simple_loss=0.2558, pruned_loss=0.03576, over 24340.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2396, pruned_loss=0.04099, over 4708655.45 frames. ], batch size: 74, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:03:16,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:03:16,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 08:03:18,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:03:19,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:03:20,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:03:25,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:03:29,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:03:29,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 08:03:31,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:03:32,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:03:34,567 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.804e+02 1.973e+02 2.142e+02 3.125e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-03 08:03:37,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:40,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 08:03:40,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1195733.3333333333, ans=0.2 2023-10-03 08:03:41,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 08:03:44,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 08:03:44,266 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 08:03:45,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:03:51,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 08:03:51,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:03:56,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:03:58,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:58,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:03:58,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:04:03,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 08:04:08,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 08:04:08,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:04:10,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:04:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 08:04:15,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:04:21,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 08:04:23,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:04:23,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:04:24,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 08:04:24,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 08:04:24,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:27,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 08:04:27,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:27,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:04:30,030 INFO [train.py:1046] (1/4) Epoch 34, batch 4100, loss[loss=0.2388, simple_loss=0.2985, pruned_loss=0.08956, over 19663.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2408, pruned_loss=0.04119, over 4716829.60 frames. ], batch size: 388, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:04:35,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 08:04:37,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 08:04:39,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 08:04:39,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 08:04:39,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:40,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:40,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:40,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 08:04:40,995 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 08:04:45,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:04:46,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:04:46,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:46,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1196066.6666666667, ans=0.2 2023-10-03 08:04:47,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:04:52,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:04:52,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:04:52,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:04:52,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 08:04:54,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:54,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 08:04:54,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:04:54,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:04:55,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 08:04:58,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:04:59,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 08:05:02,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:05:05,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:05:05,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 08:05:05,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1196133.3333333333, ans=0.125 2023-10-03 08:05:06,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:05:06,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:05:06,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 08:05:07,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1196133.3333333333, ans=0.125 2023-10-03 08:05:10,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 08:05:10,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:05:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:05:13,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 08:05:14,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:05:14,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:05:17,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:05:23,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:25,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:05:26,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 08:05:27,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1196266.6666666667, ans=0.1 2023-10-03 08:05:31,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:05:31,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:05:33,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1196266.6666666667, ans=0.1 2023-10-03 08:05:36,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:05:39,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:05:43,637 INFO [train.py:1046] (1/4) Epoch 34, batch 4150, loss[loss=0.15, simple_loss=0.2308, pruned_loss=0.03461, over 24450.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2402, pruned_loss=0.04112, over 4699512.93 frames. ], batch size: 58, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:05:43,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:05:43,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:05:45,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:05:45,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 08:05:45,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1196333.3333333333, ans=0.125 2023-10-03 08:05:47,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 08:05:49,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:49,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 08:05:49,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 08:05:49,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 08:05:50,118 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.47 vs. limit=22.5 2023-10-03 08:05:52,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:56,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:05:56,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:01,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:06:02,270 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.887e+02 2.039e+02 2.346e+02 3.122e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 08:06:02,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:06:02,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:06:05,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:06:05,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:06:05,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:06:10,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:13,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:06:13,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 08:06:16,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 08:06:16,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:06:16,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. 
Duration: 0.9 2023-10-03 08:06:16,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:06:17,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:06:20,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:22,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:23,558 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:06:25,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.79 vs. limit=22.5 2023-10-03 08:06:26,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 08:06:28,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:06:29,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:06:30,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 08:06:30,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:06:31,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-10-03 08:06:32,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1196533.3333333333, ans=0.125 2023-10-03 08:06:33,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:06:34,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-10-03 08:06:36,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:06:36,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:38,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 08:06:38,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:06:38,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:06:39,572 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.02 vs. limit=22.5 2023-10-03 08:06:40,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:06:42,068 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.52 vs. 
limit=6.0 2023-10-03 08:06:44,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 08:06:44,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:44,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:06:44,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:06:45,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 08:06:45,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:47,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 08:06:47,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:48,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:49,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 08:06:49,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 08:06:49,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1196600.0, ans=0.0 2023-10-03 08:06:54,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:06:56,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 08:06:57,320 INFO [train.py:1046] (1/4) Epoch 34, batch 4200, loss[loss=0.147, simple_loss=0.2348, pruned_loss=0.02959, over 24310.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2388, pruned_loss=0.04091, over 4698211.87 frames. ], batch size: 74, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:06:58,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:06:59,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1196666.6666666667, ans=0.125 2023-10-03 08:07:00,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:07:01,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:07:01,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:07:01,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:07:02,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-10-03 08:07:06,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 08:07:06,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:08,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:07:12,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:07:16,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:07:16,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:07:17,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:17,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 08:07:17,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:07:19,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:19,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:07:20,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 08:07:22,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:07:25,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 08:07:25,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:28,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:07:29,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:07:30,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:07:32,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:07:35,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:07:37,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 08:07:37,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:07:38,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 08:07:44,043 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.20 vs. limit=15.0 2023-10-03 08:07:45,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:07:47,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:07:51,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:07:56,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 08:07:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:08:01,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1196933.3333333333, ans=0.125 2023-10-03 08:08:03,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:08:03,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:03,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 08:08:09,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:08:11,121 INFO [train.py:1046] (1/4) Epoch 34, batch 4250, loss[loss=0.1457, simple_loss=0.1957, pruned_loss=0.04784, over 18737.00 frames. 
], tot_loss[loss=0.1594, simple_loss=0.2376, pruned_loss=0.04058, over 4697466.73 frames. ], batch size: 389, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:08:13,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:08:13,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:08:14,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.43 vs. limit=6.0 2023-10-03 08:08:15,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:08:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 08:08:22,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:08:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:08:30,136 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.924e+02 2.069e+02 2.506e+02 3.818e+02, threshold=4.139e+02, percent-clipped=0.0 2023-10-03 08:08:33,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:08:33,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:36,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:08:36,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:08:37,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:38,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:40,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:42,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:08:43,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:08:45,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 08:08:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 08:08:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:48,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:08:49,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:51,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:08:51,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:51,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:53,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:08:54,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1197200.0, ans=0.125 2023-10-03 08:08:55,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:08:58,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:09:00,137 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:09:01,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:01,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 08:09:01,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 08:09:01,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 08:09:02,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:09:04,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:09:05,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:09:06,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:09:08,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 08:09:10,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:09:10,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:09:13,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:09:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:18,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:09:19,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:09:20,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:09:22,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:09:23,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:09:23,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 08:09:23,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:09:25,712 INFO [train.py:1046] (1/4) Epoch 34, batch 4300, loss[loss=0.1714, simple_loss=0.2467, pruned_loss=0.04807, over 23345.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2376, pruned_loss=0.04069, over 4686391.81 frames. ], batch size: 119, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:09:29,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:09:29,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:09:31,496 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:09:34,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:09:34,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1197333.3333333333, ans=0.0 2023-10-03 08:09:36,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1197333.3333333333, ans=0.125 2023-10-03 08:09:37,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1197333.3333333333, ans=0.125 2023-10-03 08:09:42,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:42,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 08:09:44,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:09:48,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:09:48,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:09:48,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 08:09:49,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:09:50,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:09:55,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. 
Duration: 0.97 2023-10-03 08:09:55,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:09:55,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 08:09:58,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:09:59,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:10:01,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:10:01,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:10:03,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:10:04,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:10:05,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:10:05,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 08:10:06,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 08:10:08,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:10:11,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:11,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:10:11,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:12,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:10:12,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 08:10:12,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 08:10:12,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 08:10:13,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1197533.3333333333, ans=0.0 2023-10-03 08:10:14,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:10:14,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 08:10:14,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 08:10:18,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:10:18,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1197533.3333333333, ans=0.0 2023-10-03 08:10:20,159 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 08:10:21,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:10:23,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1197600.0, ans=0.125 2023-10-03 08:10:23,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.21 vs. limit=15.0 2023-10-03 08:10:24,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:24,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:10:25,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 08:10:25,763 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:10:28,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:10:28,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:28,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:10:30,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:10:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:10:31,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1197600.0, ans=0.125 2023-10-03 08:10:32,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:10:34,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:35,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:37,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:10:38,354 INFO [train.py:1046] (1/4) Epoch 34, batch 4350, loss[loss=0.1663, simple_loss=0.2393, pruned_loss=0.04661, over 23492.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2381, pruned_loss=0.04103, over 4692651.99 frames. ], batch size: 285, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:10:39,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1197666.6666666667, ans=0.125 2023-10-03 08:10:43,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 08:10:43,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 08:10:47,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:10:49,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:10:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:10:56,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:10:57,704 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.863e+02 2.001e+02 2.275e+02 3.129e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-03 08:11:00,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:11:03,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:11:03,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:11:06,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:11:09,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 08:11:11,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:11:15,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 08:11:16,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:11:16,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:16,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1197800.0, ans=0.125 2023-10-03 08:11:24,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:25,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 08:11:28,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:11:28,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:11:33,563 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 08:11:35,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:11:35,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 08:11:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 08:11:37,833 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 08:11:37,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:11:37,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:11:37,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:11:39,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:11:39,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:11:39,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:11:43,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 08:11:43,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:43,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:11:43,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:11:45,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 08:11:45,397 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 08:11:45,402 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 08:11:45,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 08:11:48,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:11:50,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:11:50,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:11:51,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:11:52,796 INFO [train.py:1046] (1/4) Epoch 34, batch 4400, loss[loss=0.1793, simple_loss=0.2517, pruned_loss=0.05352, over 22556.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.239, pruned_loss=0.04078, over 4714550.06 frames. ], batch size: 322, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:11:52,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 08:11:56,201 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 08:11:56,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:11:56,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1198000.0, ans=0.1 2023-10-03 08:11:57,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1198000.0, ans=0.1 2023-10-03 08:12:00,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:12:02,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:04,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:12:05,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 08:12:05,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 08:12:05,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1198000.0, ans=0.125 2023-10-03 08:12:06,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 08:12:06,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 08:12:06,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1198066.6666666667, ans=0.125 2023-10-03 08:12:06,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.16 vs. 
limit=12.0 2023-10-03 08:12:07,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:12:07,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:12:10,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 08:12:11,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:13,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:13,320 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 08:12:14,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:14,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 08:12:15,499 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 08:12:18,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 08:12:19,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 08:12:19,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. 
Duration: 0.96 2023-10-03 08:12:21,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:21,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:12:21,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:12:22,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:12:24,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 08:12:24,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 08:12:25,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:26,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:12:26,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:28,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:28,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:28,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. 
Duration: 0.86 2023-10-03 08:12:30,199 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 08:12:30,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1198133.3333333333, ans=0.125 2023-10-03 08:12:33,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:40,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:12:41,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 08:12:46,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:12:48,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:12:51,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:12:51,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 08:12:51,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:12:53,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:12:53,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 08:12:53,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:12:55,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1198266.6666666667, ans=0.2 2023-10-03 08:12:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 08:12:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 08:13:00,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1198266.6666666667, ans=0.0 2023-10-03 08:13:01,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 08:13:01,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:02,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 08:13:04,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:13:07,171 INFO [train.py:1046] (1/4) Epoch 34, batch 4450, loss[loss=0.1733, simple_loss=0.251, pruned_loss=0.04782, over 23965.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2401, pruned_loss=0.04095, over 4710910.06 frames. ], batch size: 80, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:13:07,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 08:13:08,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 08:13:11,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1198333.3333333333, ans=0.1 2023-10-03 08:13:12,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:13:14,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:15,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:13:22,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:13:22,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:13:26,260 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.851e+02 2.013e+02 2.365e+02 4.076e+02, threshold=4.026e+02, percent-clipped=1.0 2023-10-03 08:13:26,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:27,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:13:30,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 08:13:30,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:32,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 08:13:32,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:13:32,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1198400.0, ans=0.0 2023-10-03 08:13:34,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:34,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:13:34,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:13:37,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:13:41,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:13:41,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:13:44,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:13:44,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:13:45,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:13:50,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 08:13:51,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 08:13:51,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 08:13:51,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:13:51,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1198533.3333333333, ans=0.125 2023-10-03 08:13:56,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:13:56,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 08:13:59,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:14:02,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:14:03,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 08:14:03,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:14:03,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:14:03,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:14:03,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:14:05,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:14:09,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:14:09,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 08:14:11,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:14:12,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:14:13,376 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.70 vs. limit=10.0 2023-10-03 08:14:15,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:14:15,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:14:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:14:18,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:14:20,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 08:14:21,767 INFO [train.py:1046] (1/4) Epoch 34, batch 4500, loss[loss=0.1612, simple_loss=0.2429, pruned_loss=0.03973, over 24475.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2405, pruned_loss=0.04117, over 4708509.92 frames. ], batch size: 63, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:14:21,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:14:25,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:14:27,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 08:14:27,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 08:14:27,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 08:14:33,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1198666.6666666667, ans=0.125 2023-10-03 08:14:34,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1198666.6666666667, ans=0.125 2023-10-03 08:14:35,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:35,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:14:37,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:14:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:14:37,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:14:37,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:14:48,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:14:48,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:14:50,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 08:14:51,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:14:53,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:15:00,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:15:01,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1198800.0, ans=0.0 2023-10-03 08:15:03,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1198800.0, ans=0.1 2023-10-03 08:15:05,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:15:08,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:15:08,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1198866.6666666667, ans=0.125 2023-10-03 08:15:09,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:15:10,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.27 vs. limit=22.5 2023-10-03 08:15:11,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. 
Duration: 0.98 2023-10-03 08:15:11,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:11,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:12,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:14,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:15:14,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1198866.6666666667, ans=0.125 2023-10-03 08:15:17,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:15:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 08:15:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:15:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:20,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:15:21,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 08:15:24,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:26,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:15:26,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:15:28,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 08:15:30,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 08:15:30,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 08:15:33,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 08:15:37,045 INFO [train.py:1046] (1/4) Epoch 34, batch 4550, loss[loss=0.1556, simple_loss=0.2209, pruned_loss=0.04516, over 23517.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2388, pruned_loss=0.04105, over 4703364.46 frames. ], batch size: 256, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:15:37,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 08:15:38,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:15:41,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 08:15:41,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:15:44,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:15:48,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:15:50,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:51,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:15:51,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:15:51,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:52,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1199066.6666666667, ans=0.1 2023-10-03 08:15:54,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:15:54,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 08:15:54,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1199066.6666666667, ans=0.125 2023-10-03 08:15:57,243 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.900e+02 2.073e+02 2.296e+02 3.311e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 08:15:58,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:00,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 08:16:00,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 08:16:01,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:16:03,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 08:16:06,366 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.78 vs. limit=15.0 2023-10-03 08:16:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 08:16:08,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:16:11,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 08:16:12,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 08:16:15,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:15,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:15,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:16:18,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 08:16:19,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:16:24,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:24,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:16:25,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:16:25,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 08:16:26,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1199200.0, ans=0.125 2023-10-03 08:16:27,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 08:16:27,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 08:16:27,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 08:16:30,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 08:16:30,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:16:31,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:16:31,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:32,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:34,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:16:36,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:16:36,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 08:16:37,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:16:37,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 08:16:38,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. 
Duration: 0.96 2023-10-03 08:16:38,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:16:38,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 08:16:42,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:16:42,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:16:46,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:16:46,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:46,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:16:47,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:16:50,174 INFO [train.py:1046] (1/4) Epoch 34, batch 4600, loss[loss=0.1538, simple_loss=0.2456, pruned_loss=0.03103, over 24569.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2379, pruned_loss=0.04035, over 4706378.69 frames. ], batch size: 71, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:16:50,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:16:52,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:16:54,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:57,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:16:57,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:16:59,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:00,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 08:17:01,194 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.50 vs. limit=12.0 2023-10-03 08:17:01,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:17:04,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:17:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:08,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:12,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 08:17:14,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:17:17,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:19,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:17:19,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:24,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 08:17:24,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:17:24,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1199466.6666666667, ans=0.1 2023-10-03 08:17:25,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:17:31,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:31,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:17:32,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:17:36,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 08:17:38,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. 
Duration: 0.636375 2023-10-03 08:17:42,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:44,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:17:45,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:45,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 08:17:46,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:47,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 08:17:47,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:17:49,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:51,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:51,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:17:52,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. 
Duration: 0.87 2023-10-03 08:17:52,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 08:17:53,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 08:17:53,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:17:53,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:17:55,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:17:56,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:18:03,970 INFO [train.py:1046] (1/4) Epoch 34, batch 4650, loss[loss=0.1543, simple_loss=0.2303, pruned_loss=0.03917, over 23498.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2371, pruned_loss=0.04054, over 4692689.85 frames. ], batch size: 134, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:18:06,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:18:10,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:18:10,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:18:12,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:18:12,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:18:12,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:18:13,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:18:17,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 08:18:20,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:18:21,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 08:18:21,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:18:23,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 08:18:23,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:18:23,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 08:18:24,436 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.454e+02 1.812e+02 2.012e+02 2.227e+02 3.293e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-03 08:18:24,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. 
Duration: 0.98 2023-10-03 08:18:24,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:24,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:18:28,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:18:30,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:30,117 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 08:18:33,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:34,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 08:18:36,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:36,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:18:37,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 08:18:39,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:18:42,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 08:18:45,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:18:46,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1199800.0, ans=0.125 2023-10-03 08:18:51,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:54,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:54,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:55,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:18:57,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 08:18:58,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 08:18:58,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 08:18:58,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 08:18:59,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:06,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 08:19:06,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:19:06,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 08:19:07,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:07,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:19:08,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:19:08,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:19:11,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:19:11,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:19:13,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:19:17,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:20,905 INFO [train.py:1046] (1/4) Epoch 34, batch 4700, loss[loss=0.1585, simple_loss=0.2542, pruned_loss=0.03138, over 24294.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2379, pruned_loss=0.04045, over 4712197.09 frames. 
], batch size: 74, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:19:20,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:19:20,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:19:21,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1200000.0, ans=0.1 2023-10-03 08:19:22,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 08:19:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:19:23,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 08:19:25,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1200000.0, ans=0.0 2023-10-03 08:19:30,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:30,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:31,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:19:32,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:19:33,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:19:39,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 08:19:39,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 08:19:40,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:44,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:19:44,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:19:47,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:54,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:19:54,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:19:57,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:19:59,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1200133.3333333333, ans=0.125 2023-10-03 08:20:00,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1200133.3333333333, ans=0.125 2023-10-03 08:20:04,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 08:20:06,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:20:07,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:10,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 08:20:11,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:20:16,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:20:17,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 08:20:19,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:19,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:22,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:20:22,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:20:22,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 08:20:24,004 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 08:20:25,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:28,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:28,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:28,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 08:20:28,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:32,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 08:20:32,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1200333.3333333333, ans=0.0 2023-10-03 08:20:33,485 INFO [train.py:1046] (1/4) Epoch 34, batch 4750, loss[loss=0.1559, simple_loss=0.2398, pruned_loss=0.03601, over 24486.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2387, pruned_loss=0.04068, over 4718258.93 frames. 
], batch size: 66, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:20:34,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:20:34,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:20:38,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1200333.3333333333, ans=0.2 2023-10-03 08:20:39,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:20:41,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:20:42,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 08:20:42,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:20:44,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 08:20:44,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1200333.3333333333, ans=0.125 2023-10-03 08:20:47,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:20:47,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 08:20:49,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:20:55,002 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.401e+02 1.941e+02 2.055e+02 2.370e+02 3.747e+02, threshold=4.109e+02, percent-clipped=0.0 2023-10-03 08:20:55,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 08:20:59,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:21:00,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 08:21:01,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1200400.0, ans=0.0 2023-10-03 08:21:02,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:21:04,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:21:04,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:21:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:21:05,065 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 08:21:05,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-10-03 08:21:09,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 08:21:14,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:21:16,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:18,534 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.36 vs. limit=15.0 2023-10-03 08:21:19,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:21:19,256 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 08:21:19,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:21:22,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:21:27,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:21:27,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 08:21:28,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 08:21:30,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:21:30,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:21:30,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:21:31,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:21:32,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 08:21:34,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 08:21:35,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:21:36,591 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.97 vs. limit=6.0 2023-10-03 08:21:38,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:21:38,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 08:21:38,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:21:40,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:21:43,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 08:21:43,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:43,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:21:46,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:21:47,623 INFO [train.py:1046] (1/4) Epoch 34, batch 4800, loss[loss=0.179, simple_loss=0.2561, pruned_loss=0.05101, over 22783.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2399, pruned_loss=0.04087, over 4729014.36 frames. ], batch size: 322, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:21:47,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 08:21:47,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 08:21:49,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 08:21:50,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:21:52,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:21:53,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 08:21:59,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:21:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:21:59,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1200666.6666666667, ans=0.0 2023-10-03 08:22:02,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1200733.3333333333, ans=0.04949747468305833 2023-10-03 08:22:05,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:22:06,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:06,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:07,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 08:22:07,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:22:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:22:11,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:22:15,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:22:16,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:22:18,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:18,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 08:22:18,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:19,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:20,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:22,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:26,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:26,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:22:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-03 08:22:28,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:30,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 08:22:30,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 08:22:31,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:31,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:22:33,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:22:33,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:22:33,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:22:33,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1200866.6666666667, ans=0.2 2023-10-03 08:22:35,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:22:35,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:22:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:22:40,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:41,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:22:46,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 08:22:46,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:47,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:47,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:22:49,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:52,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:22:54,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:22:54,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:22:54,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 08:22:55,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:22:58,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:22:58,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:58,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:23:01,329 INFO [train.py:1046] (1/4) Epoch 34, batch 4850, loss[loss=0.1628, simple_loss=0.2423, pruned_loss=0.04158, over 23540.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2399, pruned_loss=0.04108, over 4728136.96 frames. ], batch size: 134, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:23:01,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 08:23:02,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 08:23:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:02,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:03,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1201000.0, ans=0.125 2023-10-03 08:23:04,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 08:23:04,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:06,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:23:11,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1201000.0, ans=0.0 2023-10-03 08:23:11,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.12 vs. limit=15.0 2023-10-03 08:23:13,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 08:23:15,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:23:18,372 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:23:19,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:23:19,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:23:20,842 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.893e+02 2.140e+02 2.446e+02 3.787e+02, threshold=4.279e+02, percent-clipped=0.0 2023-10-03 08:23:20,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:24,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:23:26,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:23:26,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1201066.6666666667, ans=0.0 2023-10-03 08:23:28,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:23:28,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 08:23:30,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:23:32,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:23:32,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:23:33,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:23:33,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 08:23:35,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:23:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:37,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:23:38,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 08:23:38,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1201133.3333333333, ans=0.95 2023-10-03 08:23:39,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 08:23:39,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=19.28 vs. limit=15.0 2023-10-03 08:23:40,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:23:48,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:23:49,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 08:23:50,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:50,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:23:52,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1201200.0, ans=0.125 2023-10-03 08:23:54,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 08:23:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 08:23:56,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:56,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 08:23:57,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:57,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:23:59,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 08:24:06,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:24:10,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:24:11,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:15,140 INFO [train.py:1046] (1/4) Epoch 34, batch 4900, loss[loss=0.141, simple_loss=0.2197, pruned_loss=0.03115, over 24288.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2393, pruned_loss=0.041, over 4724843.75 frames. ], batch size: 56, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:24:17,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. 
Duration: 0.62 2023-10-03 08:24:17,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:24:22,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:24:24,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:24:24,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:24:27,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 08:24:31,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 08:24:35,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 08:24:37,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 08:24:37,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:24:37,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:24:38,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:24:38,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:24:38,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:24:38,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 08:24:41,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 08:24:42,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:24:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:24:44,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:24:47,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:24:48,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:24:50,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:24:50,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 08:24:51,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:24:53,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:24:53,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 08:24:53,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 08:24:57,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 08:24:58,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:24:58,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:25:00,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:25:00,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:01,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 08:25:01,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:25:01,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 08:25:03,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:25:04,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1201533.3333333333, ans=0.125 2023-10-03 08:25:05,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:25:07,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:25:09,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 08:25:11,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:25:12,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 08:25:12,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 08:25:12,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1201600.0, ans=0.125 2023-10-03 08:25:14,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1201600.0, ans=0.0 2023-10-03 08:25:18,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1201600.0, ans=0.5 2023-10-03 08:25:19,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:25:19,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:25:21,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 08:25:21,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:25:21,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:25:23,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:27,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:25:27,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:25:27,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:25:29,184 INFO [train.py:1046] (1/4) Epoch 34, batch 4950, loss[loss=0.1467, simple_loss=0.2211, pruned_loss=0.03618, over 23700.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2384, pruned_loss=0.04085, over 4707520.14 frames. ], batch size: 232, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:25:29,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 08:25:29,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 08:25:31,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.24 vs. limit=15.0 2023-10-03 08:25:32,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:25:33,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:25:36,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 08:25:37,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 08:25:37,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:25:39,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 08:25:39,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:39,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:25:39,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:25:39,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:25:41,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:25:43,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:25:43,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:25:44,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:25:47,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:47,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:25:50,570 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.889e+02 2.059e+02 2.301e+02 3.763e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-03 08:25:50,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:25:55,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.27 vs. limit=15.0 2023-10-03 08:25:56,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:57,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 08:25:59,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1201800.0, ans=0.125 2023-10-03 08:26:00,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:00,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:01,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:26:03,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 08:26:04,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 08:26:05,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:08,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:26:08,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:26:09,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:26:09,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:26:09,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 08:26:11,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:26:14,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:26:15,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:26:17,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:17,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:18,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 08:26:18,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:26:20,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:26:24,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:26:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:26:26,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:26:28,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:26:28,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:26:29,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:26:31,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:26:31,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:26:32,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:26:33,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1201933.3333333333, ans=0.0 2023-10-03 08:26:34,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 08:26:38,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:26:42,294 INFO [train.py:1046] (1/4) Epoch 34, batch 5000, loss[loss=0.1476, simple_loss=0.2303, pruned_loss=0.03244, over 24653.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2379, pruned_loss=0.04057, over 4717814.74 frames. ], batch size: 65, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:26:42,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 08:26:42,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 08:26:42,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1202000.0, ans=0.125 2023-10-03 08:26:48,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:48,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:26:50,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1202000.0, ans=15.0 2023-10-03 08:26:51,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 08:26:51,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 08:26:53,403 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-10-03 08:26:54,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:26:55,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 08:26:55,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:26:55,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:26:58,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. 
Duration: 0.94 2023-10-03 08:26:59,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:59,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:27:01,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 08:27:01,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:27:01,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:02,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 08:27:03,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 08:27:05,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:27:05,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 08:27:05,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:27:05,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:05,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 08:27:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 08:27:06,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 08:27:09,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 08:27:09,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:27:09,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:10,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 08:27:10,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:27:10,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:12,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:27:13,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 08:27:14,239 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.15 vs. limit=15.0 2023-10-03 08:27:16,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. 
Duration: 0.79 2023-10-03 08:27:16,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:27:18,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:27:22,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 08:27:27,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:27:28,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:28,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:32,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 08:27:32,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:27:32,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:32,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:27:34,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 08:27:34,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 08:27:38,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:27:39,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:27:44,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 08:27:45,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1202266.6666666667, ans=0.125 2023-10-03 08:27:46,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:57,231 INFO [train.py:1046] (1/4) Epoch 34, batch 5050, loss[loss=0.1439, simple_loss=0.2285, pruned_loss=0.02968, over 24618.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.238, pruned_loss=0.04073, over 4704022.13 frames. ], batch size: 60, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:27:57,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:58,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:58,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:27:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:27:58,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 08:27:58,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:28:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:02,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:04,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 08:28:05,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:28:08,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:28:09,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:28:10,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 08:28:11,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:28:11,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:28:14,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:28:14,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 08:28:15,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:28:18,352 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.843e+02 2.007e+02 2.253e+02 3.128e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 08:28:18,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1202400.0, ans=0.125 2023-10-03 08:28:22,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 08:28:23,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:28:24,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:28:24,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 08:28:26,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:28:28,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:28,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:28:29,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:28:29,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. 
Duration: 0.94 2023-10-03 08:28:29,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 08:28:30,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:33,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:28:36,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:36,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 08:28:38,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:28:41,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 08:28:42,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:28:42,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:28:42,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:28:43,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:28:45,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 08:28:46,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:28:46,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:48,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:28:48,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:28:48,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 08:28:49,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:28:52,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:28:57,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:28:57,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 08:28:57,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:28:58,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:29:00,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 08:29:00,289 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 08:29:00,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1202600.0, ans=0.125 2023-10-03 08:29:01,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:29:01,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 08:29:01,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:04,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:29:06,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:06,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 08:29:06,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1202600.0, ans=0.2 2023-10-03 08:29:07,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 08:29:09,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:09,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:29:09,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:29:10,816 INFO [train.py:1046] (1/4) Epoch 34, batch 5100, loss[loss=0.224, simple_loss=0.2892, pruned_loss=0.07937, over 19300.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2392, pruned_loss=0.04098, over 4710701.63 frames. ], batch size: 388, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:29:12,303 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 08:29:15,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:29:15,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1202666.6666666667, ans=0.125 2023-10-03 08:29:17,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.79 vs. limit=15.0 2023-10-03 08:29:17,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 08:29:17,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 08:29:19,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:20,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:29:22,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:29:23,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 08:29:24,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 08:29:28,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:29:28,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:29:29,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.60 vs. limit=6.0 2023-10-03 08:29:33,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:36,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 08:29:38,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:39,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:39,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:29:45,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:45,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 08:29:45,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 08:29:48,075 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 08:29:48,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:48,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 08:29:49,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 08:29:52,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:59,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=15.0 2023-10-03 08:30:00,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:04,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 08:30:04,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 08:30:04,836 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 08:30:06,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. 
Duration: 0.81 2023-10-03 08:30:06,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:30:08,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 08:30:13,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 08:30:14,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1202933.3333333333, ans=0.0 2023-10-03 08:30:15,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:30:15,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:30:16,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1202933.3333333333, ans=0.125 2023-10-03 08:30:19,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 08:30:20,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:30:20,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 08:30:25,339 INFO [train.py:1046] (1/4) Epoch 34, batch 5150, loss[loss=0.1567, simple_loss=0.2301, pruned_loss=0.04161, over 23581.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2403, pruned_loss=0.04137, over 4712044.77 frames. 
], batch size: 120, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:30:26,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:30:26,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:30:26,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:30:26,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:30:28,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:30:28,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:30:29,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 08:30:29,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 08:30:29,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 08:30:31,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:30:31,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 08:30:33,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:30:34,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 08:30:36,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:30:37,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:30:42,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:30:42,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 08:30:43,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:43,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:30:44,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:30:44,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:30:44,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:30:46,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:30:46,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 08:30:48,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.915e+02 2.112e+02 2.409e+02 3.229e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 08:30:48,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 08:30:48,406 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:30:49,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:30:49,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:30:52,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:30:53,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 08:30:53,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:31:01,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:31:03,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 08:31:06,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:12,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 08:31:12,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:31:14,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.03 vs. limit=10.0 2023-10-03 08:31:16,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:31:18,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:31:19,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 08:31:19,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1203200.0, ans=0.0 2023-10-03 08:31:22,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:31:23,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:31:25,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:31:28,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:31:30,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:31:31,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-10-03 08:31:36,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:31:36,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1203266.6666666667, ans=0.2 2023-10-03 08:31:38,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:31:40,218 INFO [train.py:1046] (1/4) Epoch 34, batch 5200, loss[loss=0.226, simple_loss=0.2919, pruned_loss=0.08009, over 19462.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2408, pruned_loss=0.04178, over 4706146.53 frames. ], batch size: 388, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:31:40,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:31:40,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:31:42,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:31:42,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:31:42,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:31:42,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:31:46,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 08:31:47,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:31:49,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:53,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 08:31:55,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:31:55,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:31:55,865 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.26 vs. limit=22.5 2023-10-03 08:31:57,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:58,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:31:58,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:31:58,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1203400.0, ans=0.2 2023-10-03 08:32:01,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 08:32:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 08:32:04,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:05,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 08:32:06,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:32:08,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:32:09,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 08:32:09,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 08:32:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 08:32:14,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 08:32:14,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:32:15,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 08:32:16,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 08:32:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:32:20,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:32:21,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 08:32:21,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 08:32:21,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 08:32:22,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1203466.6666666667, ans=0.0 2023-10-03 08:32:28,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 08:32:28,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:32:28,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.79 vs. limit=15.0 2023-10-03 08:32:32,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:32:32,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 08:32:35,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 08:32:36,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:32:36,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 08:32:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:36,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:32:39,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:32:41,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:32:44,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:32:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:50,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:32:51,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-10-03 08:32:52,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:32:53,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:32:54,308 INFO [train.py:1046] (1/4) Epoch 34, batch 5250, loss[loss=0.1655, simple_loss=0.2542, pruned_loss=0.03839, over 24426.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2396, pruned_loss=0.04154, over 4701156.71 frames. ], batch size: 69, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:32:54,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:55,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:32:55,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:32:59,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:33:01,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:33:01,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:33:03,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:33:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 08:33:08,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1203733.3333333333, ans=0.0 2023-10-03 08:33:10,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:33:12,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:33:13,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:33:15,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 08:33:15,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:33:16,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:33:18,465 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.928e+02 2.131e+02 2.518e+02 4.702e+02, threshold=4.262e+02, percent-clipped=2.0 2023-10-03 08:33:26,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1203800.0, ans=0.125 2023-10-03 08:33:38,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1203866.6666666667, ans=0.125 2023-10-03 08:33:52,855 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:34:03,910 INFO [train.py:1046] (1/4) Epoch 34, batch 5300, loss[loss=0.1583, simple_loss=0.2541, pruned_loss=0.03128, over 24458.00 frames. 
], tot_loss[loss=0.1595, simple_loss=0.2375, pruned_loss=0.04075, over 4696387.62 frames. ], batch size: 69, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:34:19,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:34:19,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 08:34:19,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 08:34:19,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:19,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:19,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:19,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:19,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:19,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:19,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:19,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 08:34:20,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:34:20,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 08:34:20,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 08:34:20,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 08:34:20,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:34:20,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 08:34:20,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 08:34:20,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:21,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:21,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:34:21,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:34:21,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:34:21,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:34:21,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:21,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:21,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:34:21,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:21,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:34:21,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:21,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:34:22,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 08:34:22,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:34:23,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:23,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. 
Duration: 0.86 2023-10-03 08:34:23,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 08:34:23,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:34:23,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:34:23,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 08:34:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 08:34:23,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:34:23,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:34:24,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:34:24,222 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 08:34:24,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 08:34:24,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:34:24,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 08:34:24,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 08:34:24,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 08:34:24,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 08:34:24,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:34:31,026 INFO [train.py:1046] (1/4) Epoch 35, batch 0, loss[loss=0.1588, simple_loss=0.2395, pruned_loss=0.03906, over 23368.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2395, pruned_loss=0.03906, over 23368.00 frames. ], batch size: 119, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:34:31,026 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 08:34:43,450 INFO [train.py:1078] (1/4) Epoch 35, validation: loss=0.3289, simple_loss=0.2753, pruned_loss=0.1913, over 1125622.00 frames. 2023-10-03 08:34:43,450 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 08:34:44,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 08:34:44,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:34:46,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:34:51,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:34:51,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:34:51,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:53,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 08:34:54,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 08:34:57,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:57,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:35:01,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:35:01,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:03,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:35:03,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:35:04,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 08:35:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 08:35:15,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:35:15,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:17,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 08:35:18,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1204220.0, ans=0.0 2023-10-03 08:35:21,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:35:21,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:35:23,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:35:26,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:35:26,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1204286.6666666667, ans=0.125 2023-10-03 08:35:31,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:35:32,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1204286.6666666667, ans=0.1 2023-10-03 08:35:37,796 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.99 vs. limit=15.0 2023-10-03 08:35:38,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 08:35:42,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 08:35:43,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:35:43,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:44,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:35:44,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:47,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 08:35:49,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:49,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:35:54,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:35:56,829 INFO [train.py:1046] (1/4) Epoch 35, batch 50, loss[loss=0.2, simple_loss=0.2678, pruned_loss=0.06616, over 19910.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2399, pruned_loss=0.03975, over 1062807.54 frames. ], batch size: 388, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:35:58,325 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 08:35:59,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:36:01,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:36:03,820 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.925e+02 2.366e+02 2.722e+02 6.685e+02, threshold=4.732e+02, percent-clipped=5.0 2023-10-03 08:36:03,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:36:03,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 08:36:04,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:36:04,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:36:07,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 08:36:08,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:11,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:36:14,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 08:36:14,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:21,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:36:23,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 08:36:25,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 08:36:25,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1204553.3333333333, ans=0.0 2023-10-03 08:36:27,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:36:27,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:36:29,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:29,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:36:29,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:36:30,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:36:30,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:37,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:36:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:36:40,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:36:40,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 08:36:42,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:36:43,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:36:43,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 08:36:43,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:36:45,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. 
Duration: 0.97 2023-10-03 08:36:45,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1204620.0, ans=0.125 2023-10-03 08:36:52,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:36:53,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:36:55,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:57,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:36:57,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:36:58,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 08:36:58,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 08:37:00,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:37:01,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:37:01,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:37:02,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:37:02,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 08:37:03,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 08:37:04,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 08:37:05,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:07,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:37:08,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 08:37:08,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 08:37:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:09,819 INFO [train.py:1046] (1/4) Epoch 35, batch 100, loss[loss=0.1679, simple_loss=0.2447, pruned_loss=0.04558, over 24469.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2382, pruned_loss=0.03942, over 1871943.76 frames. ], batch size: 63, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:37:09,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:37:11,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 08:37:11,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:37:15,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:37:18,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:37:18,882 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.63 vs. limit=12.0 2023-10-03 08:37:21,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:37:22,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 08:37:22,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:37:26,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:37:26,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:37:26,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:37:26,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:37:27,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 08:37:28,239 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:37:29,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 08:37:31,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1204820.0, ans=0.125 2023-10-03 08:37:32,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:37:32,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:32,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:37:33,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:37:36,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=15.0 2023-10-03 08:37:36,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 08:37:38,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:38,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:37:39,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 08:37:42,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:37:46,629 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 08:37:46,649 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 08:37:49,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:37:49,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:37:51,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:37:53,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:56,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:37:56,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1204953.3333333333, ans=0.125 2023-10-03 08:38:00,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:01,600 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 08:38:02,397 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.01 vs. 
limit=15.0 2023-10-03 08:38:03,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1204953.3333333333, ans=0.0 2023-10-03 08:38:04,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 08:38:07,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:38:07,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:38:11,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:13,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:14,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:38:17,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:38:17,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:19,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:21,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:21,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:38:21,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:21,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 08:38:22,618 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 08:38:22,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:22,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:38:24,008 INFO [train.py:1046] (1/4) Epoch 35, batch 150, loss[loss=0.1689, simple_loss=0.2609, pruned_loss=0.03839, over 24683.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2395, pruned_loss=0.04004, over 2510584.69 frames. ], batch size: 73, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:38:24,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:24,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:24,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 08:38:24,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:38:25,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 08:38:25,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:26,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:26,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:28,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:38:28,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:38:30,902 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.859e+02 2.007e+02 2.245e+02 3.352e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 08:38:31,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:35,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:38:35,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:38:35,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:37,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:38:37,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:40,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1205153.3333333333, ans=0.125 2023-10-03 08:38:41,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:38:42,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:45,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 08:38:45,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 08:38:45,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 08:38:48,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:38:48,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:38:49,300 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.56 vs. limit=10.0 2023-10-03 08:38:49,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 08:38:50,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:50,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1205153.3333333333, ans=0.125 2023-10-03 08:38:51,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:51,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:52,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1205220.0, ans=0.1 2023-10-03 08:38:53,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:54,628 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 08:38:56,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:00,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:39:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:39:04,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 08:39:09,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 08:39:09,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:39:09,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:39:09,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1205286.6666666667, ans=0.125 2023-10-03 08:39:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:39:11,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:39:13,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:39:14,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:16,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 08:39:20,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:21,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:21,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:39:21,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 08:39:23,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 08:39:27,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:39:29,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:39:30,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:39:33,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:39:33,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 08:39:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:39:35,262 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 08:39:36,566 INFO [train.py:1046] (1/4) Epoch 35, batch 200, loss[loss=0.1567, simple_loss=0.2392, pruned_loss=0.0371, over 24466.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.241, pruned_loss=0.04104, over 3003536.45 frames. ], batch size: 63, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:39:38,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:39:42,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:39:42,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:39:43,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 08:39:45,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:39:45,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:45,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1205420.0, ans=0.125 2023-10-03 08:39:48,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 08:39:49,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:39:51,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:51,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:53,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1205486.6666666667, ans=0.125 2023-10-03 08:39:56,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 08:39:56,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:57,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:05,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1205553.3333333333, ans=0.125 2023-10-03 08:40:09,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1205553.3333333333, ans=0.125 2023-10-03 08:40:13,189 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.59 vs. limit=22.5 2023-10-03 08:40:14,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:40:14,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:40:14,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:40:16,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:40:18,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 08:40:18,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 08:40:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:20,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:40:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:40:22,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:40:22,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 08:40:23,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:40:23,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:27,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:40:34,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:40:40,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:41,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:40:47,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:40:50,621 INFO [train.py:1046] (1/4) Epoch 35, batch 250, loss[loss=0.1752, simple_loss=0.2656, pruned_loss=0.04238, over 24348.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2409, pruned_loss=0.04068, over 3386565.95 frames. ], batch size: 77, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:40:50,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 08:40:50,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:50,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:40:50,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:40:50,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:40:51,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1205753.3333333333, ans=0.0 2023-10-03 08:40:52,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 08:40:52,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:40:53,529 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-03 08:40:53,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1205753.3333333333, ans=0.125 2023-10-03 08:40:54,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:56,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:40:57,921 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.919e+02 2.120e+02 2.596e+02 4.381e+02, threshold=4.240e+02, percent-clipped=2.0 2023-10-03 08:40:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:59,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:41:03,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:41:03,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:41:05,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:41:08,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 08:41:08,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1205820.0, ans=0.1 2023-10-03 08:41:11,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1205820.0, ans=0.125 2023-10-03 08:41:18,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:41:21,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:41:21,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:41:27,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:41:27,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:41:28,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:41:28,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:41:30,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:41:30,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:41:31,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:41:33,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:41:36,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 08:41:36,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:41:39,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:41:39,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:41:39,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:41:39,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:41:40,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:41:40,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:41:40,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1205953.3333333333, ans=0.1 2023-10-03 08:41:43,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:41:43,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:41:44,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:41:46,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1205953.3333333333, ans=0.125 2023-10-03 08:41:47,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:41:52,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:41:53,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:41:57,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:41:58,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1206020.0, ans=0.125 2023-10-03 08:41:59,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:42:02,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 08:42:04,663 INFO [train.py:1046] (1/4) Epoch 35, batch 300, loss[loss=0.1545, simple_loss=0.2273, pruned_loss=0.04082, over 23839.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2383, pruned_loss=0.04057, over 3670117.80 frames. 
], batch size: 164, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:42:04,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:42:04,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:42:07,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 08:42:07,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:42:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:42:07,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 08:42:11,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:42:13,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:42:17,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:42:17,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 08:42:19,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:42:20,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 08:42:20,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 08:42:20,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:42:23,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:42:24,200 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.53 vs. limit=15.0 2023-10-03 08:42:28,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:42:28,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 08:42:32,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 08:42:32,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:35,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:42:36,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:36,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 08:42:36,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 08:42:39,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:42:42,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:42:42,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:42:45,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 08:42:45,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 08:42:47,101 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.73 vs. limit=5.0 2023-10-03 08:42:47,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:42:50,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:51,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 08:42:53,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:42:56,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:42:58,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:42:58,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 08:43:03,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:03,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:43:06,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:07,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:43:07,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 08:43:07,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:43:09,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:09,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 08:43:10,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:12,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:12,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 08:43:12,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:12,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1206353.3333333333, ans=0.2 2023-10-03 08:43:13,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:13,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1206353.3333333333, ans=0.1 2023-10-03 08:43:18,386 INFO [train.py:1046] (1/4) Epoch 35, batch 350, loss[loss=0.1486, simple_loss=0.2299, pruned_loss=0.03361, over 24421.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2372, pruned_loss=0.04032, over 3902874.66 frames. ], batch size: 58, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:43:18,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:43:18,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 08:43:21,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:25,038 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.902e+02 2.096e+02 2.398e+02 4.416e+02, threshold=4.192e+02, percent-clipped=1.0 2023-10-03 08:43:27,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:43:29,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:43:29,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1206420.0, ans=0.0 2023-10-03 08:43:31,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:34,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 08:43:36,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:43:36,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 08:43:37,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:38,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 08:43:38,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 08:43:44,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:43:45,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:45,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 08:43:47,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:43:47,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:43:48,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:43:48,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:48,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:43:51,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:43:51,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:58,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:43:58,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:44:00,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:44:00,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:05,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. 
Duration: 0.91 2023-10-03 08:44:05,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:44:09,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:09,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:09,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:44:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 08:44:14,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:14,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1206620.0, ans=0.1 2023-10-03 08:44:15,529 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 08:44:15,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1206686.6666666667, ans=0.0 2023-10-03 08:44:16,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 08:44:16,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:19,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 08:44:19,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 08:44:21,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:23,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:44:24,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:25,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:25,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:27,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:30,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:44:31,943 INFO [train.py:1046] (1/4) Epoch 35, batch 400, loss[loss=0.1653, simple_loss=0.2399, pruned_loss=0.04539, over 23641.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2372, pruned_loss=0.03993, over 4088230.42 frames. ], batch size: 232, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:44:33,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:44:33,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. 
Duration: 0.64 2023-10-03 08:44:33,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:35,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:44:36,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:44:36,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:39,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:41,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:41,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 08:44:43,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 08:44:43,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:44:45,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 08:44:45,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:44:46,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1206820.0, ans=0.125 2023-10-03 08:44:50,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:44:50,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:50,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 08:44:50,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:44:52,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:52,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:52,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:55,546 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 08:44:56,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 08:44:59,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1206820.0, ans=0.0 2023-10-03 08:45:00,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 08:45:03,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:45:04,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 08:45:05,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 08:45:09,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:45:09,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1206886.6666666667, ans=0.0 2023-10-03 08:45:10,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:45:19,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 08:45:21,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:45:22,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 08:45:22,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.37 vs. limit=12.0 2023-10-03 08:45:24,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:45:25,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 08:45:25,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 08:45:29,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:45:31,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:45:34,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:45:37,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:45:37,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 08:45:39,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:45:41,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 08:45:42,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:45:43,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:45:45,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 08:45:46,707 INFO [train.py:1046] (1/4) Epoch 35, batch 450, loss[loss=0.1652, simple_loss=0.2453, pruned_loss=0.04252, over 23169.00 frames. 
], tot_loss[loss=0.159, simple_loss=0.2378, pruned_loss=0.04013, over 4238922.04 frames. ], batch size: 105, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:45:48,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:45:48,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:45:48,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:45:49,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 08:45:49,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:45:50,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:45:52,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:45:52,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 08:45:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:45:53,602 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.846e+02 1.963e+02 2.221e+02 3.123e+02, threshold=3.927e+02, percent-clipped=0.0 2023-10-03 08:45:53,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 08:45:57,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:46:05,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:06,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:09,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 08:46:10,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 08:46:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:46:15,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:16,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:46:19,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:46:20,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:46:24,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 08:46:24,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. 
Duration: 0.97 2023-10-03 08:46:26,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 08:46:27,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-10-03 08:46:28,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:46:29,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:46:29,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:46:31,054 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 08:46:31,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 08:46:32,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:33,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:46:35,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:46:35,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1207286.6666666667, ans=10.0 2023-10-03 08:46:37,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 08:46:38,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:46:38,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 08:46:40,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 08:46:42,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:46,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:46:47,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:46:47,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 08:46:49,409 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.44 vs. limit=15.0 2023-10-03 08:46:50,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:46:50,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 08:46:50,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 08:46:51,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 08:46:58,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:47:00,732 INFO [train.py:1046] (1/4) Epoch 35, batch 500, loss[loss=0.157, simple_loss=0.2464, pruned_loss=0.03376, over 24633.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2389, pruned_loss=0.04054, over 4348994.01 frames. ], batch size: 73, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:47:00,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:47:00,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:47:02,220 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 08:47:05,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:47:05,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:47:05,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1207420.0, ans=0.0 2023-10-03 08:47:06,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 08:47:07,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-10-03 08:47:07,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:11,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:47:14,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:47:15,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:47:17,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:47:18,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:47:18,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:27,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.74 vs. limit=22.5 2023-10-03 08:47:28,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:28,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 08:47:29,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1207553.3333333333, ans=0.0 2023-10-03 08:47:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:47:30,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:30,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 08:47:31,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:47:34,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:47:35,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:47:35,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:47:35,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:37,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 08:47:41,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 08:47:43,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:47:45,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:45,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:47,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:47,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:47:48,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 08:47:50,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:47:52,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:47:56,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:58,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:58,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1207686.6666666667, ans=0.1 2023-10-03 08:48:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:48:07,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-10-03 08:48:07,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:08,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:48:12,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 08:48:13,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:48:15,006 INFO [train.py:1046] (1/4) Epoch 35, batch 550, loss[loss=0.1734, simple_loss=0.2466, pruned_loss=0.05005, over 23323.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2402, pruned_loss=0.04093, over 4446067.25 frames. ], batch size: 285, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:48:16,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:19,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 08:48:21,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 08:48:21,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:48:21,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-10-03 08:48:22,463 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.820e+02 2.073e+02 2.406e+02 3.793e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 08:48:22,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:48:22,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:48:23,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:23,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:25,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:48:26,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:48:29,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:29,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 08:48:29,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 08:48:29,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1207820.0, ans=0.04949747468305833 2023-10-03 08:48:32,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1207820.0, ans=0.0 2023-10-03 08:48:33,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:48:34,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:36,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:48:38,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 08:48:44,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 08:48:45,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:48:48,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1207886.6666666667, ans=0.125 2023-10-03 08:48:50,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:48:50,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 08:48:51,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:48:54,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:48:54,605 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 08:48:56,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:57,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 08:49:00,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:49:00,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:49:00,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:49:01,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:02,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 08:49:03,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-10-03 08:49:03,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1207953.3333333333, ans=0.0 2023-10-03 08:49:04,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:49:04,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:49:04,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:49:07,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:49:09,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:49:12,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:49:12,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:12,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:49:15,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:49:15,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:49:16,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:49:18,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:19,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:49:21,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 08:49:21,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1208020.0, ans=0.2 2023-10-03 08:49:21,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1208020.0, ans=0.0 2023-10-03 08:49:26,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 08:49:28,085 INFO [train.py:1046] (1/4) Epoch 35, batch 600, loss[loss=0.1654, simple_loss=0.245, pruned_loss=0.04289, over 23649.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2405, pruned_loss=0.04103, over 4508673.26 frames. ], batch size: 85, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:49:29,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 08:49:30,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:49:32,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 08:49:32,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:37,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:49:39,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:49:40,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 08:49:43,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:49:44,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:49:46,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:47,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 08:49:47,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:49:55,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 08:49:59,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:49:59,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 08:49:59,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:50:02,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.92 vs. limit=22.5 2023-10-03 08:50:03,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:50:03,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:50:05,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:50:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:16,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:50:16,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:50:23,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 08:50:26,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 08:50:27,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:50:31,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 08:50:33,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:50:35,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 08:50:35,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1208353.3333333333, ans=0.2 2023-10-03 08:50:36,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:50:36,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:50:41,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 08:50:43,029 INFO [train.py:1046] (1/4) Epoch 35, batch 650, loss[loss=0.1741, simple_loss=0.2403, pruned_loss=0.05398, over 23716.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2398, pruned_loss=0.04089, over 4548254.60 frames. ], batch size: 179, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:50:43,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:50:44,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 08:50:46,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:50:48,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:50:49,940 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.925e+02 2.110e+02 2.430e+02 3.265e+02, threshold=4.220e+02, percent-clipped=0.0 2023-10-03 08:50:51,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 08:50:53,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:54,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1208420.0, ans=0.125 2023-10-03 08:50:56,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1208486.6666666667, ans=0.125 2023-10-03 08:50:57,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:50:57,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:00,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:01,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1208486.6666666667, ans=0.2 2023-10-03 08:51:04,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-10-03 08:51:04,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:51:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:09,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:51:09,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 08:51:13,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:14,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:15,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:51:15,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:17,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:51:19,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:51:19,296 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 08:51:19,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:51:19,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:51:22,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1208553.3333333333, ans=0.125 2023-10-03 08:51:23,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:23,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1208553.3333333333, ans=0.05 2023-10-03 08:51:24,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:51:24,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:26,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:51:26,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 08:51:27,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:51:27,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:51:29,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:51:29,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:51:30,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:51:31,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 08:51:31,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 08:51:32,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:32,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:51:32,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:51:33,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:51:34,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:34,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1208620.0, ans=0.125 2023-10-03 08:51:41,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:41,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:51:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:51:45,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:45,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 08:51:46,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:50,886 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.91 vs. limit=15.0 2023-10-03 08:51:51,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1208686.6666666667, ans=0.125 2023-10-03 08:51:52,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:51:52,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:51:52,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:51:54,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:51:56,798 INFO [train.py:1046] (1/4) Epoch 35, batch 700, loss[loss=0.1621, simple_loss=0.2422, pruned_loss=0.04098, over 23575.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04085, over 4591231.46 frames. ], batch size: 134, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:51:57,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. 
Duration: 0.93 2023-10-03 08:51:58,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 08:52:01,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 08:52:02,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:02,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1208753.3333333333, ans=0.125 2023-10-03 08:52:03,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:52:05,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 08:52:06,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1208753.3333333333, ans=0.125 2023-10-03 08:52:10,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1208820.0, ans=0.0 2023-10-03 08:52:12,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:52:13,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:52:16,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:17,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 08:52:17,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:52:21,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:23,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 08:52:23,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:52:25,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 08:52:27,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 08:52:28,342 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.59 vs. limit=15.0 2023-10-03 08:52:33,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:52:33,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:52:33,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:52:37,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:52:39,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-10-03 08:52:39,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1208953.3333333333, ans=0.0 2023-10-03 08:52:43,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-03 08:52:44,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:52:44,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:52:46,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 08:52:48,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:52:50,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:52:53,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:52:57,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:52:57,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. 
Duration: 0.55 2023-10-03 08:52:59,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1209020.0, ans=0.0 2023-10-03 08:53:00,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1209020.0, ans=0.0 2023-10-03 08:53:02,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 08:53:02,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 08:53:03,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1209020.0, ans=0.125 2023-10-03 08:53:04,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:06,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:07,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:53:10,790 INFO [train.py:1046] (1/4) Epoch 35, batch 750, loss[loss=0.1528, simple_loss=0.2342, pruned_loss=0.03572, over 24685.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2381, pruned_loss=0.04027, over 4632378.81 frames. ], batch size: 65, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:53:10,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:10,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. 
Duration: 0.95 2023-10-03 08:53:14,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1209086.6666666667, ans=0.0 2023-10-03 08:53:16,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 08:53:17,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 08:53:17,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 08:53:18,616 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 2.000e+02 2.273e+02 2.606e+02 4.191e+02, threshold=4.547e+02, percent-clipped=0.0 2023-10-03 08:53:18,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 08:53:18,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 08:53:18,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1209086.6666666667, ans=0.125 2023-10-03 08:53:20,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:53:21,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 08:53:21,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:22,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 08:53:24,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:26,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:53:27,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:53:27,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:29,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:53:30,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:53:31,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:53:33,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:33,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:53:34,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 08:53:36,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:53:36,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:53:37,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:53:37,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1209153.3333333333, ans=0.0 2023-10-03 08:53:38,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:53:40,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 08:53:40,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:53:43,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-10-03 08:53:43,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 08:53:43,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 08:53:43,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 08:53:43,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:53:43,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:53:45,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 08:53:49,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1209220.0, ans=0.0 2023-10-03 08:53:51,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:53:52,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:53:52,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:53:55,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:57,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:57,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 08:53:58,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:53:59,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 08:53:59,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:54:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:54:02,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. 
Duration: 0.92 2023-10-03 08:54:02,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:03,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1209286.6666666667, ans=0.0 2023-10-03 08:54:08,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:10,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:54:10,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:11,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1209353.3333333333, ans=0.125 2023-10-03 08:54:12,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:54:17,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 08:54:17,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:54:17,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:20,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 08:54:20,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:23,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:23,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:54:25,500 INFO [train.py:1046] (1/4) Epoch 35, batch 800, loss[loss=0.1579, simple_loss=0.2306, pruned_loss=0.0426, over 23567.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2387, pruned_loss=0.04074, over 4647992.97 frames. ], batch size: 256, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:54:31,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:31,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:34,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:54:34,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:34,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:34,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:37,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:54:40,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:41,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:54:43,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 08:54:45,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:47,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:47,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:54:47,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:54:47,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1209486.6666666667, ans=0.125 2023-10-03 08:54:48,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 08:54:48,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:48,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 08:54:52,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 08:54:54,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:56,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:56,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:54:59,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1209553.3333333333, ans=0.125 2023-10-03 08:55:00,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:00,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:04,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:55:05,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:55:05,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 08:55:07,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 08:55:07,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 08:55:08,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 08:55:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:55:10,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:55:15,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 08:55:16,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 08:55:19,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:55:21,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:55:25,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:55:29,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:29,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 08:55:30,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:55:32,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-10-03 08:55:38,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:55:39,647 INFO [train.py:1046] (1/4) Epoch 35, batch 850, loss[loss=0.15, simple_loss=0.2328, pruned_loss=0.03362, over 24507.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2399, pruned_loss=0.04086, over 4673260.28 frames. ], batch size: 63, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:55:39,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:55:41,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 08:55:41,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:55:42,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:43,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 08:55:43,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:55:46,872 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.835e+02 2.028e+02 2.413e+02 3.992e+02, threshold=4.056e+02, percent-clipped=0.0 2023-10-03 08:55:46,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:55:47,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:55:47,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1209753.3333333333, ans=0.125 2023-10-03 08:55:49,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:55:50,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:55:52,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 08:55:52,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 08:55:52,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 08:55:52,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:55:52,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:55:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:56,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:56,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:56:00,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 08:56:01,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:02,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 08:56:06,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 08:56:09,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:56:10,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 08:56:14,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 08:56:16,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 08:56:17,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 08:56:17,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:56:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:56:17,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 08:56:21,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 08:56:22,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 08:56:26,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:56:26,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:27,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:56:27,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:56:30,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:56:30,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:56:31,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 08:56:34,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:56:34,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:56:35,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 08:56:35,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:56:37,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:40,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:56:43,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:56:43,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:56:44,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:56:48,179 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=12.0 2023-10-03 08:56:54,424 INFO [train.py:1046] (1/4) Epoch 35, batch 900, loss[loss=0.1603, simple_loss=0.2375, pruned_loss=0.04157, over 23250.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2408, pruned_loss=0.0412, over 4687797.02 frames. ], batch size: 93, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:56:54,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:56:54,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 08:56:54,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 08:56:55,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:56:55,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:56:57,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 08:57:04,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:57:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:57:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 08:57:10,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:57:10,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 08:57:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:57:13,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:57:13,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 08:57:14,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:57:14,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:57:14,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1210153.3333333333, ans=0.05 2023-10-03 08:57:22,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:57:22,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:57:23,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:57:27,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:30,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 08:57:31,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1210220.0, ans=0.1 2023-10-03 08:57:32,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:57:37,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 08:57:37,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=1210286.6666666667, ans=0.1 2023-10-03 08:57:38,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:57:38,364 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 08:57:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 08:57:43,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1210286.6666666667, ans=0.125 2023-10-03 08:57:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:57:45,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:57:47,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:57:53,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:57:53,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:57:55,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 08:57:55,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 08:57:56,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1210353.3333333333, ans=0.125 2023-10-03 08:57:58,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 08:58:01,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:58:01,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:02,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:58:02,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:06,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 08:58:06,819 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 08:58:08,144 INFO [train.py:1046] (1/4) Epoch 35, batch 950, loss[loss=0.1835, simple_loss=0.2638, pruned_loss=0.05154, over 24455.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2406, pruned_loss=0.0413, over 4692244.27 frames. ], batch size: 69, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:58:08,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 08:58:09,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. 
Duration: 0.64 2023-10-03 08:58:11,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:11,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1210420.0, ans=0.1 2023-10-03 08:58:14,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 08:58:17,059 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.074e+02 2.262e+02 2.631e+02 4.033e+02, threshold=4.525e+02, percent-clipped=0.0 2023-10-03 08:58:17,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:20,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:20,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:21,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1210420.0, ans=22.5 2023-10-03 08:58:22,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:58:23,898 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 08:58:26,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:28,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 08:58:28,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:30,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:58:30,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 08:58:31,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 08:58:33,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:34,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 08:58:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:58:38,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:38,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:58:38,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:40,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 08:58:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 08:58:44,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:58:44,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:58:46,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1210553.3333333333, ans=0.125 2023-10-03 08:58:50,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:58:50,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:54,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 08:58:56,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 08:58:56,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:58:58,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:58:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:59,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:59:04,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-10-03 08:59:04,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:59:06,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1210620.0, ans=0.0 2023-10-03 08:59:07,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:59:08,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:59:09,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 08:59:09,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:59:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:59:09,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 08:59:12,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:59:14,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:59:15,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1210686.6666666667, ans=15.0 2023-10-03 08:59:20,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:59:21,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 08:59:21,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 08:59:23,545 INFO [train.py:1046] (1/4) Epoch 35, batch 1000, loss[loss=0.1648, simple_loss=0.2555, pruned_loss=0.03704, over 24571.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2396, pruned_loss=0.04075, over 4702036.93 frames. ], batch size: 71, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:59:25,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:59:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 08:59:31,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:59:34,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:59:35,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 08:59:35,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 08:59:40,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:59:40,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 08:59:42,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:59:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 08:59:47,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 08:59:49,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 08:59:49,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:59:50,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 08:59:51,883 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.83 vs. limit=15.0 2023-10-03 08:59:52,742 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:59:53,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 08:59:53,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 08:59:55,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:59:55,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:00:04,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:00:05,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:00:05,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:05,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:00:05,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 09:00:05,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:00:07,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:00:07,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:00:08,446 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 09:00:12,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 09:00:13,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 09:00:13,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. 
Duration: 0.82 2023-10-03 09:00:17,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:00:23,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:23,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:00:25,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:25,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:00:27,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 09:00:29,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:00:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 09:00:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 09:00:32,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:00:32,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:00:34,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 09:00:36,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:00:38,081 INFO [train.py:1046] (1/4) Epoch 35, batch 1050, loss[loss=0.1445, simple_loss=0.2302, pruned_loss=0.02934, over 24446.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2379, pruned_loss=0.04029, over 4701520.48 frames. ], batch size: 58, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:00:38,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:00:41,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:00:43,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:00:44,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1211086.6666666667, ans=0.0 2023-10-03 09:00:45,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:00:45,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:46,415 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.826e+02 1.998e+02 2.224e+02 3.015e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-03 09:00:47,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:00:49,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 09:00:51,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:00:51,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:00:53,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:00:53,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:00:54,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:00:56,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 09:00:56,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:00:57,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 09:00:58,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:00:58,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 09:00:58,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:01:04,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.36 vs. 
limit=22.5 2023-10-03 09:01:05,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:01:06,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.96 vs. limit=15.0 2023-10-03 09:01:06,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:01:06,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:01:08,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 09:01:09,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 09:01:09,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:01:12,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 09:01:15,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 09:01:16,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:20,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 09:01:22,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 09:01:22,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:01:22,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:01:27,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:01:31,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 09:01:32,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 09:01:33,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 09:01:34,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:01:34,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:01:35,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-10-03 09:01:35,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 09:01:37,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1211353.3333333333, ans=0.125 2023-10-03 09:01:39,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 09:01:42,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:01:42,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:01:43,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:01:43,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:43,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1211353.3333333333, ans=0.125 2023-10-03 09:01:46,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:46,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 09:01:47,166 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.14 vs. limit=10.0 2023-10-03 09:01:48,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:01:48,620 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.51 vs. limit=10.0 2023-10-03 09:01:49,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. 
Duration: 0.92 2023-10-03 09:01:49,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 09:01:49,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:01:50,794 INFO [train.py:1046] (1/4) Epoch 35, batch 1100, loss[loss=0.1492, simple_loss=0.2232, pruned_loss=0.0376, over 22888.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2373, pruned_loss=0.03987, over 4700552.51 frames. ], batch size: 322, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:01:52,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:01:59,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:02:03,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:02:03,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:02:04,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:04,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 09:02:06,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:02:07,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 09:02:10,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:02:13,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:02:13,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 09:02:14,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:02:15,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:17,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:02:18,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:02:21,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:02:26,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:02:29,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 09:02:31,440 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 09:02:31,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:02:33,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.12 vs. limit=15.0 2023-10-03 09:02:34,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:34,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:02:34,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:02:34,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1211620.0, ans=0.2 2023-10-03 09:02:35,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 09:02:37,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:02:37,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:02:37,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:02:37,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:37,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. 
Duration: 0.77 2023-10-03 09:02:40,392 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.42 vs. limit=15.0 2023-10-03 09:02:41,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1211620.0, ans=0.125 2023-10-03 09:02:41,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1211620.0, ans=0.125 2023-10-03 09:02:43,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:02:43,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 09:02:45,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:02:48,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:02:51,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 09:02:51,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:02:54,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:56,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:02:56,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:02:58,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 09:02:58,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:02:59,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:03:00,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 09:03:01,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:03:01,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 09:03:02,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:02,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:03:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:03:05,196 INFO [train.py:1046] (1/4) Epoch 35, batch 1150, loss[loss=0.1571, simple_loss=0.2449, pruned_loss=0.03463, over 24022.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2387, pruned_loss=0.04026, over 4704340.82 frames. 
], batch size: 80, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:03:06,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:09,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:03:09,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1211753.3333333333, ans=0.1 2023-10-03 09:03:12,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:03:12,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:03:12,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 09:03:12,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:03:13,385 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.034e+02 2.362e+02 3.611e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 09:03:15,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 09:03:16,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:16,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 09:03:22,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 09:03:23,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:03:28,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:29,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:29,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 09:03:29,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:03:29,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:03:33,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 09:03:34,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:03:36,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:03:44,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:49,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:03:49,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 09:03:49,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:50,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1211953.3333333333, ans=0.125 2023-10-03 09:03:51,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:56,249 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 09:03:57,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:04:02,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1212020.0, ans=0.125 2023-10-03 09:04:05,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1212020.0, ans=0.125 2023-10-03 09:04:06,292 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 09:04:09,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1212020.0, ans=0.04949747468305833 2023-10-03 09:04:10,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:10,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 09:04:11,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:04:11,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:04:14,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:04:17,440 INFO [train.py:1046] (1/4) Epoch 35, batch 1200, loss[loss=0.156, simple_loss=0.2507, pruned_loss=0.0307, over 24314.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2398, pruned_loss=0.04064, over 4709619.91 frames. ], batch size: 74, lr: 2.92e-03, grad_scale: 32.0 2023-10-03 09:04:20,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:04:20,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:04:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:04:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:23,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:04:23,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1212086.6666666667, ans=0.125 2023-10-03 09:04:26,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 09:04:26,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:04:29,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:04:29,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:04:33,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 09:04:35,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 09:04:40,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:04:42,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:04:44,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:04:44,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1212153.3333333333, ans=0.0 2023-10-03 09:04:46,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:04:46,915 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 09:04:48,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:04:55,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:04:55,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:04:55,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 09:04:57,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:05:00,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 09:05:05,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 09:05:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:05:05,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:05:07,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:05:08,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:05:09,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:05:09,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 09:05:09,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:05:11,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 09:05:12,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:05:12,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:05:12,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:05:13,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:05:14,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:05:16,797 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:05:19,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:05:20,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:05:23,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. 
Duration: 0.97 2023-10-03 09:05:23,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1212353.3333333333, ans=0.125 2023-10-03 09:05:27,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1212353.3333333333, ans=0.2 2023-10-03 09:05:29,251 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 09:05:31,165 INFO [train.py:1046] (1/4) Epoch 35, batch 1250, loss[loss=0.1753, simple_loss=0.2487, pruned_loss=0.05094, over 23469.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2404, pruned_loss=0.04077, over 4711259.48 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:05:31,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:05:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:05:33,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.60 vs. limit=10.0 2023-10-03 09:05:34,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:05:35,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:05:37,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 09:05:40,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 09:05:41,845 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.899e+02 2.182e+02 2.478e+02 3.266e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 09:05:42,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:05:43,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 09:05:44,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:05:44,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:05:48,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:05:48,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:05:49,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1212486.6666666667, ans=0.125 2023-10-03 09:05:50,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:05:50,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:05:53,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 09:05:57,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:05:57,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:05:57,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:05:59,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:06:00,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:03,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:04,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:06:10,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 09:06:10,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:06:12,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:06:14,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 09:06:14,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 09:06:15,638 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 09:06:15,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:15,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:15,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1212620.0, ans=0.125 2023-10-03 09:06:18,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:21,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:22,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:06:23,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 09:06:23,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 09:06:23,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 09:06:26,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:06:28,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. 
Duration: 0.88 2023-10-03 09:06:28,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:30,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1212686.6666666667, ans=0.2 2023-10-03 09:06:31,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1212686.6666666667, ans=0.07 2023-10-03 09:06:32,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 09:06:32,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:06:32,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 09:06:32,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:06:32,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:06:34,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:06:34,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:06:37,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 09:06:40,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 09:06:41,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:06:43,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:06:44,593 INFO [train.py:1046] (1/4) Epoch 35, batch 1300, loss[loss=0.1679, simple_loss=0.2452, pruned_loss=0.0453, over 23113.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2407, pruned_loss=0.04124, over 4706395.41 frames. ], batch size: 105, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:06:46,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:06:47,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1212753.3333333333, ans=0.1 2023-10-03 09:06:48,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:06:48,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 09:06:52,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:06:55,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:06:57,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:06:57,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:06:59,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1212820.0, ans=0.125 2023-10-03 09:06:59,552 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-10-03 09:07:00,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:07:01,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1212820.0, ans=0.07 2023-10-03 09:07:02,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 09:07:02,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1212820.0, ans=0.035 2023-10-03 09:07:05,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:07:06,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:07:06,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 09:07:09,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:07:13,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:07:15,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:07:17,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:17,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:07:18,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:07:18,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 09:07:24,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:07:24,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:07:24,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.01 vs. limit=15.0 2023-10-03 09:07:27,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 09:07:27,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:07:30,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:07:33,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:07:33,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 09:07:33,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:07:34,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 09:07:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:07:36,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1212953.3333333333, ans=0.0 2023-10-03 09:07:39,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:07:39,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:07:42,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 09:07:43,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 09:07:44,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 09:07:48,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:07:50,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. 
Duration: 0.72 2023-10-03 09:07:53,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:57,295 INFO [train.py:1046] (1/4) Epoch 35, batch 1350, loss[loss=0.1586, simple_loss=0.2434, pruned_loss=0.03691, over 24683.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.24, pruned_loss=0.04092, over 4702036.24 frames. ], batch size: 65, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:07:59,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 09:08:01,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:03,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:07,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:08:08,375 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.947e+02 2.144e+02 2.393e+02 3.515e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 09:08:08,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:09,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:08:11,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:08:15,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 09:08:17,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 09:08:17,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:08:18,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:08:21,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 09:08:21,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:08:24,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:08:24,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 09:08:24,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1213153.3333333333, ans=0.125 2023-10-03 09:08:25,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 09:08:26,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 09:08:28,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:28,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-03 09:08:39,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:49,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:49,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:08:49,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 09:08:49,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1213286.6666666667, ans=0.125 2023-10-03 09:08:52,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:08:52,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 09:08:53,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:08:53,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:56,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:08:59,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 09:09:00,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 09:09:04,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 09:09:07,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 09:09:12,359 INFO [train.py:1046] (1/4) Epoch 35, batch 1400, loss[loss=0.1531, simple_loss=0.2283, pruned_loss=0.03895, over 23542.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2394, pruned_loss=0.04086, over 4698113.18 frames. ], batch size: 256, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:09:15,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 09:09:16,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:09:19,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:09:19,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:09:22,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1213420.0, ans=0.125 2023-10-03 09:09:23,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 09:09:26,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. 
Duration: 0.86 2023-10-03 09:09:28,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1213486.6666666667, ans=0.125 2023-10-03 09:09:34,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:09:36,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:09:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:09:39,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:09:42,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:09:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 09:09:49,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1213553.3333333333, ans=0.125 2023-10-03 09:09:52,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:09:53,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:09:56,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 09:09:58,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 09:09:58,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:09:58,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:09:59,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:10:01,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:10:01,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:10:01,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:10:03,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 09:10:03,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:10:05,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:12,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:10:16,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 09:10:18,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-03 09:10:19,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:10:20,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 09:10:22,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:24,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:10:26,292 INFO [train.py:1046] (1/4) Epoch 35, batch 1450, loss[loss=0.1397, simple_loss=0.2236, pruned_loss=0.02788, over 24272.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2386, pruned_loss=0.04035, over 4706329.68 frames. ], batch size: 61, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:10:29,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:10:31,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:10:31,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:31,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 09:10:36,987 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.854e+02 2.034e+02 2.256e+02 3.370e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 09:10:38,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:10:38,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:10:40,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:10:40,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 09:10:41,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:10:43,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 09:10:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:43,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:43,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 09:10:45,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:10:46,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:10:46,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 09:10:46,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 09:10:46,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:10:46,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1213820.0, ans=0.0 2023-10-03 09:10:48,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:50,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:55,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:10:55,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:10:56,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:57,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:59,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:59,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:10:59,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:59,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:11:02,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 09:11:07,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:11:10,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 09:11:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:11:13,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:11:15,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:15,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1213953.3333333333, ans=0.2 2023-10-03 09:11:15,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1213953.3333333333, ans=0.125 2023-10-03 09:11:16,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 09:11:19,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:20,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 09:11:22,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. 
Duration: 0.9 2023-10-03 09:11:23,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:27,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:11:27,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:11:28,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 09:11:32,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 09:11:33,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 09:11:33,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:35,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:11:41,108 INFO [train.py:1046] (1/4) Epoch 35, batch 1500, loss[loss=0.1623, simple_loss=0.2369, pruned_loss=0.04387, over 23829.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2388, pruned_loss=0.04037, over 4713747.23 frames. ], batch size: 195, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:11:41,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1214086.6666666667, ans=0.5 2023-10-03 09:11:42,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. 
Duration: 0.77 2023-10-03 09:11:42,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:11:42,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:11:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:44,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1214086.6666666667, ans=0.0 2023-10-03 09:11:45,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:11:45,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:11:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 09:11:48,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:11:48,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:11:48,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:11:48,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:11:51,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:11:53,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:11:58,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:11:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 09:11:59,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:11:59,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:12:01,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:12:05,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 09:12:06,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.81 vs. limit=10.0 2023-10-03 09:12:09,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 09:12:11,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:12:11,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 09:12:14,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 09:12:17,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:12:17,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:12:17,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:12:18,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 09:12:18,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:12:18,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:12:20,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 09:12:20,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:12:25,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:12:25,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 09:12:28,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1214286.6666666667, ans=0.125 2023-10-03 09:12:31,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 09:12:33,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:12:37,273 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 09:12:37,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:37,332 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 09:12:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:12:41,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:12:41,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1214353.3333333333, ans=0.125 2023-10-03 09:12:42,394 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 09:12:43,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:12:45,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 09:12:46,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:49,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:12:51,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:51,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:12:51,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:51,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:12:54,135 INFO [train.py:1046] (1/4) Epoch 35, batch 1550, loss[loss=0.1531, simple_loss=0.2352, pruned_loss=0.03552, over 23646.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2396, pruned_loss=0.04058, over 4709061.60 frames. ], batch size: 85, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:12:54,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 09:12:54,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 09:12:54,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:12:55,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 09:12:55,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-10-03 09:12:57,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1214420.0, ans=0.125 2023-10-03 09:12:58,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:12:58,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:12:59,005 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.71 vs. limit=12.0 2023-10-03 09:12:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:12:59,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:12:59,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:01,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:04,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.874e+02 2.151e+02 2.475e+02 3.456e+02, threshold=4.303e+02, percent-clipped=0.0 2023-10-03 09:13:04,415 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 09:13:04,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:04,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 09:13:05,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:13:09,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:13:09,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 09:13:11,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:13:12,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 09:13:13,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 09:13:13,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 09:13:13,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:19,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:13:22,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 09:13:22,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-03 09:13:25,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1214553.3333333333, ans=0.0 2023-10-03 09:13:31,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:34,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:13:34,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:13:34,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:13:35,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 09:13:35,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1214553.3333333333, ans=0.125 2023-10-03 09:13:41,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:13:43,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:46,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:13:46,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff3.min_abs, batch_count=1214620.0, ans=0.2 2023-10-03 09:13:47,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:13:49,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:49,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 09:13:49,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:13:52,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:13:52,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:54,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 09:13:54,221 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 09:13:57,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:57,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1214686.6666666667, ans=0.0 2023-10-03 09:13:58,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1214686.6666666667, ans=0.04949747468305833 2023-10-03 09:13:59,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 09:14:05,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:14:05,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:05,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 09:14:05,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1214686.6666666667, ans=0.125 2023-10-03 09:14:08,473 INFO [train.py:1046] (1/4) Epoch 35, batch 1600, loss[loss=0.1603, simple_loss=0.2473, pruned_loss=0.0367, over 24122.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2404, pruned_loss=0.04069, over 4718585.77 frames. ], batch size: 80, lr: 2.92e-03, grad_scale: 32.0 2023-10-03 09:14:08,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:14:09,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:14:09,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:14:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:14:11,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:14:16,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:14:17,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.73 vs. 
limit=10.0 2023-10-03 09:14:17,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 09:14:17,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 09:14:18,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1214753.3333333333, ans=0.125 2023-10-03 09:14:19,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 09:14:20,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:14:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 09:14:23,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:14:25,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:14:29,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:14:32,850 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.75 vs. limit=22.5 2023-10-03 09:14:33,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-10-03 09:14:33,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1214820.0, ans=0.1 2023-10-03 09:14:36,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:14:37,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 09:14:37,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:14:37,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 09:14:42,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 09:14:50,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:51,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 09:14:52,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:52,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:14:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:14:56,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-03 09:15:00,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:15:01,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:15:01,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:03,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:04,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:15:05,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:15:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:15:08,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:15:13,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:14,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:15:16,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 09:15:16,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 09:15:18,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 09:15:22,295 INFO [train.py:1046] (1/4) Epoch 35, batch 1650, loss[loss=0.1715, simple_loss=0.2543, pruned_loss=0.04437, over 23275.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2412, pruned_loss=0.04083, over 4718494.35 frames. ], batch size: 93, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:15:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:15:24,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:15:24,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1215086.6666666667, ans=0.125 2023-10-03 09:15:25,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:15:25,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 09:15:25,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 09:15:25,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 09:15:27,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 09:15:31,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 09:15:31,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:15:32,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:15:32,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:15:34,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.944e+02 2.112e+02 2.392e+02 3.284e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 09:15:35,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:15:36,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 09:15:39,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:15:39,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:15:39,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:15:39,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:15:39,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 09:15:39,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-10-03 09:15:39,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1215153.3333333333, ans=0.125 2023-10-03 09:15:43,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1215153.3333333333, ans=0.2 2023-10-03 09:15:45,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:15:47,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:15:49,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=15.0 2023-10-03 09:15:50,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1215220.0, ans=0.125 2023-10-03 09:15:55,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 09:15:56,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:15:58,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 09:15:59,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:16:02,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1215220.0, ans=0.09899494936611666 2023-10-03 09:16:03,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:16:04,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:16:04,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:04,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:16:05,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:07,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:09,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:09,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:16:11,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 09:16:11,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1215286.6666666667, ans=0.125 2023-10-03 09:16:12,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:16:13,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:16:17,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:16:17,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 09:16:20,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:16:20,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 09:16:21,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 09:16:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 09:16:21,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:16:23,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:16:24,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:16:24,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:24,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 09:16:28,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:29,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:16:29,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:33,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 09:16:36,037 INFO [train.py:1046] (1/4) Epoch 35, batch 1700, loss[loss=0.1749, simple_loss=0.2611, pruned_loss=0.04429, over 24430.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2402, pruned_loss=0.04087, over 4712064.10 frames. ], batch size: 77, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:16:36,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:36,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:16:37,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 09:16:37,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 09:16:38,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:16:38,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:41,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:16:41,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:16:41,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 09:16:43,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:16:48,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1215420.0, ans=0.035 2023-10-03 09:16:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:54,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:16:58,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:16:58,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:16:59,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 09:16:59,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:17:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 09:17:03,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:17:03,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:06,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:17:07,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:17:07,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1215553.3333333333, ans=0.04949747468305833 2023-10-03 09:17:08,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 09:17:10,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 09:17:12,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:13,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 09:17:15,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 09:17:20,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1215620.0, ans=0.125 2023-10-03 09:17:24,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:24,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:24,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:17:24,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.85 vs. limit=15.0 2023-10-03 09:17:25,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:17:25,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 09:17:25,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:17:28,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:28,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 09:17:28,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:17:28,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:17:28,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:28,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:17:31,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:17:31,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:17:32,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:32,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:17:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:34,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1215686.6666666667, ans=0.125 2023-10-03 09:17:34,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1215686.6666666667, ans=0.125 2023-10-03 09:17:36,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:17:38,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 09:17:38,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1215686.6666666667, ans=0.1 2023-10-03 09:17:40,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:42,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:17:45,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 09:17:51,154 INFO [train.py:1046] (1/4) Epoch 35, batch 1750, loss[loss=0.1938, simple_loss=0.2618, pruned_loss=0.06295, over 23767.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2388, pruned_loss=0.0408, over 4697716.21 frames. ], batch size: 179, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:17:51,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:53,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:17:54,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:17:55,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 09:17:55,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:17:59,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:17:59,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:02,912 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.871e+02 1.981e+02 2.197e+02 2.904e+02, threshold=3.962e+02, percent-clipped=0.0 2023-10-03 09:18:03,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 09:18:05,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:05,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1215820.0, ans=0.04949747468305833 2023-10-03 09:18:08,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 09:18:08,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:18:10,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:18:13,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:18:15,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 09:18:15,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:18:16,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 09:18:22,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:18:22,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1215886.6666666667, ans=0.125 2023-10-03 09:18:24,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:18:24,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:18:28,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:28,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:18:31,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:18:33,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:34,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:18:35,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:18:37,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 09:18:38,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:18:40,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 09:18:42,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:18:42,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1215953.3333333333, ans=0.125 2023-10-03 09:18:43,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:45,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:18:50,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:18:50,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 09:18:51,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:52,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 09:18:55,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1216020.0, ans=0.125 2023-10-03 09:18:56,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:59,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:19:01,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:19:02,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 09:19:02,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:19:04,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:19:04,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:05,617 INFO [train.py:1046] (1/4) Epoch 35, batch 1800, loss[loss=0.1695, simple_loss=0.2598, pruned_loss=0.03956, over 24564.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2378, pruned_loss=0.0405, over 4697679.28 frames. ], batch size: 71, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:19:05,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:19:05,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 09:19:05,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:19:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:19:08,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:19:10,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:19:13,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:19:15,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:19:17,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:19:20,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:19:21,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:21,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:23,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:19:26,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 09:19:26,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 09:19:26,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:30,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:34,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 09:19:35,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 09:19:37,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 09:19:37,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:19:37,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:37,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:19:39,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:19:45,348 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 09:19:45,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 09:19:47,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:49,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 09:19:49,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 09:19:51,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:19:52,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:19:53,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:19:54,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1216286.6666666667, ans=0.125 2023-10-03 09:19:55,712 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:19:58,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 09:20:05,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:20:05,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 09:20:05,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 09:20:05,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:20:06,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:20:06,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 09:20:11,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:20:11,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:20:12,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 09:20:12,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:20:15,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:20:16,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:20:16,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:20:16,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:20:18,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 09:20:19,689 INFO [train.py:1046] (1/4) Epoch 35, batch 1850, loss[loss=0.17, simple_loss=0.2418, pruned_loss=0.04906, over 23723.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2388, pruned_loss=0.04056, over 4715386.39 frames. ], batch size: 232, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:20:19,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:20:19,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:20:21,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1216420.0, ans=0.0 2023-10-03 09:20:22,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:20:23,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:20:30,535 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.898e+02 2.066e+02 2.341e+02 4.051e+02, threshold=4.131e+02, percent-clipped=1.0 2023-10-03 09:20:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:20:30,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 09:20:32,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1216486.6666666667, ans=0.05 2023-10-03 09:20:34,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. 
Duration: 0.82 2023-10-03 09:20:37,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 09:20:40,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:20:40,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 09:20:40,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 09:20:47,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1216553.3333333333, ans=0.09899494936611666 2023-10-03 09:20:51,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:20:52,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 09:20:53,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=12.0 2023-10-03 09:20:54,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1216553.3333333333, ans=0.0 2023-10-03 09:20:55,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:20:56,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 09:20:58,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1216553.3333333333, ans=0.125 2023-10-03 09:20:59,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 09:20:59,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:00,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:21:02,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:21:02,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1216620.0, ans=0.1 2023-10-03 09:21:03,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1216620.0, ans=0.125 2023-10-03 09:21:05,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:21:05,824 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.13 vs. limit=15.0 2023-10-03 09:21:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:21:09,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 09:21:09,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:10,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:21:10,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:12,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:21:14,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:21:17,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 09:21:17,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:21:22,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:21:23,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:21:23,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 09:21:23,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 09:21:25,134 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. 
Duration: 0.896 2023-10-03 09:21:26,439 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 09:21:27,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:21:27,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:21:27,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:21:29,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:29,236 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 09:21:30,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:21:30,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:31,901 INFO [train.py:1046] (1/4) Epoch 35, batch 1900, loss[loss=0.1579, simple_loss=0.234, pruned_loss=0.04088, over 24344.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2399, pruned_loss=0.04066, over 4726708.07 frames. ], batch size: 56, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:21:31,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:21:32,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 09:21:32,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1216753.3333333333, ans=0.1 2023-10-03 09:21:33,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:21:33,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 09:21:36,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:36,218 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 09:21:36,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:21:37,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1216753.3333333333, ans=0.125 2023-10-03 09:21:38,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:42,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:44,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:21:45,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1216820.0, ans=0.125 2023-10-03 09:21:46,239 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-10-03 09:21:46,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 09:21:47,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:21:49,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:21:49,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 09:21:49,556 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 09:21:54,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 09:21:55,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:21:59,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 09:22:00,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=15.0 2023-10-03 09:22:01,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 09:22:08,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 09:22:10,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-10-03 09:22:10,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:12,373 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 09:22:12,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 09:22:12,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 09:22:13,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 09:22:13,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:22:15,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1216953.3333333333, ans=0.0 2023-10-03 09:22:18,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 09:22:20,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:22:23,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:22:23,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. 
Duration: 0.92 2023-10-03 09:22:25,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1216953.3333333333, ans=0.0 2023-10-03 09:22:26,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:22:30,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 09:22:31,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:22:37,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:22:37,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:22:37,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:22:37,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:22:39,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:22:40,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:22:40,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:22:43,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:22:43,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:22:44,700 INFO [train.py:1046] (1/4) Epoch 35, batch 1950, loss[loss=0.1507, simple_loss=0.2313, pruned_loss=0.03505, over 24642.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2406, pruned_loss=0.04137, over 4715833.09 frames. ], batch size: 65, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:22:45,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=1217086.6666666667, ans=0.1 2023-10-03 09:22:46,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:22:46,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:22:46,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:22:47,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:22:52,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:22:53,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:22:55,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:55,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 09:22:56,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 09:22:58,297 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.842e+02 2.075e+02 2.339e+02 3.045e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 09:22:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 09:22:58,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:59,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:00,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1217153.3333333333, ans=0.125 2023-10-03 09:23:00,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1217153.3333333333, ans=0.125 2023-10-03 09:23:01,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:23:01,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:01,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:02,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:23:03,076 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:23:05,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:23:06,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:23:06,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:23:06,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:09,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:12,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:23:12,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:12,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:23:12,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 09:23:12,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:23:12,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:23:13,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:17,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1217220.0, ans=0.0 2023-10-03 09:23:18,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:20,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:23:22,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1217220.0, ans=0.125 2023-10-03 09:23:25,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:23:28,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:23:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:23:29,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 09:23:29,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:23:31,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1217286.6666666667, ans=0.125 2023-10-03 09:23:32,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:23:34,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:23:34,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:23:41,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1217286.6666666667, ans=0.025 2023-10-03 09:23:42,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:43,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:46,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:49,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:53,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:23:54,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:55,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 09:23:55,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 09:23:56,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:56,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 09:23:58,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:23:59,663 INFO [train.py:1046] (1/4) Epoch 35, batch 2000, loss[loss=0.1702, simple_loss=0.2518, pruned_loss=0.04428, over 23728.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2418, pruned_loss=0.04135, over 4707977.69 frames. ], batch size: 85, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:24:02,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:24:02,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:24:03,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:24:04,473 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.44 vs. limit=10.0 2023-10-03 09:24:05,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:24:07,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:10,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. 
Duration: 0.94 2023-10-03 09:24:10,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:24:13,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:24:13,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1217486.6666666667, ans=0.0 2023-10-03 09:24:14,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 09:24:16,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:24:16,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:24:19,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:24:20,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 09:24:22,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:23,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:25,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:25,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. 
Duration: 0.83 2023-10-03 09:24:25,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:24:28,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 09:24:28,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:24:31,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:24:32,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:24:32,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:32,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:24:34,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:24:35,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 09:24:38,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 09:24:38,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:24:38,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:24:41,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1217620.0, ans=0.125 2023-10-03 09:24:41,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1217620.0, ans=0.0 2023-10-03 09:24:42,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:44,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:24:44,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:24:44,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1217620.0, ans=0.0 2023-10-03 09:24:45,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:24:46,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:24:46,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:46,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1217620.0, ans=0.125 2023-10-03 09:24:47,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:24:47,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:24:49,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:52,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:24:54,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 09:24:59,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:25:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:02,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:03,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:25:05,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:25:08,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:09,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:25:09,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 09:25:12,121 INFO [train.py:1046] (1/4) Epoch 35, batch 2050, loss[loss=0.1594, simple_loss=0.214, pruned_loss=0.05246, over 19311.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2405, pruned_loss=0.04139, over 4683867.64 frames. ], batch size: 388, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:25:12,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:13,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:25:15,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:19,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:25:21,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1217753.3333333333, ans=0.09899494936611666 2023-10-03 09:25:22,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:25:24,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:25,765 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.915e+02 2.069e+02 2.253e+02 3.253e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 09:25:25,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 09:25:27,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 09:25:27,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:25:29,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:25:29,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:25:40,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:25:40,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:43,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 09:25:44,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:44,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 09:25:44,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:25:47,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:25:50,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:25:52,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:25:52,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:25:53,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:25:55,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:25:55,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:25:57,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1217953.3333333333, ans=0.125 2023-10-03 09:26:00,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:01,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:26:03,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:26:04,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:26:07,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:26:12,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 09:26:12,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 09:26:13,946 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.84 vs. limit=15.0 2023-10-03 09:26:17,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:26:18,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:26:18,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1218020.0, ans=0.125 2023-10-03 09:26:20,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:26:22,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 09:26:26,576 INFO [train.py:1046] (1/4) Epoch 35, batch 2100, loss[loss=0.1607, simple_loss=0.2262, pruned_loss=0.04756, over 23426.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2398, pruned_loss=0.0411, over 4699629.35 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:26:26,644 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 09:26:26,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:26:28,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:26:28,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:26:29,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:26:29,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 09:26:29,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 09:26:31,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1218086.6666666667, ans=0.0 2023-10-03 09:26:32,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:26:35,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:26:35,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:26:37,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1218086.6666666667, ans=0.1 2023-10-03 09:26:39,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:26:39,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:26:40,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. 
Duration: 0.85 2023-10-03 09:26:42,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:26:42,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 09:26:42,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 09:26:43,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:26:43,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:26:43,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 09:26:44,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 09:26:50,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 09:26:50,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:26:53,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:26:53,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:57,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 09:26:58,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 09:26:58,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:26:58,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 09:26:58,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1218220.0, ans=0.1 2023-10-03 09:27:00,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 09:27:00,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 09:27:01,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 09:27:01,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 09:27:03,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:27:04,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:27:06,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 09:27:07,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:27:08,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:10,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:10,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 09:27:11,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:11,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:11,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:11,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 09:27:14,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 09:27:15,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 09:27:17,780 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.50 vs. limit=15.0 2023-10-03 09:27:19,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 09:27:20,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1218286.6666666667, ans=0.2 2023-10-03 09:27:23,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:27:23,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 09:27:28,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:31,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:27:31,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:27:31,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:27:31,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 09:27:32,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:27:34,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:34,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:27:35,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 09:27:35,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:37,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 09:27:38,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 09:27:38,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:27:40,018 INFO [train.py:1046] (1/4) Epoch 35, batch 2150, loss[loss=0.1452, simple_loss=0.2291, pruned_loss=0.03064, over 24665.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2388, pruned_loss=0.0406, over 4704166.58 frames. ], batch size: 65, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:27:40,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:40,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:27:40,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:27:41,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:27:44,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1218420.0, ans=0.1 2023-10-03 09:27:45,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-03 09:27:47,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:27:48,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:51,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:27:51,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:27:51,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:27:54,160 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.825e+02 1.971e+02 2.203e+02 3.479e+02, threshold=3.943e+02, percent-clipped=0.0 2023-10-03 09:27:55,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:55,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:27:55,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:28:00,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:00,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. 
Duration: 0.99 2023-10-03 09:28:05,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:05,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:28:06,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:06,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:06,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:06,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:28:07,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:28:07,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:28:07,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:28:09,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 09:28:10,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:28:12,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:28:13,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:13,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:28:14,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:28:18,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:28:18,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:28:18,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1218553.3333333333, ans=0.2 2023-10-03 09:28:19,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:19,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 09:28:20,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:28:22,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:23,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:24,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:28:25,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:28:27,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:28,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:28,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 09:28:30,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 09:28:30,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:28:30,465 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 09:28:31,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:31,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:28:33,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 09:28:33,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:28:33,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-10-03 09:28:33,189 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 09:28:33,189 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 09:28:34,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 09:28:36,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:36,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:28:36,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:28:36,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:37,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:28:37,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:37,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:39,452 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:28:47,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 09:28:48,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 09:28:51,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:28:53,566 INFO [train.py:1046] (1/4) Epoch 35, batch 2200, loss[loss=0.1699, simple_loss=0.2451, pruned_loss=0.04733, over 23450.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2388, pruned_loss=0.04074, over 4707422.28 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:28:58,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:59,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:28:59,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:01,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:29:04,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:29:04,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:29:04,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 09:29:08,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-10-03 09:29:10,639 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.56 vs. limit=22.5 2023-10-03 09:29:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:29:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 09:29:17,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:29:18,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:29:18,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:29:23,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:29:23,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 09:29:27,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:29:29,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:29:29,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 09:29:32,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 09:29:35,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:29:35,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1218886.6666666667, ans=0.125 2023-10-03 09:29:36,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:29:36,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:39,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 09:29:39,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:40,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.52 vs. limit=15.0 2023-10-03 09:29:41,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 09:29:42,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:42,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:29:42,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:45,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 09:29:45,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:29:45,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:45,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:46,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:29:48,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:29:49,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:29:52,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:29:54,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:29:57,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:29:58,898 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 09:30:00,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:30:00,398 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. 
Duration: 0.896 2023-10-03 09:30:00,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1219020.0, ans=0.2 2023-10-03 09:30:01,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:30:01,752 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 09:30:03,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:03,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:30:06,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:07,817 INFO [train.py:1046] (1/4) Epoch 35, batch 2250, loss[loss=0.2101, simple_loss=0.2731, pruned_loss=0.07354, over 19261.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2397, pruned_loss=0.04108, over 4709452.88 frames. ], batch size: 389, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:30:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 09:30:10,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:30:11,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:30:17,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 09:30:18,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:30:19,284 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:30:21,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1219153.3333333333, ans=0.125 2023-10-03 09:30:22,205 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.843e+02 1.976e+02 2.228e+02 2.990e+02, threshold=3.951e+02, percent-clipped=0.0 2023-10-03 09:30:23,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:24,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:30:26,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:30:27,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 09:30:27,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:30:28,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:30:29,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 09:30:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:30:29,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:30,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1219153.3333333333, ans=0.125 2023-10-03 09:30:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:30:38,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:30:38,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:30:40,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:30:41,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 09:30:43,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:44,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:30:47,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:30:48,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:30:50,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:30:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:30:52,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1219286.6666666667, ans=0.0 2023-10-03 09:30:53,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:30:53,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:30:56,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:30:59,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:31:04,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:31:04,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:31:05,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:31:12,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:31:15,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:31:15,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. 
Duration: 0.98 2023-10-03 09:31:15,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:17,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:31:18,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 09:31:21,141 INFO [train.py:1046] (1/4) Epoch 35, batch 2300, loss[loss=0.1416, simple_loss=0.2249, pruned_loss=0.02913, over 24327.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2406, pruned_loss=0.04168, over 4702980.68 frames. ], batch size: 56, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:31:21,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:31:22,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:28,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:28,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:31:30,606 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 09:31:31,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:31:38,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:31:38,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:31:38,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:31:39,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:31:39,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 09:31:41,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:31:42,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:31:43,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:31:46,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:31:48,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:31:52,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:31:53,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1219553.3333333333, ans=0.125 2023-10-03 09:31:57,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 09:31:57,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:32:00,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:32:03,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:32:07,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:32:08,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:32:08,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:32:08,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 09:32:09,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1219620.0, ans=0.1 2023-10-03 09:32:10,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1219620.0, ans=0.0 2023-10-03 09:32:13,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:32:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:32:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:32:15,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:32:16,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 09:32:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:32:16,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 09:32:16,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:32:16,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:17,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 09:32:18,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1219620.0, ans=0.0 2023-10-03 09:32:23,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1219686.6666666667, ans=0.125 2023-10-03 09:32:24,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:32:27,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1219686.6666666667, ans=0.1 2023-10-03 09:32:29,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:32:34,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:32:34,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:32:34,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:32:35,435 INFO [train.py:1046] (1/4) Epoch 35, batch 2350, loss[loss=0.1446, simple_loss=0.2223, pruned_loss=0.03344, over 20622.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2414, pruned_loss=0.0422, over 4702046.36 frames. ], batch size: 45, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:32:35,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:32:37,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:32:37,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:32:38,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 09:32:44,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 09:32:44,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 09:32:49,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 09:32:50,632 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.929e+02 2.127e+02 2.368e+02 3.367e+02, threshold=4.254e+02, percent-clipped=0.0 2023-10-03 09:32:52,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:54,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:54,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:56,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:32:56,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:32:57,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 09:33:00,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:33:05,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 09:33:07,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 09:33:10,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:33:10,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:33:12,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:33:15,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 09:33:15,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:33:17,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:33:17,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:33:18,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:33:21,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:33:22,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 09:33:22,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1219953.3333333333, ans=0.125 2023-10-03 09:33:23,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 09:33:25,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:33:25,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:33:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 09:33:28,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:33:31,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 09:33:31,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:33:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 09:33:40,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 09:33:40,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.34 vs. limit=22.5 2023-10-03 09:33:42,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:33:42,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 09:33:42,206 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. 
Duration: 0.924 2023-10-03 09:33:42,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 09:33:43,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 09:33:43,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1220020.0, ans=0.0 2023-10-03 09:33:46,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:33:49,762 INFO [train.py:1046] (1/4) Epoch 35, batch 2400, loss[loss=0.1671, simple_loss=0.2422, pruned_loss=0.04594, over 23594.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2408, pruned_loss=0.04158, over 4709239.45 frames. ], batch size: 149, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:33:49,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:33:55,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:33:56,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:33:56,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 09:33:56,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-10-03 09:34:02,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1220153.3333333333, ans=0.1 2023-10-03 09:34:04,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:34:04,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:34:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 09:34:05,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:34:07,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:07,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 09:34:10,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1220153.3333333333, ans=0.2 2023-10-03 09:34:13,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:14,944 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:34:16,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-03 09:34:16,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1220153.3333333333, ans=0.0 2023-10-03 09:34:20,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:34:25,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 09:34:26,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:34:28,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:32,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:34:32,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 09:34:34,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:34:36,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1220286.6666666667, ans=0.125 2023-10-03 09:34:41,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:42,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:34:44,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:34:45,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.95 vs. limit=8.0 2023-10-03 09:34:45,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:34:45,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:34:46,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:34:46,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:46,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:34:46,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:34:51,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:34:53,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:34:53,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 09:34:53,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 09:34:55,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:34:55,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:55,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 09:34:57,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 09:34:57,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 09:34:57,323 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 09:34:57,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 09:34:57,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1220353.3333333333, ans=0.2 2023-10-03 09:35:00,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:35:01,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:01,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:03,236 INFO [train.py:1046] (1/4) Epoch 35, batch 2450, loss[loss=0.1581, simple_loss=0.2268, pruned_loss=0.0447, over 22721.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2393, pruned_loss=0.04128, over 4702800.21 frames. 
], batch size: 322, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:35:03,290 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 09:35:04,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:04,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:35:08,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:35:09,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:11,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:11,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:12,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 09:35:16,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:35:16,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:35:19,799 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.894e+02 2.090e+02 2.378e+02 3.280e+02, threshold=4.179e+02, percent-clipped=0.0 2023-10-03 09:35:20,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1220486.6666666667, ans=0.04949747468305833 2023-10-03 09:35:21,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:35:21,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:35:21,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:35:21,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 09:35:26,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:27,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:35:29,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:35:32,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:35:34,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 09:35:34,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:36,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:37,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 09:35:39,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:35:46,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:47,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:47,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:35:47,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:35:47,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:49,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:35:50,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 09:35:53,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 09:35:55,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:35:56,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:56,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:36:02,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:36:02,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 09:36:02,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:36:04,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:36:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 09:36:05,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:36:06,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:36:07,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1220686.6666666667, ans=0.125 2023-10-03 09:36:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 09:36:12,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:36:14,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:36:14,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1220686.6666666667, ans=0.95 2023-10-03 09:36:16,952 INFO [train.py:1046] (1/4) Epoch 35, batch 2500, loss[loss=0.1383, simple_loss=0.2196, pruned_loss=0.02849, over 24371.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2384, pruned_loss=0.04075, over 4707523.49 frames. ], batch size: 56, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:36:17,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 09:36:18,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:36:24,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:36:29,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1220753.3333333333, ans=0.125 2023-10-03 09:36:30,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.18 vs. 
limit=10.0 2023-10-03 09:36:31,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1220820.0, ans=0.2 2023-10-03 09:36:33,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:36:33,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:36:34,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:36:34,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 09:36:37,288 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.83 vs. limit=15.0 2023-10-03 09:36:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:36:42,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:36:44,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:36:44,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 09:36:44,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 09:36:45,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:36:46,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:36:46,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 09:36:46,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:36:48,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 09:36:49,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:36:51,685 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:36:52,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:36:52,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:36:55,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:36:55,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 09:36:55,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:36:58,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:37:00,235 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=7.689e-03 2023-10-03 09:37:01,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:06,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:08,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:37:13,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:37:16,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 09:37:16,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:37:16,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:37:20,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:37:20,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:37:20,941 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 09:37:20,942 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. 
Duration: 0.768 2023-10-03 09:37:20,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 09:37:25,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:26,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 09:37:27,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 09:37:27,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:37:29,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 09:37:30,699 INFO [train.py:1046] (1/4) Epoch 35, batch 2550, loss[loss=0.125, simple_loss=0.2085, pruned_loss=0.02079, over 24305.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2384, pruned_loss=0.04061, over 4719320.06 frames. ], batch size: 56, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:37:32,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 09:37:35,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:37:36,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:37:36,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 09:37:38,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:37:39,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 09:37:39,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:37:42,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 09:37:44,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:37:46,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:47,471 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.892e+02 2.118e+02 2.497e+02 3.276e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 09:37:50,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:37:50,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 09:37:50,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:37:50,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:37:51,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:37:53,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:37:53,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 09:37:53,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:37:53,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:53,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 09:37:53,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1221153.3333333333, ans=0.1 2023-10-03 09:38:05,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:38:08,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:09,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:09,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:38:11,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:38:16,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:38:18,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:38:18,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:38:18,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:38:18,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:38:20,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:38:23,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:23,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:25,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1221286.6666666667, ans=0.125 2023-10-03 09:38:25,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=1221286.6666666667, ans=15.0 2023-10-03 09:38:30,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:38:30,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-03 09:38:30,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:38:30,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:31,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:38:32,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:38:33,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:38:40,191 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:38:41,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:38:42,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:38:45,746 INFO [train.py:1046] (1/4) Epoch 35, batch 2600, loss[loss=0.1529, simple_loss=0.2407, pruned_loss=0.03255, over 23976.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2389, pruned_loss=0.04042, over 4722869.94 frames. ], batch size: 86, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:38:45,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 09:38:48,610 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. 
Duration: 0.896 2023-10-03 09:38:48,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:38:49,919 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 09:38:49,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 09:38:50,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 09:38:53,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:53,335 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 09:38:54,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 09:38:56,043 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 09:38:58,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:39:00,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 09:39:02,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 09:39:04,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:39:06,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-10-03 09:39:07,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 09:39:07,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 09:39:15,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:39:15,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:15,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:39:15,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 09:39:17,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:39:23,826 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 09:39:27,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1221553.3333333333, ans=0.0 2023-10-03 09:39:27,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1221553.3333333333, ans=0.125 2023-10-03 09:39:29,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:29,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:39:31,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 09:39:31,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:39:31,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:39:32,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 09:39:34,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:39:34,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:39:34,693 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-03 09:39:35,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1221620.0, ans=0.0 2023-10-03 09:39:36,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:39:36,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1221620.0, ans=0.2 2023-10-03 09:39:41,547 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 09:39:41,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 09:39:41,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:39:47,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:39:47,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:39:47,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 09:39:48,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1221686.6666666667, ans=0.0 2023-10-03 09:39:49,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:51,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:39:52,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:39:54,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.41 vs. limit=15.0 2023-10-03 09:39:58,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 09:39:58,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:40:00,023 INFO [train.py:1046] (1/4) Epoch 35, batch 2650, loss[loss=0.1621, simple_loss=0.2488, pruned_loss=0.03768, over 24649.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2397, pruned_loss=0.0409, over 4720354.82 frames. ], batch size: 65, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:40:00,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:40:00,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1221753.3333333333, ans=0.125 2023-10-03 09:40:01,951 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.15 vs. limit=22.5 2023-10-03 09:40:04,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 09:40:04,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:05,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:40:06,994 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 09:40:07,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:08,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:11,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-03 09:40:13,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:40:14,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:40:15,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.893e+02 2.177e+02 2.429e+02 3.374e+02, threshold=4.354e+02, percent-clipped=0.0 2023-10-03 09:40:15,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 09:40:15,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:40:16,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:40:19,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 09:40:22,565 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 09:40:23,341 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.56 vs. limit=22.5 2023-10-03 09:40:25,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:40:26,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 09:40:28,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:40:28,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 09:40:31,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:31,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:40:31,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:32,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:40:36,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 09:40:36,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 09:40:41,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:40:44,732 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.14 vs. limit=15.0 2023-10-03 09:40:45,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 09:40:45,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:46,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:40:46,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:40:46,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:48,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:40:51,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:51,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:40:51,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1221953.3333333333, ans=0.0 2023-10-03 09:40:54,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:54,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:40:55,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:40:57,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:57,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 09:40:58,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:58,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:41:00,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:41:02,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:03,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:41:03,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:41:03,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 09:41:08,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:41:10,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:11,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:13,655 INFO [train.py:1046] (1/4) Epoch 35, batch 2700, loss[loss=0.1631, simple_loss=0.2454, pruned_loss=0.04037, over 23476.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2408, pruned_loss=0.04142, over 4708575.89 frames. 
], batch size: 93, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:41:13,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:13,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:41:13,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:16,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:41:16,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 09:41:17,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1222086.6666666667, ans=0.1 2023-10-03 09:41:18,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:41:19,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 09:41:22,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:41:22,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:22,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:41:25,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:41:25,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:41:25,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:41:25,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1222086.6666666667, ans=0.125 2023-10-03 09:41:26,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:41:26,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 09:41:28,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:41:29,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:41:29,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:41:29,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:32,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:41:34,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. 
Duration: 0.98 2023-10-03 09:41:34,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:41:40,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:41:40,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:41:43,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1222220.0, ans=0.125 2023-10-03 09:41:44,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.52 vs. limit=15.0 2023-10-03 09:41:45,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:41:45,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:41:45,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:41:45,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:41:49,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:41:51,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:41:52,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:41:52,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:41:52,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1222220.0, ans=0.0 2023-10-03 09:41:55,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:55,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:42:04,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:42:06,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:42:09,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:42:09,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:10,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:42:11,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:42:12,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:42:13,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:15,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:42:17,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:42:19,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:42:21,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:42:21,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:42:22,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1222353.3333333333, ans=0.1 2023-10-03 09:42:23,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 09:42:23,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:27,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 09:42:27,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 09:42:28,558 INFO [train.py:1046] (1/4) Epoch 35, batch 2750, loss[loss=0.1663, simple_loss=0.2533, pruned_loss=0.03971, over 24587.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.241, pruned_loss=0.04169, over 4703372.09 frames. ], batch size: 71, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:42:28,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 09:42:28,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:33,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:33,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:34,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:34,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:42:35,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:38,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:42:38,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-03 09:42:38,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:42:38,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:38,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 09:42:38,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:42:39,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:40,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1222420.0, ans=15.0 2023-10-03 09:42:44,655 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.884e+02 2.035e+02 2.268e+02 3.504e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 09:42:46,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 09:42:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:42:47,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1222486.6666666667, ans=0.1 2023-10-03 09:42:49,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:42:50,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:42:50,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:42:52,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:53,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:42:53,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:53,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:58,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:42:59,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:42:59,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:42:59,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:43:01,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 09:43:04,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1222553.3333333333, ans=0.125 2023-10-03 09:43:08,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:43:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:43:11,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:14,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:43:14,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:43:14,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:43:19,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:43:20,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:43:20,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 09:43:25,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:26,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. 
Duration: 0.67 2023-10-03 09:43:33,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:43:34,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:43:34,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 09:43:36,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:43:38,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:43:38,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 09:43:38,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:43:42,686 INFO [train.py:1046] (1/4) Epoch 35, batch 2800, loss[loss=0.1381, simple_loss=0.1998, pruned_loss=0.03818, over 23450.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2394, pruned_loss=0.0412, over 4685692.48 frames. ], batch size: 285, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:43:42,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 09:43:42,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:43:42,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 09:43:44,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 09:43:44,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:43:44,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:47,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:43:47,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 09:43:47,783 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 09:43:51,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:52,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:43:52,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:43:55,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:43:56,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 09:43:58,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. 
Duration: 0.488875 2023-10-03 09:44:00,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 09:44:01,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:03,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:44:03,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:06,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:07,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:44:08,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:44:15,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.91 vs. limit=15.0 2023-10-03 09:44:17,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:44:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:44:21,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:21,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:44:22,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:24,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1222886.6666666667, ans=0.1 2023-10-03 09:44:28,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:44:28,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 09:44:28,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:44:29,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:29,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:44:30,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1222953.3333333333, ans=0.0 2023-10-03 09:44:33,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 09:44:34,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:34,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1222953.3333333333, ans=0.125 2023-10-03 09:44:37,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:44:40,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:44:40,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:40,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:44:40,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:44:40,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:44:43,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:43,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 09:44:43,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:44:44,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:44:45,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:44:45,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 09:44:46,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:46,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:44:46,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:44:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 09:44:51,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1223020.0, ans=0.07 2023-10-03 09:44:53,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:53,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:44:53,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:44:56,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:44:58,075 INFO [train.py:1046] (1/4) Epoch 35, batch 2850, loss[loss=0.1705, simple_loss=0.2569, pruned_loss=0.04201, over 23416.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2382, pruned_loss=0.04071, over 4694488.92 frames. ], batch size: 93, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:45:00,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:45:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:01,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:45:01,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1223086.6666666667, ans=0.2 2023-10-03 09:45:04,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:06,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:45:07,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:45:07,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 09:45:13,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 09:45:13,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:45:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 09:45:15,864 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.65 vs. limit=15.0 2023-10-03 09:45:16,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.902e+02 2.080e+02 2.462e+02 6.971e+02, threshold=4.161e+02, percent-clipped=1.0 2023-10-03 09:45:16,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:19,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 09:45:20,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 09:45:20,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:33,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:35,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:45:35,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:45:37,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 09:45:37,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:45:37,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:45:40,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:45:40,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 09:45:43,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:45:43,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:45:43,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:44,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:46,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:46,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:47,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:47,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 09:45:50,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:45:51,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:51,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1223286.6666666667, ans=0.125 2023-10-03 09:45:53,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:54,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:45:59,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:46:00,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 09:46:01,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 09:46:01,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1223353.3333333333, ans=0.0 2023-10-03 09:46:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:46:04,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:04,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. 
Duration: 0.56 2023-10-03 09:46:06,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:46:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:06,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:06,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:46:06,516 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 09:46:07,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.22 vs. limit=6.0 2023-10-03 09:46:07,802 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 09:46:07,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:46:09,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:12,867 INFO [train.py:1046] (1/4) Epoch 35, batch 2900, loss[loss=0.1649, simple_loss=0.2449, pruned_loss=0.04244, over 23399.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2376, pruned_loss=0.04073, over 4680873.22 frames. ], batch size: 106, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:46:14,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 09:46:14,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:14,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:46:15,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 09:46:21,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:46:21,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 09:46:21,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 09:46:22,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:46:22,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:46:24,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:46:24,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1223420.0, ans=0.125 2023-10-03 09:46:26,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:46:28,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.78 vs. limit=15.0 2023-10-03 09:46:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:46:29,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:46:29,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1223486.6666666667, ans=0.0 2023-10-03 09:46:33,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:46:33,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 09:46:33,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:46:34,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:34,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1223486.6666666667, ans=0.5 2023-10-03 09:46:35,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 09:46:36,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 09:46:39,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:46:39,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 09:46:39,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:46:43,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:46:43,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:46:45,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:46:45,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:49,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:52,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:46:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 09:46:55,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 09:46:55,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:47:00,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 09:47:01,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 09:47:01,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1223620.0, ans=0.125 2023-10-03 09:47:02,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:47:08,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:47:18,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:47:18,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:47:19,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 09:47:22,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:22,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 09:47:22,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:47:22,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 09:47:22,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1223686.6666666667, ans=0.0 2023-10-03 09:47:26,630 INFO [train.py:1046] (1/4) Epoch 35, batch 2950, loss[loss=0.1928, simple_loss=0.2667, pruned_loss=0.05949, over 19490.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2391, pruned_loss=0.04105, over 4687034.13 frames. ], batch size: 388, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:47:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:47:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 09:47:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:47:31,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:32,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:47:34,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:47:35,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 09:47:36,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 09:47:38,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 09:47:38,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:47:43,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.90 vs. limit=15.0 2023-10-03 09:47:44,986 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.855e+02 2.022e+02 2.286e+02 3.734e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 09:47:46,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:47:48,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:47:49,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:47:49,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:47:52,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:47:52,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:47:54,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:54,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:47:54,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:47:55,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-10-03 09:47:57,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 09:48:02,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 09:48:02,991 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 09:48:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:48:04,424 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 09:48:07,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 09:48:07,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:48:07,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:48:07,076 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 09:48:07,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 09:48:09,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 09:48:09,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:48:09,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:48:13,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:48:14,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:48:14,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:16,296 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 09:48:16,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:48:16,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 09:48:22,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:23,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:48:23,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. 
Duration: 0.57 2023-10-03 09:48:23,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:48:25,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 09:48:28,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:48:28,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:48:28,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:48:31,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:48:32,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:48:32,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:32,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:48:34,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:48:34,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:48:35,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:48:37,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:37,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 09:48:38,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:40,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1224086.6666666667, ans=0.125 2023-10-03 09:48:40,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1224086.6666666667, ans=0.0 2023-10-03 09:48:41,555 INFO [train.py:1046] (1/4) Epoch 35, batch 3000, loss[loss=0.1772, simple_loss=0.2453, pruned_loss=0.05452, over 23898.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2398, pruned_loss=0.04071, over 4705843.83 frames. ], batch size: 195, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:48:41,555 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 09:48:53,197 INFO [train.py:1078] (1/4) Epoch 35, validation: loss=0.3596, simple_loss=0.2732, pruned_loss=0.223, over 1125622.00 frames. 2023-10-03 09:48:53,197 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 09:48:53,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:48:53,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 09:48:53,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1224086.6666666667, ans=0.1 2023-10-03 09:48:56,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 09:48:56,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 09:48:59,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:48:59,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:49:01,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 09:49:01,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:49:05,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1224086.6666666667, ans=0.0 2023-10-03 09:49:08,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1224153.3333333333, ans=0.1 2023-10-03 09:49:09,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:49:19,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:49:27,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. 
Duration: 0.72 2023-10-03 09:49:28,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:49:28,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1224220.0, ans=0.2 2023-10-03 09:49:33,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:49:33,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:49:34,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:49:34,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1224220.0, ans=0.1 2023-10-03 09:49:36,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:49:36,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 09:49:37,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 09:49:38,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:49:38,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:49:40,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 09:49:40,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:49:40,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:40,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:49:40,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1224286.6666666667, ans=0.0 2023-10-03 09:49:45,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:49:47,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:49:47,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:49:48,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:49:51,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 09:49:51,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:49:51,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:49:51,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 09:49:55,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:55,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:58,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 09:49:58,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 09:49:58,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:49:59,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 09:49:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:50:03,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 09:50:07,192 INFO [train.py:1046] (1/4) Epoch 35, batch 3050, loss[loss=0.1638, simple_loss=0.2452, pruned_loss=0.04117, over 23465.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2407, pruned_loss=0.04095, over 4713309.53 frames. ], batch size: 93, lr: 2.91e-03, grad_scale: 4.0 2023-10-03 09:50:07,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:50:08,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-03 09:50:08,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 09:50:08,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 09:50:08,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:50:10,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:50:10,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:50:10,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:50:10,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:11,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:50:13,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1224420.0, ans=0.125 2023-10-03 09:50:14,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 09:50:17,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:50:19,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 09:50:19,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:50:23,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:26,934 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.384e+02 1.939e+02 2.109e+02 2.349e+02 4.315e+02, threshold=4.217e+02, percent-clipped=1.0 2023-10-03 09:50:27,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 09:50:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 09:50:31,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 09:50:31,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:50:34,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:50:34,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1224486.6666666667, ans=0.125 2023-10-03 09:50:38,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:38,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:38,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:50:41,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:50:41,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:50:42,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:50:42,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:50:43,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:45,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:50:46,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:50:48,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 09:50:49,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:49,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 09:50:50,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1224620.0, ans=0.125 2023-10-03 09:50:53,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:50:53,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:50:54,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:50:54,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:50:58,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:51:00,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:04,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:06,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:51:06,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:51:07,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 09:51:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:51:09,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:51:10,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 09:51:11,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:51:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:12,440 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.64 vs. limit=15.0 2023-10-03 09:51:13,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 09:51:14,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:20,346 INFO [train.py:1046] (1/4) Epoch 35, batch 3100, loss[loss=0.1719, simple_loss=0.2306, pruned_loss=0.05662, over 19613.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.24, pruned_loss=0.04108, over 4702908.95 frames. ], batch size: 389, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:51:20,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:22,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 09:51:25,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:51:25,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1224753.3333333333, ans=0.125 2023-10-03 09:51:26,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 09:51:29,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 09:51:29,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1224753.3333333333, ans=0.2 2023-10-03 09:51:30,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 09:51:32,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:51:35,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:51:36,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:38,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:51:38,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1224820.0, ans=0.125 2023-10-03 09:51:41,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:51:41,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1224820.0, ans=0.125 2023-10-03 09:51:41,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.96 vs. limit=10.0 2023-10-03 09:51:46,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 09:51:51,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 09:51:53,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:51:53,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:51:53,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:51:54,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 09:51:56,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:51:56,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 09:51:56,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:51:57,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:51:58,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 09:52:00,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:52:04,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:52:04,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 09:52:07,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 09:52:07,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:07,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:52:09,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:09,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:52:10,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:52:10,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 09:52:13,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:52:13,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:52:13,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:14,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 09:52:17,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:52:17,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.23 vs. limit=15.0 2023-10-03 09:52:18,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 09:52:20,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:52:22,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 09:52:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:24,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:24,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-10-03 09:52:26,616 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=6.0 2023-10-03 09:52:34,873 INFO [train.py:1046] (1/4) Epoch 35, batch 3150, loss[loss=0.161, simple_loss=0.2537, pruned_loss=0.03421, over 24315.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2397, pruned_loss=0.04064, over 4695714.88 frames. ], batch size: 74, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:52:35,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 09:52:37,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:37,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:39,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:52:39,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:52:39,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 09:52:40,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:40,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:52:40,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-10-03 09:52:42,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:43,709 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 09:52:46,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 09:52:46,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:52:46,656 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 09:52:48,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 09:52:49,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 09:52:50,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 09:52:50,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 09:52:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:50,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:52:51,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 09:52:53,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 09:52:54,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.875e+02 2.111e+02 2.412e+02 4.030e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 09:52:57,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:57,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:52:58,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:53:03,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 09:53:04,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:53:04,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1225220.0, ans=0.0 2023-10-03 09:53:06,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:53:06,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:53:06,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-10-03 09:53:08,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 09:53:08,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:53:10,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:53:10,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 09:53:10,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:53:10,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:53:11,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:53:11,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:53:13,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 09:53:13,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:53:14,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:17,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 09:53:17,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:53:19,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 09:53:19,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:22,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 09:53:22,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:22,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 09:53:24,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 09:53:24,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:53:24,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:25,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 09:53:27,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 09:53:27,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:53:29,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:53:31,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:32,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:53:35,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:53:36,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:38,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 09:53:39,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1225353.3333333333, ans=0.125 2023-10-03 09:53:39,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1225353.3333333333, ans=0.125 2023-10-03 09:53:42,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:53:42,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:53:47,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:53:50,255 INFO [train.py:1046] (1/4) Epoch 35, batch 3200, loss[loss=0.1487, simple_loss=0.2273, pruned_loss=0.03511, over 24298.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2381, pruned_loss=0.04029, over 4679283.47 frames. ], batch size: 56, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:53:50,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:53:50,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 09:53:53,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:56,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:54:01,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:54:09,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:54:19,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 09:54:19,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:54:19,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1225553.3333333333, ans=0.0 2023-10-03 09:54:22,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. 
Duration: 0.59 2023-10-03 09:54:22,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:54:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:54:26,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:54:28,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:54:32,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 09:54:33,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 09:54:35,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 09:54:37,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 09:54:39,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:54:45,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:54:45,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:54:45,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:54:45,564 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 09:54:45,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 09:54:45,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1225620.0, ans=0.125 2023-10-03 09:54:50,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:54:52,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 09:54:52,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 09:54:52,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1225686.6666666667, ans=0.125 2023-10-03 09:54:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 09:54:54,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 09:54:56,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:54:59,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-10-03 09:54:59,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-10-03 09:54:59,713 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 09:54:59,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:54:59,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:01,088 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 09:55:04,072 INFO [train.py:1046] (1/4) Epoch 35, batch 3250, loss[loss=0.1511, simple_loss=0.245, pruned_loss=0.02866, over 24689.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2386, pruned_loss=0.04019, over 4707402.49 frames. ], batch size: 73, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:55:05,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:55:09,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:55:15,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:55:15,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 09:55:16,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:55:18,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:55:18,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:55:20,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:55:20,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:55:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:23,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:55:24,721 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.063e+02 2.296e+02 2.650e+02 3.939e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 09:55:24,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:24,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:24,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:24,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:55:29,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:55:30,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:55:32,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:32,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:33,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:35,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:55:35,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:55:37,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1225886.6666666667, ans=0.125 2023-10-03 09:55:39,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 09:55:41,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:55:41,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:55:42,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:55:42,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 09:55:44,882 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.61 vs. limit=22.5 2023-10-03 09:55:48,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:55:54,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:55:56,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:55:56,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 09:55:56,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:55:56,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:55:56,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:55:57,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1225953.3333333333, ans=0.125 2023-10-03 09:56:00,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 09:56:00,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 09:56:00,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 09:56:02,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:02,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:56:02,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:56:03,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:56:04,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1226020.0, ans=0.2 2023-10-03 09:56:08,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:56:08,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:56:09,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 09:56:09,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:12,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:56:12,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 09:56:15,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 09:56:15,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 09:56:18,033 INFO [train.py:1046] (1/4) Epoch 35, batch 3300, loss[loss=0.1751, simple_loss=0.2525, pruned_loss=0.04883, over 23221.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2397, pruned_loss=0.04021, over 4716676.73 frames. ], batch size: 105, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:56:18,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 09:56:19,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 09:56:19,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:22,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:56:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:56:24,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:25,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:56:25,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:56:28,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:56:30,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:56:35,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 09:56:35,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:56:37,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:40,048 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 09:56:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:56:41,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 09:56:42,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:56:42,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:56:43,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1226153.3333333333, ans=0.1 2023-10-03 09:56:44,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. 
Duration: 0.896 2023-10-03 09:56:48,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:48,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:56:50,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.56 vs. limit=15.0 2023-10-03 09:56:50,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:50,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 09:56:52,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 09:56:52,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:53,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:56:56,287 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 09:56:56,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 09:56:58,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:57:01,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. 
Duration: 0.39 2023-10-03 09:57:02,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:57:04,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:57:05,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:57:08,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1226286.6666666667, ans=10.0 2023-10-03 09:57:09,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:09,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:57:09,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:57:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:57:11,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:57:12,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:57:12,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 09:57:12,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1226286.6666666667, ans=0.1 2023-10-03 09:57:13,472 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 09:57:13,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 09:57:16,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:57:17,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:57:17,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:19,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:57:19,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:20,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:57:20,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:20,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:57:21,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 09:57:24,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:57:26,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1226353.3333333333, ans=0.125 2023-10-03 09:57:27,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 09:57:29,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:30,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:31,605 INFO [train.py:1046] (1/4) Epoch 35, batch 3350, loss[loss=0.1593, simple_loss=0.2362, pruned_loss=0.04118, over 23457.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2399, pruned_loss=0.0405, over 4718604.03 frames. ], batch size: 120, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:57:31,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1226420.0, ans=10.0 2023-10-03 09:57:33,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:57:33,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:57:34,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:35,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 09:57:35,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:39,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:57:39,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:41,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:57:43,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:44,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:57:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:47,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:57:48,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 09:57:48,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1226486.6666666667, ans=0.1 2023-10-03 09:57:49,892 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 09:57:49,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 09:57:52,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 09:57:52,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 09:57:53,951 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.867e+02 2.273e+02 2.647e+02 3.558e+02, threshold=4.546e+02, percent-clipped=0.0 2023-10-03 09:57:54,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:57:54,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:57:55,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:57:55,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 09:57:55,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:56,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:57:56,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:00,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 09:58:01,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:58:04,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:08,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:08,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:12,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:58:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:14,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:17,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:19,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 09:58:19,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:58:19,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. 
Duration: 0.87 2023-10-03 09:58:19,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:58:21,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 09:58:22,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:24,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:26,085 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.50 vs. limit=6.0 2023-10-03 09:58:32,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:33,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 09:58:33,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:58:35,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:58:36,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1226686.6666666667, ans=0.07 2023-10-03 09:58:38,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:58:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 09:58:43,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 09:58:45,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:58:46,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:58:47,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:48,891 INFO [train.py:1046] (1/4) Epoch 35, batch 3400, loss[loss=0.1623, simple_loss=0.2376, pruned_loss=0.0435, over 23849.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2408, pruned_loss=0.04129, over 4714315.72 frames. ], batch size: 195, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:58:48,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 09:58:48,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:49,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 09:58:49,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.95 vs. limit=15.0 2023-10-03 09:58:50,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:58:51,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 09:58:51,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:58:53,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:58:53,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 09:58:54,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1226753.3333333333, ans=0.0 2023-10-03 09:58:57,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 09:58:57,320 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 09:58:57,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:01,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:59:01,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:59:01,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:01,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:59:08,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 09:59:08,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 09:59:16,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:59:16,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1226820.0, ans=0.125 2023-10-03 09:59:18,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:19,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:59:20,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:59:27,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:59:30,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 09:59:30,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1226886.6666666667, ans=0.125 2023-10-03 09:59:34,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:36,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-03 09:59:36,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:59:37,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:59:37,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:59:39,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:59:42,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:45,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:59:45,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:59:50,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:59:51,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 09:59:53,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1227020.0, ans=0.1 2023-10-03 09:59:57,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 09:59:57,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1227020.0, ans=0.125 2023-10-03 10:00:02,665 INFO [train.py:1046] (1/4) Epoch 35, batch 3450, loss[loss=0.1527, simple_loss=0.2305, pruned_loss=0.03744, over 24326.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2404, pruned_loss=0.04089, over 4711447.93 frames. ], batch size: 61, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 10:00:02,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 10:00:06,163 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=22.5 2023-10-03 10:00:06,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1227086.6666666667, ans=15.0 2023-10-03 10:00:07,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 10:00:08,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:00:08,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:00:08,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 10:00:10,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:00:13,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1227086.6666666667, ans=0.0 2023-10-03 10:00:15,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:00:17,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1227153.3333333333, ans=0.1 2023-10-03 10:00:18,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:00:18,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:00:19,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:00:19,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:23,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:27,177 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.826e+02 1.965e+02 2.200e+02 4.257e+02, threshold=3.929e+02, percent-clipped=0.0 2023-10-03 10:00:28,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 10:00:31,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. 
Duration: 0.78 2023-10-03 10:00:32,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:00:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:00:34,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:00:40,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 10:00:42,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:00:45,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1227220.0, ans=0.0 2023-10-03 10:00:46,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:00:46,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:00:49,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:00:50,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:00:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 10:00:52,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:00:55,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:56,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:00:59,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 10:01:01,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.57 vs. limit=10.0 2023-10-03 10:01:02,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:01:07,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:01:07,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1227353.3333333333, ans=0.0 2023-10-03 10:01:08,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:12,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:15,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:15,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:01:17,066 INFO [train.py:1046] (1/4) Epoch 35, batch 3500, loss[loss=0.1687, simple_loss=0.2575, pruned_loss=0.03996, over 24629.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.04057, over 4717053.93 frames. ], batch size: 68, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 10:01:17,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:01:19,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:01:21,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:24,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:01:26,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 10:01:27,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:01:29,808 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=15.0 2023-10-03 10:01:30,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:01:33,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:33,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-10-03 10:01:33,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1227486.6666666667, ans=0.0 2023-10-03 10:01:39,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:01:39,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:01:39,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:01:39,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:01:41,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:01:41,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:42,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:01:42,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 10:01:44,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1227486.6666666667, ans=0.5 2023-10-03 10:01:45,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 10:01:47,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:01:50,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:50,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 10:01:51,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:01:53,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:01:54,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:01:55,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:58,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:01:59,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:02:01,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 10:02:01,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 10:02:02,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-10-03 10:02:02,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:02:04,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:05,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:02:05,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:02:10,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:02:10,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:02:14,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:02:16,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 10:02:16,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 10:02:16,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:02:18,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1227686.6666666667, ans=0.125 2023-10-03 10:02:19,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:02:20,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:02:21,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1227686.6666666667, ans=0.0 2023-10-03 10:02:22,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:23,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 10:02:23,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:02:25,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:02:25,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 10:02:28,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 10:02:31,061 INFO [train.py:1046] (1/4) Epoch 35, batch 3550, loss[loss=0.1582, simple_loss=0.2276, pruned_loss=0.04444, over 23479.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2377, pruned_loss=0.04004, over 4711893.61 frames. ], batch size: 285, lr: 2.90e-03, grad_scale: 4.0 2023-10-03 10:02:31,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:32,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:02:32,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:02:32,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:37,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:02:44,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:46,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 10:02:48,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1227820.0, ans=0.0 2023-10-03 10:02:49,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:02:50,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:02:52,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:02:52,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:02:52,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 10:02:56,325 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.952e+02 2.132e+02 2.407e+02 4.209e+02, threshold=4.264e+02, percent-clipped=1.0 2023-10-03 10:02:56,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:02:56,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:02:57,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:57,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:02:59,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:03:03,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:03:05,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:03:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:03:07,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:03:07,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:03:07,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. 
Duration: 0.38 2023-10-03 10:03:07,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:09,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:11,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 10:03:17,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:17,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:03:18,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:19,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 10:03:19,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:03:23,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 10:03:23,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:03:26,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:03:26,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 10:03:29,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 10:03:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:03:32,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1228020.0, ans=0.125 2023-10-03 10:03:35,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:03:35,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 10:03:35,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:39,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:39,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1228020.0, ans=0.125 2023-10-03 10:03:41,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 10:03:43,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1228020.0, ans=0.1 2023-10-03 10:03:45,621 INFO [train.py:1046] (1/4) Epoch 35, batch 3600, loss[loss=0.1507, simple_loss=0.2384, pruned_loss=0.03144, over 24489.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2382, pruned_loss=0.0396, over 4725651.82 frames. 
], batch size: 66, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:03:46,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 10:03:47,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:03:48,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:03:49,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:49,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:52,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:03:55,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:03:57,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:58,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:03:58,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:04:00,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:04:00,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 10:04:02,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:04:03,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:04,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:04:09,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:04:09,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:04:10,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:04:12,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 10:04:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:04:15,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:16,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:04:18,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:04:18,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:04:19,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:04:20,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 10:04:29,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:04:29,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:04:31,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 10:04:35,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:04:36,468 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.27 vs. limit=12.0 2023-10-03 10:04:39,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:43,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:04:43,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1228353.3333333333, ans=0.125 2023-10-03 10:04:43,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1228353.3333333333, ans=0.1 2023-10-03 10:04:49,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:04:49,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:04:49,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 10:04:50,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 10:04:52,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 10:04:53,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:04:53,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:04:56,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 10:04:56,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:04:56,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 10:04:56,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:04:58,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 10:04:58,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 10:04:59,416 INFO [train.py:1046] (1/4) Epoch 35, batch 3650, loss[loss=0.1673, simple_loss=0.2556, pruned_loss=0.03951, over 24639.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2382, pruned_loss=0.0396, over 4724950.80 frames. ], batch size: 68, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:05:00,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:05:02,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 10:05:08,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 10:05:09,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:05:10,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1228420.0, ans=0.1 2023-10-03 10:05:14,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 10:05:14,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. 
Duration: 0.97 2023-10-03 10:05:19,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:05:19,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:05:19,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:05:22,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:05:22,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:05:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 10:05:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:05:24,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:05:24,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 10:05:25,446 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.828e+02 1.992e+02 2.156e+02 3.543e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-03 10:05:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:05:26,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:05:26,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:28,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:05:28,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1228553.3333333333, ans=0.0 2023-10-03 10:05:31,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 10:05:31,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 10:05:31,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:05:34,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 10:05:35,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:05:36,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:05:39,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1228553.3333333333, ans=0.1 2023-10-03 10:05:43,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:05:44,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:05:44,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:05:45,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:05:46,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:05:46,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.69 vs. limit=10.0 2023-10-03 10:05:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:05:52,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:05:52,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:05:52,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:05:54,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:05:55,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:55,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:02,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. 
Duration: 0.896 2023-10-03 10:06:06,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1228686.6666666667, ans=0.1 2023-10-03 10:06:07,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:06:07,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:07,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:06:08,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:10,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:06:11,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:11,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 10:06:11,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:14,645 INFO [train.py:1046] (1/4) Epoch 35, batch 3700, loss[loss=0.1717, simple_loss=0.25, pruned_loss=0.0467, over 23729.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2389, pruned_loss=0.04, over 4736023.46 frames. ], batch size: 232, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:06:14,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-03 10:06:17,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:06:17,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:06:20,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:20,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 10:06:20,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:22,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:06:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:06:26,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:06:28,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:06:29,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:06:31,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:06:31,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:06:32,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:06:33,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1228820.0, ans=0.0 2023-10-03 10:06:35,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:06:36,530 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 10:06:45,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:06:45,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:06:46,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:06:46,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 10:06:48,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:06:49,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:51,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 10:06:53,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:06:54,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:06:56,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:56,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:06:59,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:06:59,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1228953.3333333333, ans=0.125 2023-10-03 10:07:02,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:07:02,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 10:07:03,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:03,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 10:07:07,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:07:07,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:07:12,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:07:13,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 10:07:16,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:07:16,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:07:16,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:07:16,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:20,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:07:21,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 10:07:21,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.60 vs. limit=15.0 2023-10-03 10:07:22,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 10:07:24,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:07:24,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:25,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 10:07:25,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:07:28,305 INFO [train.py:1046] (1/4) Epoch 35, batch 3750, loss[loss=0.1379, simple_loss=0.2255, pruned_loss=0.02519, over 24281.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2404, pruned_loss=0.04053, over 4742554.54 frames. ], batch size: 61, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:07:28,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:07:30,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:07:31,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:07:33,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 10:07:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 10:07:37,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:07:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 10:07:38,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:07:40,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:07:41,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:43,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:07:46,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:48,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:07:48,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:07:52,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:53,851 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.913e+02 2.089e+02 2.337e+02 3.206e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-03 10:07:55,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:07:55,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 10:07:56,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:07:58,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:07:58,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:58,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.84 vs. limit=15.0 2023-10-03 10:08:01,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 10:08:04,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 10:08:05,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:08:05,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:08:07,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:08:11,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:14,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 10:08:17,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 10:08:20,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:24,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 10:08:25,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:08:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:08:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:08:34,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:08:35,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:08:37,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:08:39,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:08:42,698 INFO [train.py:1046] (1/4) Epoch 35, batch 3800, loss[loss=0.1832, simple_loss=0.2464, pruned_loss=0.05996, over 23758.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2403, pruned_loss=0.04084, over 4731100.90 frames. ], batch size: 179, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:08:46,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:08:49,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:08:49,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-03 10:08:49,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1229420.0, ans=0.125 2023-10-03 10:08:51,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 10:08:51,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1229420.0, ans=0.0 2023-10-03 10:08:51,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1229420.0, ans=0.2 2023-10-03 10:08:53,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:55,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:08:56,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:08:59,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 10:08:59,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:01,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:09:02,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:09:02,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 10:09:03,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:04,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 10:09:08,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 10:09:08,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:09:10,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:09:12,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1229553.3333333333, ans=0.1 2023-10-03 10:09:13,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:09:13,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:09:15,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:09:15,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:18,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:09:18,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1229553.3333333333, ans=0.125 2023-10-03 10:09:20,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:20,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1229553.3333333333, ans=0.125 2023-10-03 10:09:24,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 10:09:24,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 10:09:27,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:09:28,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1229620.0, ans=0.125 2023-10-03 10:09:34,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:09:39,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:09:40,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 10:09:42,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. 
Duration: 0.84 2023-10-03 10:09:42,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:09:45,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:09:45,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:46,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 10:09:49,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 10:09:49,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 10:09:49,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:51,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:09:57,217 INFO [train.py:1046] (1/4) Epoch 35, batch 3850, loss[loss=0.1432, simple_loss=0.203, pruned_loss=0.04166, over 22709.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2392, pruned_loss=0.04074, over 4727545.34 frames. ], batch size: 322, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:09:57,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:09:58,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 10:10:02,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1229753.3333333333, ans=0.125 2023-10-03 10:10:03,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:10:03,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 10:10:04,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:10:06,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:10:10,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:10:10,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1229820.0, ans=0.125 2023-10-03 10:10:12,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:10:14,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:10:15,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-03 10:10:22,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.944e+02 2.219e+02 2.451e+02 3.928e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-03 10:10:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:23,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:10:25,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:10:26,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:10:29,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:29,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:10:30,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.91 vs. limit=22.5 2023-10-03 10:10:30,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:10:30,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 10:10:31,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1229886.6666666667, ans=0.0 2023-10-03 10:10:31,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 10:10:32,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:10:32,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:10:33,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.65 vs. limit=6.0 2023-10-03 10:10:33,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:33,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:10:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 10:10:35,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 10:10:35,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:10:35,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:10:38,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:39,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:39,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 10:10:42,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 10:10:43,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:45,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1229953.3333333333, ans=0.5 2023-10-03 10:10:46,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 10:10:47,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:10:53,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:54,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:56,169 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:10:58,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:10:58,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 10:11:01,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 10:11:04,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:04,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:06,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:11:07,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:11:07,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:07,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:07,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:11:07,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 10:11:08,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:11:09,713 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-10-03 10:11:10,221 INFO [train.py:1046] (1/4) Epoch 35, batch 3900, loss[loss=0.1512, simple_loss=0.2387, pruned_loss=0.03189, over 24651.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2385, pruned_loss=0.04047, over 4725962.78 frames. ], batch size: 73, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:11:10,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 10:11:10,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:10,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:11,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:11:13,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:13,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:11:14,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:14,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:11:15,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:11:15,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 10:11:15,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:19,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:11:19,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:11:19,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:11:22,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:11:23,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:11:23,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:25,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:11:27,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 10:11:27,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:11:30,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-10-03 10:11:31,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:32,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1230153.3333333333, ans=15.0 2023-10-03 10:11:32,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 10:11:34,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 10:11:38,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:11:40,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:11:40,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:11:41,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:11:45,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:11:46,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:11:49,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:11:49,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:11:51,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:11:53,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1230286.6666666667, ans=0.125 2023-10-03 10:11:56,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:11:56,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:12:00,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:12:02,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:12:05,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1230286.6666666667, ans=0.125 2023-10-03 10:12:08,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1230353.3333333333, ans=0.035 2023-10-03 10:12:12,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:12:16,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:12:16,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. 
Duration: 0.72 2023-10-03 10:12:16,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 10:12:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:12:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 10:12:20,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:12:20,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 10:12:24,443 INFO [train.py:1046] (1/4) Epoch 35, batch 3950, loss[loss=0.1524, simple_loss=0.2404, pruned_loss=0.03224, over 24480.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2378, pruned_loss=0.04007, over 4698759.67 frames. ], batch size: 69, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:12:26,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1230420.0, ans=0.0 2023-10-03 10:12:26,648 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.59 vs. limit=15.0 2023-10-03 10:12:28,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:12:29,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 10:12:30,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 10:12:34,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:12:35,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:12:40,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 10:12:41,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:12:41,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 10:12:41,729 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 10:12:42,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:12:45,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:12:45,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:12:45,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:12:48,446 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.899e+02 2.029e+02 2.399e+02 3.247e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-03 10:12:48,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. 
Duration: 0.95 2023-10-03 10:12:49,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:12:51,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:12:51,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:12:52,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:12:53,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:12:56,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1230553.3333333333, ans=0.125 2023-10-03 10:12:59,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1230553.3333333333, ans=0.125 2023-10-03 10:13:03,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:13:03,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:13:03,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1230553.3333333333, ans=0.0 2023-10-03 10:13:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-10-03 10:13:12,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 10:13:12,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 10:13:13,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:13:13,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:13:21,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:13:21,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:13:21,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:13:23,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:13:23,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 10:13:26,359 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.54 vs. limit=15.0 2023-10-03 10:13:27,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:13:28,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 10:13:32,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 10:13:34,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1230686.6666666667, ans=0.07 2023-10-03 10:13:37,215 INFO [train.py:1046] (1/4) Epoch 35, batch 4000, loss[loss=0.1667, simple_loss=0.2416, pruned_loss=0.04595, over 23566.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2384, pruned_loss=0.04043, over 4696259.52 frames. ], batch size: 256, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:13:40,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:48,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:50,752 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.77 vs. limit=15.0 2023-10-03 10:13:54,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:13:54,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:13:54,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 10:13:56,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 10:13:56,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 10:13:56,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:13:56,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 10:14:00,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:04,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:14:04,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:14:04,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:14:05,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:14:05,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:14:06,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:14:10,247 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 10:14:10,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 10:14:11,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:14,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 10:14:15,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:14:15,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:14:22,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 10:14:22,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:14:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:14:25,717 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 10:14:27,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:14:27,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 10:14:27,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:14:28,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:14:30,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:14:32,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:14:32,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:14:32,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:14:35,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 10:14:35,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:36,816 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 10:14:38,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1231020.0, ans=0.0 2023-10-03 10:14:41,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:14:44,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 10:14:47,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:14:47,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:14:47,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:14:48,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:14:50,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1231086.6666666667, ans=0.125 2023-10-03 10:14:51,152 INFO [train.py:1046] (1/4) Epoch 35, batch 4050, loss[loss=0.1654, simple_loss=0.2341, pruned_loss=0.04836, over 22762.00 frames. ], tot_loss[loss=0.16, simple_loss=0.239, pruned_loss=0.04043, over 4702022.89 frames. ], batch size: 322, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:14:53,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:55,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:14:56,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 10:14:58,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:14:58,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:00,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:15:01,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 10:15:03,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:15:05,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:15:08,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:15:09,042 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=15.0 2023-10-03 10:15:09,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 10:15:11,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.72 vs. limit=6.0 2023-10-03 10:15:12,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:15:12,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:15:17,182 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.812e+02 1.989e+02 2.164e+02 3.056e+02, threshold=3.978e+02, percent-clipped=0.0 2023-10-03 10:15:17,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:15:18,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 10:15:20,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 10:15:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 10:15:21,653 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 10:15:22,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:15:29,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 10:15:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:15:33,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:15:37,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:15:37,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:41,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:15:44,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. 
Duration: 0.56 2023-10-03 10:15:44,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:15:46,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1231286.6666666667, ans=0.0 2023-10-03 10:15:47,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:15:47,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 10:15:52,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:15:58,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 10:15:58,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:15:58,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:16:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 10:16:03,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 10:16:03,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:16:05,204 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:16:06,182 INFO [train.py:1046] (1/4) Epoch 35, batch 4100, loss[loss=0.1696, simple_loss=0.2414, pruned_loss=0.04888, over 23738.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2396, pruned_loss=0.04063, over 4704905.51 frames. ], batch size: 179, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:16:06,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:16:09,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:09,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:16:16,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 10:16:17,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 10:16:18,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1231420.0, ans=0.0 2023-10-03 10:16:19,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 10:16:19,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 10:16:19,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:16:19,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=12.0 2023-10-03 10:16:20,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:20,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:16:20,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1231486.6666666667, ans=0.0 2023-10-03 10:16:21,971 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 10:16:24,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:16:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:16:26,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:26,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:16:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:16:32,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:16:33,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:16:33,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 10:16:34,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:34,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:16:34,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:16:34,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:16:34,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 10:16:37,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:16:39,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 10:16:40,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:16:42,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:16:42,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. 
Duration: 0.79 2023-10-03 10:16:43,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:16:45,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:16:45,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:16:46,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 10:16:48,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:16:50,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:16:52,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 10:16:52,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:52,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:16:55,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:16:55,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1231620.0, ans=0.125 2023-10-03 10:17:00,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:17:02,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:17:03,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:17:08,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:09,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:17:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:17:12,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1231686.6666666667, ans=0.1 2023-10-03 10:17:15,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:17:18,754 INFO [train.py:1046] (1/4) Epoch 35, batch 4150, loss[loss=0.1535, simple_loss=0.2445, pruned_loss=0.0312, over 24567.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2401, pruned_loss=0.04057, over 4714950.84 frames. ], batch size: 71, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:17:20,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:17:21,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:17:22,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:17:22,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:17:25,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 10:17:26,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:26,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 10:17:28,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 10:17:28,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 10:17:30,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:34,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:17:34,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:36,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1231820.0, ans=0.125 2023-10-03 10:17:39,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:17:39,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:17:40,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:17:40,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:17:42,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:17:43,363 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.871e+02 2.078e+02 2.359e+02 3.570e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 10:17:43,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:17:48,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:52,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:17:52,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.51 vs. limit=15.0 2023-10-03 10:17:53,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 10:17:54,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 10:17:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 10:17:57,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 10:17:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:17:57,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:17:58,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:17:59,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:18:01,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1231953.3333333333, ans=0.1 2023-10-03 10:18:01,480 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.01 vs. limit=15.0 2023-10-03 10:18:03,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 10:18:06,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:18:07,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:07,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. 
Duration: 0.76 2023-10-03 10:18:08,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:18:10,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 10:18:10,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1231953.3333333333, ans=0.125 2023-10-03 10:18:11,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.69 vs. limit=10.0 2023-10-03 10:18:13,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:18:15,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:18:16,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:17,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 10:18:17,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:17,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:18:19,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:18:20,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. 
Duration: 0.98 2023-10-03 10:18:21,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1232020.0, ans=0.1 2023-10-03 10:18:22,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:22,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:18:22,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:18:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 10:18:23,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:18:23,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:18:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:18:26,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:27,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 10:18:27,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:18:32,330 INFO [train.py:1046] (1/4) Epoch 35, batch 4200, loss[loss=0.1474, simple_loss=0.2261, pruned_loss=0.03439, over 21901.00 frames. 
], tot_loss[loss=0.16, simple_loss=0.2388, pruned_loss=0.04064, over 4689650.36 frames. ], batch size: 48, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:18:32,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:18:33,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 10:18:34,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.26 vs. limit=15.0 2023-10-03 10:18:35,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:18:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:18:39,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:18:39,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:18:39,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:18:41,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 10:18:44,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 10:18:45,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:18:48,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:48,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1232153.3333333333, ans=0.125 2023-10-03 10:18:50,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:18:52,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1232153.3333333333, ans=0.05 2023-10-03 10:18:53,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:18:55,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:18:55,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:55,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 10:18:55,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:57,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:57,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:18:57,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:18:58,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1232153.3333333333, ans=0.2 2023-10-03 10:18:59,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:19:01,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 10:19:01,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:19:06,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:19:07,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:19:07,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1232220.0, ans=0.125 2023-10-03 10:19:10,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:19:10,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:19:11,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:19:11,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-10-03 10:19:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:19:14,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:19:14,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1232220.0, ans=0.125 2023-10-03 10:19:19,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:19:20,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:19:22,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.89 vs. limit=15.0 2023-10-03 10:19:27,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:19:30,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 10:19:30,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1232353.3333333333, ans=0.1 2023-10-03 10:19:31,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:19:32,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.32 vs. 
limit=15.0 2023-10-03 10:19:33,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1232353.3333333333, ans=0.0 2023-10-03 10:19:34,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.17 vs. limit=15.0 2023-10-03 10:19:36,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:19:36,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:19:38,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 10:19:43,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:19:46,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1232420.0, ans=0.125 2023-10-03 10:19:47,742 INFO [train.py:1046] (1/4) Epoch 35, batch 4250, loss[loss=0.1599, simple_loss=0.2504, pruned_loss=0.03471, over 24649.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.238, pruned_loss=0.04014, over 4702151.25 frames. ], batch size: 73, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:19:49,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:19:49,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:19:51,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:19:56,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:19:57,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 10:19:57,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:20:00,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:05,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:20:09,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:09,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:11,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:20:12,891 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.847e+02 2.096e+02 2.391e+02 3.957e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-03 10:20:12,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:20:13,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:20:14,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:14,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:14,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1232486.6666666667, ans=0.1 2023-10-03 10:20:17,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:20:19,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:19,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 10:20:23,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 10:20:23,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:24,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:20:24,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:25,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:20:25,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:20:25,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:20:31,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:20:35,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:20:37,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:39,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 10:20:39,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:20:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 10:20:40,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:20:40,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1232620.0, ans=0.0 2023-10-03 10:20:41,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:20:43,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:20:43,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:20:46,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 10:20:49,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:20:49,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:20:50,135 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:20:53,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:55,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:58,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:20:58,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:20:59,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:20:59,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:21:01,288 INFO [train.py:1046] (1/4) Epoch 35, batch 4300, loss[loss=0.1508, simple_loss=0.2425, pruned_loss=0.02954, over 24635.00 frames. 
], tot_loss[loss=0.1588, simple_loss=0.2373, pruned_loss=0.0401, over 4699684.79 frames. ], batch size: 68, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:21:01,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:21:01,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 10:21:04,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:21:10,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:21:10,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:21:13,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:21:19,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:21:19,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 10:21:21,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:21:21,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1232820.0, ans=0.0 2023-10-03 10:21:22,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 10:21:23,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:21:23,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 10:21:26,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:21:28,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:21:30,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.52 vs. limit=15.0 2023-10-03 10:21:31,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 10:21:31,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:21:31,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 10:21:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:21:35,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:21:38,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:21:38,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 10:21:40,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:21:41,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:21:42,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:21:42,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 10:21:44,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 10:21:46,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:21:49,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:21:49,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:21:49,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1232953.3333333333, ans=0.2 2023-10-03 10:21:50,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:21:50,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:21:50,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. 
Duration: 0.75 2023-10-03 10:21:50,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 10:21:50,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 10:21:52,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:21:52,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 10:21:52,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 10:21:56,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:21:56,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1232953.3333333333, ans=0.125 2023-10-03 10:21:57,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 10:21:58,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:22:00,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:00,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1233020.0, ans=0.0 2023-10-03 10:22:01,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:22:03,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 10:22:05,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:22:05,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:06,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:22:06,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:22:07,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:22:11,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:22:12,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:13,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:13,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:22:15,226 INFO [train.py:1046] (1/4) Epoch 35, batch 4350, loss[loss=0.1698, simple_loss=0.2372, pruned_loss=0.05118, over 23775.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2385, pruned_loss=0.04022, over 4712717.97 frames. 
], batch size: 164, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:22:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 10:22:19,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:22:24,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:22:25,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:28,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:22:28,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:22:34,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:22:37,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:40,393 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.912e+02 2.047e+02 2.309e+02 3.251e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 10:22:40,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:22:40,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:22:43,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:22:45,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:22:46,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:22:51,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 10:22:51,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:22:52,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:57,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:59,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 10:23:03,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:04,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:23:08,732 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 10:23:12,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:23:12,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:23:12,441 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:23:13,498 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 10:23:13,572 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 10:23:13,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:23:13,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:14,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:23:16,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:17,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:23:17,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:23:19,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1233353.3333333333, ans=0.125 2023-10-03 10:23:20,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. 
Duration: 0.94 2023-10-03 10:23:20,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:20,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:20,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:22,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 10:23:23,470 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 10:23:23,474 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 10:23:23,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 10:23:24,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:23:26,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:23:26,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:23:28,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:23:29,410 INFO [train.py:1046] (1/4) Epoch 35, batch 4400, loss[loss=0.1713, simple_loss=0.248, pruned_loss=0.04727, over 23115.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2398, pruned_loss=0.04073, over 4703336.60 frames. 
], batch size: 105, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:23:29,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 10:23:30,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 10:23:30,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:36,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:23:36,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:38,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:39,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 10:23:39,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 10:23:39,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 10:23:41,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 10:23:42,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:23:42,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:23:45,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 10:23:46,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1233486.6666666667, ans=0.125 2023-10-03 10:23:48,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:48,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:48,746 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 10:23:51,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:23:51,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 10:23:52,954 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 10:23:56,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 10:23:56,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 10:23:56,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 10:23:56,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:23:57,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:59,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:59,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1233553.3333333333, ans=0.125 2023-10-03 10:24:01,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:24:02,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 10:24:02,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1233553.3333333333, ans=0.125 2023-10-03 10:24:03,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 10:24:04,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:24:05,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:24:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:24:06,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:24:06,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:24:06,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 10:24:08,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 10:24:10,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1233553.3333333333, ans=0.0 2023-10-03 10:24:11,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:18,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:24:21,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 10:24:25,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:24:27,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:24:30,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:24:31,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 10:24:31,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 10:24:31,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:24:31,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:24:31,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:24:36,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 10:24:38,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 10:24:39,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 10:24:39,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:24:39,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 10:24:41,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:24:44,091 INFO [train.py:1046] (1/4) Epoch 35, batch 4450, loss[loss=0.1667, simple_loss=0.2388, pruned_loss=0.0473, over 23611.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2408, pruned_loss=0.04116, over 4705324.27 frames. ], batch size: 149, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:24:46,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 10:24:47,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 10:24:50,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:24:53,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:53,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:24:54,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1233753.3333333333, ans=0.125 2023-10-03 10:24:56,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:24:56,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:24:59,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:01,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:25:04,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:25:05,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:25:05,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-10-03 10:25:05,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:25:07,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:07,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:25:07,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:25:10,452 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.919e+02 2.096e+02 2.482e+02 3.695e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-03 10:25:10,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:25:14,090 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0 2023-10-03 10:25:15,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:15,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:16,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:25:16,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:25:18,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:25:24,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 10:25:25,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 10:25:25,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 10:25:25,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:25:27,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:25:28,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 10:25:31,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:25:35,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:35,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 10:25:35,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:35,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:25:35,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:25:35,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:25:37,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:41,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:25:41,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1234020.0, ans=0.125 2023-10-03 10:25:43,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 10:25:44,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:25:46,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1234020.0, ans=0.0 2023-10-03 10:25:47,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:25:47,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:25:49,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:49,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-03 10:25:51,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:25:53,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 10:25:55,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:25:57,833 INFO [train.py:1046] (1/4) Epoch 35, batch 4500, loss[loss=0.1718, simple_loss=0.2641, pruned_loss=0.03971, over 24441.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2419, pruned_loss=0.04172, over 4692693.64 frames. ], batch size: 69, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:25:59,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:26:00,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 10:26:00,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 10:26:01,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:26:05,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:26:05,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:26:06,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.95 vs. 
limit=10.0 2023-10-03 10:26:06,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:26:08,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:26:08,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:08,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:26:21,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:26:23,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:26:25,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1234153.3333333333, ans=0.0 2023-10-03 10:26:26,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:26:27,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:26:32,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 10:26:34,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1234220.0, ans=0.0 2023-10-03 10:26:37,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:26:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:26:45,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:26:45,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 10:26:45,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:26:47,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:26:49,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:26:49,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:26:50,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1234286.6666666667, ans=10.0 2023-10-03 10:26:52,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:52,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. 
Duration: 0.69 2023-10-03 10:26:52,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:26:52,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:26:52,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1234286.6666666667, ans=0.0 2023-10-03 10:26:58,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:26:58,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:26:59,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1234353.3333333333, ans=0.025 2023-10-03 10:27:00,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:02,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:27:04,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:27:04,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 10:27:06,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 10:27:06,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. 
Duration: 0.8 2023-10-03 10:27:09,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 10:27:12,139 INFO [train.py:1046] (1/4) Epoch 35, batch 4550, loss[loss=0.1522, simple_loss=0.2054, pruned_loss=0.04952, over 19452.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2409, pruned_loss=0.0412, over 4709333.96 frames. ], batch size: 388, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:27:12,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 10:27:12,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:27:16,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:27:17,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:27:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:27:22,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1234420.0, ans=0.0 2023-10-03 10:27:23,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:27:25,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:27:27,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 10:27:27,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:27:27,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:29,077 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.88 vs. limit=15.0 2023-10-03 10:27:29,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:27:31,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:27:34,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:27:37,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 10:27:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 10:27:38,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:27:39,284 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.927e+02 2.055e+02 2.353e+02 3.694e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 10:27:39,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-10-03 10:27:41,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1234553.3333333333, ans=0.125 2023-10-03 10:27:42,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 10:27:43,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:27:47,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 10:27:49,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:27:53,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:53,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:53,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:27:56,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 10:27:59,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:28:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:01,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:28:02,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:28:04,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 10:28:04,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 10:28:05,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:28:07,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 10:28:10,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 10:28:10,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:28:10,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:10,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:28:12,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:12,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:28:13,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 10:28:14,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 10:28:16,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1234686.6666666667, ans=6.0 2023-10-03 10:28:17,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:28:17,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 10:28:17,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 10:28:17,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:28:18,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 10:28:21,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:28:21,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:28:23,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:28:23,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:24,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 10:28:26,309 INFO [train.py:1046] (1/4) Epoch 35, batch 4600, loss[loss=0.1459, simple_loss=0.2235, pruned_loss=0.03416, over 23685.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2392, pruned_loss=0.04049, over 4717139.98 frames. ], batch size: 149, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:28:26,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:28:29,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:28:30,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:32,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:28:35,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:28:35,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:28:36,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:28:37,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 10:28:39,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:28:43,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 10:28:44,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:28:46,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:51,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 10:28:53,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:56,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:58,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:28:58,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:29:03,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 10:29:03,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:29:03,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:29:09,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:09,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 10:29:11,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:29:14,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 10:29:15,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:29:15,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1234953.3333333333, ans=0.125 2023-10-03 10:29:21,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:21,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:29:24,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:24,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 10:29:24,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:25,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 10:29:25,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:29:26,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1235020.0, ans=0.125 2023-10-03 10:29:27,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:28,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:28,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:29:30,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:31,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 10:29:32,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 10:29:32,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 10:29:33,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:34,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:29:35,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:36,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:29:37,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1235020.0, ans=0.0 2023-10-03 10:29:40,759 INFO [train.py:1046] (1/4) Epoch 35, batch 4650, loss[loss=0.1612, simple_loss=0.2331, pruned_loss=0.04469, over 23529.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04006, over 4730271.18 frames. ], batch size: 256, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:29:45,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:29:48,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:29:48,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:49,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:29:49,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:49,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:29:49,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1235086.6666666667, ans=0.09899494936611666 2023-10-03 10:29:51,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:55,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-10-03 10:29:58,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:30:01,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 10:30:01,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:30:02,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 10:30:02,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:30:03,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 10:30:03,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 10:30:03,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:04,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:30:07,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:30:08,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:08,845 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-10-03 10:30:10,230 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.852e+02 2.049e+02 2.338e+02 3.401e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-03 10:30:13,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:14,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 10:30:18,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:18,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:30:18,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 10:30:19,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:30:22,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:30:27,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:30:27,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1235286.6666666667, ans=0.2 2023-10-03 10:30:31,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:32,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:30:34,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:34,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:30:37,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 10:30:37,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 10:30:38,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 10:30:38,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 10:30:40,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:30:46,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:30:46,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:30:46,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 10:30:46,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:30:47,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.29 vs. 
limit=15.0 2023-10-03 10:30:47,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:30:47,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:30:49,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:30:52,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:30:52,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:30:52,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:55,120 INFO [train.py:1046] (1/4) Epoch 35, batch 4700, loss[loss=0.1877, simple_loss=0.2539, pruned_loss=0.06076, over 19631.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2393, pruned_loss=0.04006, over 4723344.86 frames. ], batch size: 388, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:30:57,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:30:58,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:30:58,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:30:58,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. 
Duration: 0.411125 2023-10-03 10:30:59,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:31:01,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 10:31:03,266 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.50 vs. limit=15.0 2023-10-03 10:31:08,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:09,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1235486.6666666667, ans=0.1 2023-10-03 10:31:10,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:31:10,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:31:11,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:31:13,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:31:16,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 10:31:17,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-10-03 10:31:19,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:21,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:31:21,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:31:24,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:27,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1235553.3333333333, ans=0.0 2023-10-03 10:31:31,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:31:32,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:31:34,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:31:40,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 10:31:41,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:31:42,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1235620.0, ans=0.125 2023-10-03 10:31:44,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:31:45,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 10:31:47,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:31:51,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:31:52,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 10:31:53,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:31:53,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:31:57,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:31:58,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 10:31:59,917 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 10:32:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:32:01,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:32:01,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:01,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 10:32:02,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:06,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1235686.6666666667, ans=0.125 2023-10-03 10:32:07,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 10:32:10,142 INFO [train.py:1046] (1/4) Epoch 35, batch 4750, loss[loss=0.1714, simple_loss=0.2445, pruned_loss=0.04914, over 23520.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.24, pruned_loss=0.0408, over 4713196.13 frames. ], batch size: 285, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:32:10,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:32:10,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:15,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:15,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:32:19,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-10-03 10:32:20,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:32:23,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 10:32:24,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.29 vs. limit=22.5 2023-10-03 10:32:25,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:32:25,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:32:25,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:32:30,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 10:32:32,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1235820.0, ans=0.95 2023-10-03 10:32:35,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:32:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 10:32:37,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:32:39,854 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.852e+02 2.146e+02 2.471e+02 3.124e+02, threshold=4.292e+02, percent-clipped=0.0 2023-10-03 10:32:41,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:32:41,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:32:41,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:44,133 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 10:32:44,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 10:32:50,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 10:32:51,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:32:53,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:32:55,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:32:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 10:32:55,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:32:56,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1235953.3333333333, ans=0.125 2023-10-03 10:32:59,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:33:02,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:33:03,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 10:33:03,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 10:33:03,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1235953.3333333333, ans=0.1 2023-10-03 10:33:05,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:33:05,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:33:06,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:06,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:33:06,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. 
Duration: 0.71 2023-10-03 10:33:09,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 10:33:11,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1236020.0, ans=0.2 2023-10-03 10:33:13,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:15,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:33:15,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 10:33:15,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:33:17,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:18,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:33:18,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:20,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:33:21,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:33:21,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. 
Duration: 0.88 2023-10-03 10:33:24,456 INFO [train.py:1046] (1/4) Epoch 35, batch 4800, loss[loss=0.159, simple_loss=0.2504, pruned_loss=0.03375, over 24418.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2401, pruned_loss=0.04062, over 4722734.85 frames. ], batch size: 69, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:33:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 10:33:24,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 10:33:27,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:33:28,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:33:29,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 10:33:34,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:34,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:39,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:33:39,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:39,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:33:41,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 10:33:41,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:33:41,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:33:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:33:45,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1236153.3333333333, ans=0.125 2023-10-03 10:33:47,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:33:50,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:50,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:33:50,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:51,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 10:33:51,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:33:53,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:55,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:56,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1236220.0, ans=0.1 2023-10-03 10:33:59,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:59,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1236220.0, ans=0.125 2023-10-03 10:34:00,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:34:00,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:34:03,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:34:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:05,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 10:34:07,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 10:34:07,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:34:08,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:34:08,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:34:08,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:34:08,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:34:10,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:34:11,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:34:14,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:34:17,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:17,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:22,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 10:34:23,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:34:23,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:34:25,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:34:25,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:29,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:34:29,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:34:29,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:31,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:34:31,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:34:33,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:34:35,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:35,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:35,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:34:37,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. 
Duration: 0.94 2023-10-03 10:34:38,452 INFO [train.py:1046] (1/4) Epoch 35, batch 4850, loss[loss=0.1434, simple_loss=0.2195, pruned_loss=0.0337, over 24570.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.04074, over 4728712.44 frames. ], batch size: 60, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:34:40,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 10:34:40,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:34:40,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:34:41,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:34:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:42,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:51,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 10:34:53,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:56,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:34:57,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 10:34:57,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:01,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:35:02,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:35:03,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:35:03,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 10:35:06,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:35:07,875 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.944e+02 2.127e+02 2.497e+02 3.827e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 10:35:08,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:35:09,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:35:09,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:35:09,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 10:35:12,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 10:35:12,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:18,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 10:35:18,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 10:35:19,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:35:27,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:35:27,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 10:35:28,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:35:28,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:35:29,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.41 vs. limit=15.0 2023-10-03 10:35:32,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 10:35:32,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1236620.0, ans=0.0 2023-10-03 10:35:33,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 10:35:33,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:33,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 10:35:33,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:35,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:35:36,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 10:35:44,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:49,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:35:49,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:35:49,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1236686.6666666667, ans=0.1 2023-10-03 10:35:52,514 INFO [train.py:1046] (1/4) Epoch 35, batch 4900, loss[loss=0.1355, simple_loss=0.211, pruned_loss=0.03003, over 24453.00 frames. 
], tot_loss[loss=0.1602, simple_loss=0.2398, pruned_loss=0.04026, over 4726437.44 frames. ], batch size: 58, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:35:54,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1236753.3333333333, ans=0.1 2023-10-03 10:35:55,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 10:35:55,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:35:55,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1236753.3333333333, ans=0.0 2023-10-03 10:35:58,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:35:59,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:59,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:36:03,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 10:36:06,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 10:36:07,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1236820.0, ans=22.5 2023-10-03 10:36:11,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. 
Duration: 0.93 2023-10-03 10:36:12,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 10:36:12,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:36:12,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:36:13,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:36:13,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:36:13,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:36:14,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 10:36:17,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 10:36:19,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:36:19,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:36:20,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:36:22,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 10:36:23,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:36:25,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:36:25,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 10:36:26,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1236886.6666666667, ans=0.09899494936611666 2023-10-03 10:36:27,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:36:29,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:36:30,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 10:36:30,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 10:36:33,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 10:36:36,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:36:36,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:36:36,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 10:36:37,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:36:38,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 10:36:38,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:36:38,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 10:36:40,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:36:42,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:36:44,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:36:47,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 10:36:47,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:36:48,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 10:36:49,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-10-03 10:36:55,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1237020.0, ans=0.09899494936611666 2023-10-03 10:36:56,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:36:57,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:36:59,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 10:36:59,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:36:59,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:37:01,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:37:05,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:37:06,716 INFO [train.py:1046] (1/4) Epoch 35, batch 4950, loss[loss=0.1536, simple_loss=0.221, pruned_loss=0.0431, over 23487.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04009, over 4727008.74 frames. ], batch size: 285, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:37:06,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:37:06,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 10:37:08,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:37:10,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:37:11,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:37:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 10:37:13,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 10:37:13,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:37:13,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 10:37:15,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:15,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:37:15,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:37:15,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:37:17,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.72 vs. limit=22.5 2023-10-03 10:37:18,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:19,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:37:21,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:37:21,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:37:24,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:24,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:37:28,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:37:33,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:34,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:37:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:37:36,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:37,597 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.912e+02 2.157e+02 2.432e+02 3.456e+02, threshold=4.313e+02, percent-clipped=0.0 2023-10-03 10:37:37,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:37:37,883 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:37:39,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 10:37:40,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 10:37:43,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:44,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:37:44,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:37:45,073 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.24 vs. limit=6.0 2023-10-03 10:37:45,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:37:45,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:37:47,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:37:49,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:37:53,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:37:55,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:56,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:56,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 10:37:56,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:37:59,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:38:02,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:38:04,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:38:04,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 10:38:05,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:38:05,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:38:05,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:38:08,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:38:08,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:38:08,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:38:11,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 10:38:12,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=15.0 2023-10-03 10:38:15,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:19,251 INFO [train.py:1046] (1/4) Epoch 35, batch 5000, loss[loss=0.1495, simple_loss=0.2261, pruned_loss=0.03646, over 24304.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2384, pruned_loss=0.03953, over 4731448.43 frames. ], batch size: 56, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:38:19,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. 
Duration: 0.78 2023-10-03 10:38:19,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:38:21,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1237420.0, ans=0.07 2023-10-03 10:38:24,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:38:25,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:38:27,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 10:38:28,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 10:38:29,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1237420.0, ans=0.1 2023-10-03 10:38:29,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1237420.0, ans=0.125 2023-10-03 10:38:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:38:33,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 10:38:33,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:38:33,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 10:38:33,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1237486.6666666667, ans=0.125 2023-10-03 10:38:34,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 10:38:36,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:38:36,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:38:36,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 10:38:36,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:37,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:38:37,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 10:38:38,574 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=15.0 2023-10-03 10:38:39,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 10:38:39,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:38:39,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-03 10:38:39,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:38:40,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:40,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:38:40,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 10:38:40,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 10:38:41,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1237486.6666666667, ans=0.2 2023-10-03 10:38:43,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 10:38:43,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:38:44,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:46,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 10:38:46,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:38:48,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:38:49,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:52,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 10:38:53,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 10:38:53,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:38:53,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1237553.3333333333, ans=0.125 2023-10-03 10:38:53,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1237553.3333333333, ans=0.025 2023-10-03 10:38:54,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:38:59,643 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 10:39:01,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:39:02,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:39:02,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:05,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-10-03 10:39:05,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:39:07,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:39:07,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:39:09,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.whiten.whitening_limit, batch_count=1237620.0, ans=12.0 2023-10-03 10:39:09,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.09 vs. limit=15.0 2023-10-03 10:39:10,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 10:39:10,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:39:14,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:39:14,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:39:20,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 10:39:21,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.66 vs. 
limit=15.0 2023-10-03 10:39:23,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:30,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1237686.6666666667, ans=0.0 2023-10-03 10:39:32,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1237753.3333333333, ans=0.0 2023-10-03 10:39:33,442 INFO [train.py:1046] (1/4) Epoch 35, batch 5050, loss[loss=0.1644, simple_loss=0.2399, pruned_loss=0.04443, over 23401.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2386, pruned_loss=0.0396, over 4732196.25 frames. ], batch size: 285, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:39:33,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:39:34,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:34,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:39:34,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:39:36,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:39:36,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:39:36,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:39:36,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1237753.3333333333, ans=0.1 2023-10-03 10:39:41,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:41,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 10:39:42,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:39:43,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:39:44,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.83 vs. limit=10.0 2023-10-03 10:39:45,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:39:45,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 10:39:46,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:39:46,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:39:48,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 10:39:49,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:39:49,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:39:50,817 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.48 vs. limit=15.0 2023-10-03 10:39:58,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 10:39:59,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:40:00,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:40:00,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 10:40:01,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:40:01,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:01,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:40:03,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:40:03,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. 
Duration: 0.94 2023-10-03 10:40:04,410 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.891e+02 1.998e+02 2.212e+02 3.004e+02, threshold=3.997e+02, percent-clipped=0.0 2023-10-03 10:40:04,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 10:40:06,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:06,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1237886.6666666667, ans=0.125 2023-10-03 10:40:08,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:11,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:11,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 10:40:13,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:40:16,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 10:40:16,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:40:17,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:40:18,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 10:40:19,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:40:22,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:40:24,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:40:24,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:25,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:40:25,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:40:25,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 10:40:25,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:40:28,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:40:31,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:40:31,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 10:40:31,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 10:40:31,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:40:33,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:33,227 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 10:40:37,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:37,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 10:40:37,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:42,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:40:42,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:42,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 10:40:45,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 10:40:46,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:40:46,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:40:46,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:40:46,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1238086.6666666667, ans=0.1 2023-10-03 10:40:48,028 INFO [train.py:1046] (1/4) Epoch 35, batch 5100, loss[loss=0.1654, simple_loss=0.2374, pruned_loss=0.04667, over 23908.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2389, pruned_loss=0.03968, over 4732572.37 frames. ], batch size: 212, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:40:49,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 10:40:51,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1238086.6666666667, ans=0.2 2023-10-03 10:40:52,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:55,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 10:40:55,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 10:40:57,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:41:00,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:41:01,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:41:01,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1238153.3333333333, ans=0.125 2023-10-03 10:41:03,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 10:41:03,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 10:41:09,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:41:10,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:41:13,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:41:17,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 10:41:17,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:41:20,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:41:20,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:41:21,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:22,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:41:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 10:41:24,746 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 10:41:26,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:26,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 10:41:26,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 10:41:29,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:41:34,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1238286.6666666667, ans=0.125 2023-10-03 10:41:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:41:41,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 10:41:41,434 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 10:41:41,447 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 10:41:44,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 10:41:44,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:41:47,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 10:41:51,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 10:41:54,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:41:54,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:41:55,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 10:41:57,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:41:58,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 10:42:01,735 INFO [train.py:1046] (1/4) Epoch 35, batch 5150, loss[loss=0.2107, simple_loss=0.2755, pruned_loss=0.073, over 19095.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.24, pruned_loss=0.03981, over 4735264.94 frames. ], batch size: 388, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:42:01,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:42:01,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:42:01,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 10:42:03,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:42:04,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:42:04,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:42:05,011 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:42:06,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 10:42:06,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 10:42:07,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 10:42:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:42:07,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 10:42:08,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:10,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 10:42:12,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:42:13,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:17,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:42:17,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 10:42:18,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:18,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1238486.6666666667, ans=0.125 2023-10-03 10:42:18,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1238486.6666666667, ans=0.1 2023-10-03 10:42:19,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:42:21,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:42:21,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:42:22,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:42:22,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:42:22,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 10:42:23,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 10:42:23,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:42:25,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:42:26,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:42:28,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 10:42:29,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:42:33,233 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.014e+02 2.285e+02 2.770e+02 4.713e+02, threshold=4.570e+02, percent-clipped=3.0 2023-10-03 10:42:35,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:42:37,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 10:42:40,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:42:45,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:42:45,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:42:50,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:42:52,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:42:54,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 10:42:59,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:59,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:42:59,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:43:02,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:04,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:43:04,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 10:43:05,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.19 vs. limit=22.5 2023-10-03 10:43:06,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1238686.6666666667, ans=0.2 2023-10-03 10:43:10,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:43:10,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:43:13,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:43:13,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:43:14,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:43:14,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:43:14,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:43:16,100 INFO [train.py:1046] (1/4) Epoch 35, batch 5200, loss[loss=0.1649, simple_loss=0.2569, pruned_loss=0.0365, over 24463.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2402, pruned_loss=0.03977, over 4741014.24 frames. ], batch size: 69, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:43:16,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:43:19,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:43:22,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:43:23,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:43:26,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 10:43:27,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:43:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:31,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:43:31,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:32,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 10:43:35,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:43:35,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:39,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 10:43:42,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:43:42,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 10:43:44,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 10:43:45,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 10:43:47,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 10:43:47,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:49,142 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 10:43:49,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:49,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1238886.6666666667, ans=0.2 2023-10-03 10:43:50,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:43:50,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:43:51,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 10:43:51,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:43:54,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:43:57,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 10:43:57,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 10:43:57,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 10:44:00,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1238953.3333333333, ans=0.0 2023-10-03 10:44:03,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 10:44:03,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:44:10,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:44:10,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:12,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 10:44:13,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:44:13,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 10:44:13,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:44:13,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:44:15,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:44:17,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:44:17,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1239020.0, ans=0.125 2023-10-03 10:44:21,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:44:21,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:21,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:24,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:25,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1239020.0, ans=0.125 2023-10-03 10:44:26,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 10:44:27,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 10:44:27,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:44:28,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:29,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:44:30,844 INFO [train.py:1046] (1/4) Epoch 35, batch 5250, loss[loss=0.1423, simple_loss=0.2172, pruned_loss=0.03363, over 24347.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2397, pruned_loss=0.03996, over 4741798.33 frames. ], batch size: 56, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:44:30,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:44:32,871 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.39 vs. limit=12.0 2023-10-03 10:44:34,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:44:36,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:36,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:44:38,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1239086.6666666667, ans=0.125 2023-10-03 10:44:39,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 10:44:43,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:46,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:44:48,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:44:48,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:44:51,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 10:44:51,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:51,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:00,735 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.950e+02 2.167e+02 2.359e+02 3.354e+02, threshold=4.333e+02, percent-clipped=0.0 2023-10-03 10:45:39,410 INFO [train.py:1046] (1/4) Epoch 35, batch 5300, loss[loss=0.163, simple_loss=0.2385, pruned_loss=0.0437, over 23207.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2386, pruned_loss=0.03992, over 4721381.91 frames. 
], batch size: 119, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:45:43,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1239420.0, ans=0.0 2023-10-03 10:45:50,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1239420.0, ans=0.125 2023-10-03 10:45:53,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:45:53,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 10:45:53,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 10:45:53,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:54,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:54,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:54,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:54,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:54,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:45:54,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:54,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:45:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:45:54,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 10:45:54,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 10:45:54,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 10:45:54,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:45:54,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 10:45:55,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 10:45:55,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:55,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:55,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:45:55,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:45:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:45:56,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:45:56,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:56,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:56,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:56,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:56,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:45:56,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:56,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:45:57,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 10:45:57,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:45:57,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 10:45:57,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 10:45:57,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:45:57,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:45:57,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 10:45:58,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 10:45:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:45:58,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:45:58,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:45:58,830 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 10:45:58,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. 
Duration: 0.96 2023-10-03 10:45:58,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:45:59,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:59,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 10:45:59,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 10:45:59,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 10:45:59,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:46:05,606 INFO [train.py:1046] (1/4) Epoch 36, batch 0, loss[loss=0.1504, simple_loss=0.2271, pruned_loss=0.03686, over 23326.00 frames. ], tot_loss[loss=0.1504, simple_loss=0.2271, pruned_loss=0.03686, over 23326.00 frames. ], batch size: 134, lr: 2.85e-03, grad_scale: 32.0 2023-10-03 10:46:05,607 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 10:46:17,647 INFO [train.py:1078] (1/4) Epoch 36, validation: loss=0.3188, simple_loss=0.2685, pruned_loss=0.1846, over 1125622.00 frames. 2023-10-03 10:46:17,648 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 10:46:17,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1239500.0, ans=0.0 2023-10-03 10:46:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-10-03 10:46:20,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:46:23,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:46:26,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:26,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:46:26,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:27,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 10:46:29,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 10:46:30,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:32,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:46:32,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1239566.6666666667, ans=0.125 2023-10-03 10:46:33,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1239566.6666666667, ans=0.125 2023-10-03 10:46:33,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1239566.6666666667, ans=0.0 2023-10-03 10:46:36,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:37,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:37,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:46:37,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:46:40,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 10:46:41,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:46:48,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:46:48,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. 
Duration: 0.98 2023-10-03 10:46:54,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:46:54,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:46:57,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:00,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1239700.0, ans=0.125 2023-10-03 10:47:02,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:47:07,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:11,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 10:47:14,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 10:47:16,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:47:16,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:18,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:47:18,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:47:21,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 10:47:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:24,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:27,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:47:31,226 INFO [train.py:1046] (1/4) Epoch 36, batch 50, loss[loss=0.1684, simple_loss=0.2424, pruned_loss=0.04724, over 23805.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2401, pruned_loss=0.04013, over 1074270.60 frames. ], batch size: 164, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:47:31,319 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 10:47:32,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:47:36,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:47:37,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:47:37,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 10:47:38,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 10:47:38,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:47:40,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:47:40,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:47:41,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:47:41,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1239833.3333333333, ans=0.125 2023-10-03 10:47:45,661 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.883e+02 2.047e+02 2.460e+02 5.185e+02, threshold=4.094e+02, percent-clipped=4.0 2023-10-03 10:47:45,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 10:47:45,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:48,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1239900.0, ans=0.125 2023-10-03 10:47:52,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:47:53,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 10:47:55,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-10-03 10:47:56,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:47:58,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:47:58,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:59,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:47:59,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:47:59,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1239966.6666666667, ans=0.125 2023-10-03 10:48:00,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:48:00,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:48:08,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:48:10,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:10,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 10:48:11,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 10:48:12,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:48:14,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:48:14,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 10:48:14,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:48:16,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 10:48:25,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:48:25,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:48:27,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:28,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:48:28,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 10:48:29,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1240033.3333333333, ans=10.0 2023-10-03 10:48:29,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 10:48:29,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 10:48:31,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:31,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1240100.0, ans=0.125 2023-10-03 10:48:32,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:48:34,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:48:34,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:48:34,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 10:48:35,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 10:48:36,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1240100.0, ans=0.0 2023-10-03 10:48:37,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 10:48:38,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:48:38,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:48:39,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 10:48:39,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 10:48:42,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:48:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:43,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:48:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:48:45,320 INFO [train.py:1046] (1/4) Epoch 36, batch 100, loss[loss=0.1623, simple_loss=0.251, pruned_loss=0.03677, over 24430.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2406, pruned_loss=0.04001, over 1894063.21 frames. ], batch size: 69, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:48:45,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:48:48,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:48:51,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:48:52,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 10:48:52,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:57,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:48:57,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:48:57,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:57,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:48:58,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:49:00,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 10:49:02,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:49:02,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:49:04,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:49:07,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 10:49:07,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:07,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1240233.3333333333, ans=0.1 2023-10-03 10:49:09,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:10,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:49:13,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:49:17,214 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 10:49:17,236 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 10:49:18,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:18,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:49:21,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 10:49:23,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:25,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:25,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1240300.0, ans=0.125 2023-10-03 10:49:30,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:32,629 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 10:49:34,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 10:49:36,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:49:38,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:49:38,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1240366.6666666667, ans=0.2 2023-10-03 10:49:40,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:42,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:45,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 10:49:45,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:49:48,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:48,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:49,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:49,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:49:50,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1240433.3333333333, ans=10.0 2023-10-03 10:49:51,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:51,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 10:49:51,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 10:49:51,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:53,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:49:54,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 10:49:54,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 10:49:54,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:49:54,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:49:54,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:49:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:56,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:57,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:49:57,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:49:59,054 INFO [train.py:1046] (1/4) Epoch 36, batch 150, loss[loss=0.1559, simple_loss=0.2382, pruned_loss=0.03677, over 24465.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2395, pruned_loss=0.03994, over 2520576.81 frames. ], batch size: 63, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:49:59,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:50:03,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.20 vs. limit=8.0 2023-10-03 10:50:03,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:50:03,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:03,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:06,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1240500.0, ans=0.125 2023-10-03 10:50:06,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1240500.0, ans=0.0 2023-10-03 10:50:08,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:50:08,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:11,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:50:12,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:13,949 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.846e+02 1.951e+02 2.154e+02 3.020e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-03 10:50:15,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. 
Duration: 0.65 2023-10-03 10:50:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 10:50:15,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 10:50:18,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:50:18,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:50:19,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:50:19,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:50:19,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:50:19,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:21,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:23,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 10:50:24,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:50:28,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1240633.3333333333, ans=0.125 2023-10-03 10:50:28,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1240633.3333333333, ans=0.2 2023-10-03 10:50:30,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:36,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:50:36,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 10:50:41,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:50:41,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:41,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:50:42,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:50:45,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:50:45,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 10:50:46,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:46,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 10:50:47,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.76 vs. limit=6.0 2023-10-03 10:50:52,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:53,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:50:54,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:50:55,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:50:56,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:58,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 10:51:01,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:51:02,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 10:51:03,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:51:04,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 10:51:04,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:51:04,337 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 10:51:10,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:51:12,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:51:12,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:51:13,584 INFO [train.py:1046] (1/4) Epoch 36, batch 200, loss[loss=0.1687, simple_loss=0.2519, pruned_loss=0.04275, over 23192.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2417, pruned_loss=0.04111, over 2999728.37 frames. ], batch size: 105, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:51:16,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 10:51:16,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:51:16,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 10:51:20,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:51:20,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:22,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:51:26,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:51:26,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:51:26,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:30,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1240900.0, ans=0.0 2023-10-03 10:51:47,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:51:47,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:51:48,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 10:51:49,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:51:49,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:51:49,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:51:51,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:51:52,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:51:52,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:52,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:51:52,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 10:51:53,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1240966.6666666667, ans=0.125 2023-10-03 10:51:53,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:51:53,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 10:51:58,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:52:02,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:52:08,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1241033.3333333333, ans=0.125 2023-10-03 10:52:11,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:11,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:52:16,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:19,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 10:52:19,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:52:19,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:52:21,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:52:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 10:52:23,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 10:52:25,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:52:25,423 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 10:52:26,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.95 vs. limit=22.5 2023-10-03 10:52:26,681 INFO [train.py:1046] (1/4) Epoch 36, batch 250, loss[loss=0.1392, simple_loss=0.2273, pruned_loss=0.02555, over 24483.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2401, pruned_loss=0.04048, over 3392147.89 frames. ], batch size: 66, lr: 2.85e-03, grad_scale: 4.0 2023-10-03 10:52:28,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:31,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:52:32,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:32,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:52:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:52:34,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 10:52:36,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:52:39,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:52:45,133 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.828e+02 1.971e+02 2.158e+02 2.749e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 10:52:48,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:52:49,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1241233.3333333333, ans=0.09899494936611666 2023-10-03 10:52:49,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1241233.3333333333, ans=0.0 2023-10-03 10:52:50,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:52:50,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:52:56,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:52:57,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:52:58,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 10:52:59,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:52:59,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:52:59,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:53:01,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:53:03,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:53:06,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 10:53:06,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:53:08,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:53:08,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:53:08,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:53:10,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:53:12,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 10:53:12,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:53:15,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:15,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:53:16,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:20,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:53:23,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:26,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:53:31,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:32,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:53:35,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 10:53:35,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:53:35,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 10:53:38,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 10:53:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:53:41,196 INFO [train.py:1046] (1/4) Epoch 36, batch 300, loss[loss=0.1495, simple_loss=0.203, pruned_loss=0.04797, over 18751.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2378, pruned_loss=0.0401, over 3676878.05 frames. ], batch size: 388, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:53:41,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:53:41,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 10:53:42,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1241500.0, ans=0.0 2023-10-03 10:53:44,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1241500.0, ans=0.1 2023-10-03 10:53:45,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:47,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:53:50,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:53:50,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. 
Duration: 0.99 2023-10-03 10:53:51,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:52,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:53:53,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 10:53:53,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:53:58,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:54:01,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1241566.6666666667, ans=0.2 2023-10-03 10:54:04,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:54:04,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 10:54:06,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 10:54:06,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:09,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:54:09,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:54:09,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 10:54:09,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:54:12,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:54:13,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:54:15,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:54:20,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:54:20,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 10:54:21,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:54:24,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:24,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 10:54:25,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:54:26,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.21 vs. 
limit=22.5 2023-10-03 10:54:28,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1241700.0, ans=0.1 2023-10-03 10:54:29,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:54:32,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:54:32,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 10:54:37,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:37,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:54:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:41,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:54:41,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 10:54:41,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:54:42,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:54:44,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-10-03 10:54:47,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:47,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:48,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:54:48,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:54:48,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:53,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:54:53,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 10:54:55,182 INFO [train.py:1046] (1/4) Epoch 36, batch 350, loss[loss=0.1374, simple_loss=0.2241, pruned_loss=0.02536, over 24680.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2369, pruned_loss=0.03932, over 3916378.61 frames. ], batch size: 68, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:54:56,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:00,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:55:06,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:55:06,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:08,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 10:55:10,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:55:10,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 10:55:11,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:13,004 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.890e+02 2.063e+02 2.336e+02 3.358e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-03 10:55:13,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 10:55:13,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:55:17,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 10:55:17,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:55:19,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.43 vs. limit=22.5 2023-10-03 10:55:20,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 10:55:22,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:55:22,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:22,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:23,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:55:23,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:55:25,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1241966.6666666667, ans=0.125 2023-10-03 10:55:26,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:55:26,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:32,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:55:32,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 10:55:34,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:55:35,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:40,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 10:55:40,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:45,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:45,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:55:46,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:55:49,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 10:55:50,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:55:51,699 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 10:55:51,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 10:55:51,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 10:55:53,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1242100.0, ans=0.125 2023-10-03 10:55:56,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:55:56,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 10:55:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:55:59,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:56:00,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:02,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:02,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:56:05,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:56:06,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:56:08,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 10:56:08,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1242166.6666666667, ans=0.125 2023-10-03 10:56:10,303 INFO [train.py:1046] (1/4) Epoch 36, batch 400, loss[loss=0.1722, simple_loss=0.2467, pruned_loss=0.04887, over 23764.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03942, over 4090224.82 frames. ], batch size: 179, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:56:10,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 10:56:10,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:12,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:56:13,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:14,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:16,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:17,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 10:56:20,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. 
Duration: 0.71 2023-10-03 10:56:20,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:22,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 10:56:22,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:26,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:56:26,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:26,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 10:56:28,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:56:28,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:28,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:28,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:30,839 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 10:56:30,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. 
Duration: 0.94 2023-10-03 10:56:35,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:37,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:38,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 10:56:40,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 10:56:41,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:56:43,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:56:50,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1242300.0, ans=0.1 2023-10-03 10:56:51,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 10:56:54,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:56:57,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 10:56:58,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1242366.6666666667, ans=0.0 2023-10-03 10:56:59,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 10:56:59,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:57:00,097 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.97 vs. limit=22.5 2023-10-03 10:57:00,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 10:57:04,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:57:05,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:57:06,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:57:09,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:11,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 10:57:12,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:57:14,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 10:57:17,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:57:17,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 10:57:19,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 10:57:21,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:57:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:57:21,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:57:22,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 10:57:23,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:57:24,722 INFO [train.py:1046] (1/4) Epoch 36, batch 450, loss[loss=0.1477, simple_loss=0.2332, pruned_loss=0.03113, over 24500.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2378, pruned_loss=0.04003, over 4230619.50 frames. ], batch size: 69, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:57:24,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:57:24,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:57:24,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 10:57:24,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 10:57:27,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:57:29,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:57:39,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:39,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:57:42,359 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.918e+02 2.094e+02 2.356e+02 3.401e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-03 10:57:42,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 10:57:42,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 10:57:47,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:57:47,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1242566.6666666667, ans=0.95 2023-10-03 10:57:48,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:51,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:57:56,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:57:57,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:57:57,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1242633.3333333333, ans=10.0 2023-10-03 10:57:58,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 10:58:00,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 10:58:00,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 10:58:00,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:01,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:02,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1242633.3333333333, ans=0.125 2023-10-03 10:58:03,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:58:05,031 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 10:58:05,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-10-03 10:58:07,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:58:08,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:58:08,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:58:08,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1242700.0, ans=0.5 2023-10-03 10:58:11,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:58:11,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:58:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 10:58:12,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 10:58:14,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:58:14,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1242700.0, ans=0.125 2023-10-03 10:58:17,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:58:17,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 10:58:19,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 10:58:23,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:58:23,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 10:58:25,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 10:58:26,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:58:29,453 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:58:31,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:58:31,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:58:35,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:58:35,250 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 10:58:39,711 INFO [train.py:1046] (1/4) Epoch 36, batch 500, loss[loss=0.1533, simple_loss=0.2294, pruned_loss=0.03861, over 23375.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2387, pruned_loss=0.04013, over 4350929.30 frames. 
], batch size: 285, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:58:39,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:58:41,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:42,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 10:58:43,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 10:58:43,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:45,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:58:49,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:58:51,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:58:51,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1242833.3333333333, ans=0.1 2023-10-03 10:58:52,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:58:53,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 10:58:54,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:58:58,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1242900.0, ans=0.125 2023-10-03 10:59:06,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:06,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 10:59:06,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:59:06,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:07,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 10:59:07,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:59:10,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:59:12,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:59:12,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:59:12,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 10:59:12,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 10:59:15,114 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 10:59:17,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:19,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:19,437 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:59:21,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:59:25,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 10:59:25,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1243033.3333333333, ans=0.0 2023-10-03 10:59:27,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:59:29,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 10:59:32,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:35,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:43,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:46,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 10:59:46,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:46,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:49,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 10:59:49,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:59:50,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:53,732 INFO [train.py:1046] (1/4) Epoch 36, batch 550, loss[loss=0.1498, simple_loss=0.2262, pruned_loss=0.03669, over 23275.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2398, pruned_loss=0.04068, over 4422494.11 frames. ], batch size: 105, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:59:56,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. 
Duration: 0.95 2023-10-03 10:59:58,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 10:59:58,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:58,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 10:59:59,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:59:59,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:00:00,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:00,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:00,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:00:02,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:00:05,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:00:06,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 11:00:06,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 11:00:06,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1243233.3333333333, ans=0.0 2023-10-03 11:00:11,069 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.849e+02 2.026e+02 2.312e+02 3.706e+02, threshold=4.052e+02, percent-clipped=0.0 2023-10-03 11:00:11,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:11,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:14,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:00:14,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:16,381 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.50 vs. limit=15.0 2023-10-03 11:00:18,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 11:00:19,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 11:00:19,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:00:24,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:00:24,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 11:00:26,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:00:28,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:28,757 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 11:00:30,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:31,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:00:32,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1243300.0, ans=0.125 2023-10-03 11:00:34,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:00:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:00:34,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:00:34,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:36,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 11:00:38,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-10-03 11:00:39,757 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:00:40,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:00:40,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:00:42,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:00:42,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:00:45,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:00:45,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:00:47,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1243366.6666666667, ans=0.0 2023-10-03 11:00:48,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:00:48,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:49,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 11:00:51,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 11:00:54,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:00:55,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:00:56,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:57,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1243433.3333333333, ans=0.125 2023-10-03 11:00:58,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:00:58,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 11:01:02,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1243433.3333333333, ans=0.125 2023-10-03 11:01:05,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 11:01:06,993 INFO [train.py:1046] (1/4) Epoch 36, batch 600, loss[loss=0.1598, simple_loss=0.2397, pruned_loss=0.03996, over 23355.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2407, pruned_loss=0.04108, over 4487507.89 frames. ], batch size: 105, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 11:01:08,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 11:01:09,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:01:09,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:01:09,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:10,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1243500.0, ans=0.0 2023-10-03 11:01:17,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:01:18,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.94 vs. limit=6.0 2023-10-03 11:01:19,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:01:20,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 11:01:22,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:01:25,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:01:26,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:28,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 11:01:28,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:01:29,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1243566.6666666667, ans=0.125 2023-10-03 11:01:32,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 11:01:35,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:01:35,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:35,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:01:43,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:01:43,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:01:43,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:49,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:01:50,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1243700.0, ans=0.1 2023-10-03 11:01:54,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:01:54,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:01:54,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:02:01,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 11:02:06,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:02:07,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:02:10,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=1243766.6666666667, ans=12.0 2023-10-03 11:02:11,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 11:02:12,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:02:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 11:02:14,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:02:15,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.26 vs. limit=22.5 2023-10-03 11:02:15,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 11:02:21,601 INFO [train.py:1046] (1/4) Epoch 36, batch 650, loss[loss=0.1517, simple_loss=0.2406, pruned_loss=0.03143, over 24669.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2395, pruned_loss=0.04066, over 4548030.32 frames. ], batch size: 73, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:02:23,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 11:02:24,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:02:26,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:02:26,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:02:26,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1243833.3333333333, ans=0.0 2023-10-03 11:02:29,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:31,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 11:02:31,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:02:38,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:02:38,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:02:39,289 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.874e+02 2.139e+02 2.425e+02 3.850e+02, threshold=4.279e+02, percent-clipped=0.0 2023-10-03 11:02:42,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:45,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 11:02:46,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:02:47,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:02:50,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:02:51,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:02:51,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1243966.6666666667, ans=0.035 2023-10-03 11:02:52,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:54,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:54,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:02:56,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:02:57,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:02:59,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:02:59,324 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 11:02:59,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:59,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:03:02,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1243966.6666666667, ans=0.1 2023-10-03 11:03:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:03,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:03:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:05,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:03:06,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 11:03:06,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 11:03:07,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:03:09,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:03:09,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:03:10,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:03:12,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 11:03:12,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 11:03:14,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:14,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:03:14,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:03:14,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:03:15,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:03:15,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1244033.3333333333, ans=0.025 2023-10-03 11:03:21,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:22,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:03:24,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:03:26,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:26,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:03:26,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:31,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.56 vs. limit=15.0 2023-10-03 11:03:33,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:03:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:03:33,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 11:03:33,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:03:35,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1244166.6666666667, ans=0.125 2023-10-03 11:03:36,721 INFO [train.py:1046] (1/4) Epoch 36, batch 700, loss[loss=0.1564, simple_loss=0.2368, pruned_loss=0.03799, over 24054.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.237, pruned_loss=0.04013, over 4563714.29 frames. ], batch size: 86, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:03:38,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 11:03:40,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 11:03:41,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 11:03:42,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:43,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1244166.6666666667, ans=0.0 2023-10-03 11:03:45,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:03:47,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 11:03:51,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 11:03:54,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:03:54,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1244233.3333333333, ans=0.2 2023-10-03 11:03:55,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:59,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:03:59,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:04:01,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:04:04,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 11:04:04,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:04:07,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 11:04:10,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 11:04:12,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:04:14,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 11:04:15,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:04:19,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:04:21,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 11:04:24,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:04:25,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:04:25,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 11:04:28,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:04:30,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:04:31,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:04:36,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:04:36,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 11:04:40,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. 
Duration: 0.89 2023-10-03 11:04:42,335 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.71 vs. limit=12.0 2023-10-03 11:04:42,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 11:04:45,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:46,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:04:46,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:04:50,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:50,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 11:04:51,444 INFO [train.py:1046] (1/4) Epoch 36, batch 750, loss[loss=0.1502, simple_loss=0.2432, pruned_loss=0.02863, over 24464.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2366, pruned_loss=0.03996, over 4595338.45 frames. ], batch size: 69, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:04:52,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 11:04:53,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 11:04:53,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. 
Duration: 0.85 2023-10-03 11:04:54,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 11:04:55,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 11:04:55,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:04:57,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 11:04:57,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:59,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:05:00,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:03,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:03,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:05:03,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:05:03,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1244500.0, ans=0.125 2023-10-03 11:05:04,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 11:05:05,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.56 vs. limit=6.0 2023-10-03 11:05:06,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:05:09,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:05:10,661 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.795e+02 1.933e+02 2.137e+02 2.929e+02, threshold=3.865e+02, percent-clipped=0.0 2023-10-03 11:05:10,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:12,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:12,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 11:05:13,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:05:15,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:05:17,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:05:17,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1244566.6666666667, ans=0.0 2023-10-03 11:05:18,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:05:19,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 11:05:19,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:05:19,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 11:05:21,298 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 11:05:21,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 11:05:21,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:05:23,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:05:25,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:05:32,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:05:32,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:05:32,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:05:34,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:36,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:05:37,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 11:05:37,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:05:38,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 11:05:38,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:05:40,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:05:42,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 11:05:42,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:05:47,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:05:48,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 11:05:49,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:51,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:05:54,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 11:05:55,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:05:56,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:05:59,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1244766.6666666667, ans=0.09899494936611666 2023-10-03 11:05:59,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1244766.6666666667, ans=0.125 2023-10-03 11:06:00,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:00,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:04,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:04,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 11:06:05,748 INFO [train.py:1046] (1/4) Epoch 36, batch 800, loss[loss=0.1601, simple_loss=0.2451, pruned_loss=0.03756, over 24473.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2371, pruned_loss=0.03962, over 4626391.50 frames. ], batch size: 66, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:06:13,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:13,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:15,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:06:16,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:16,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:17,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:17,629 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:06:18,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:22,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:23,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 11:06:25,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 11:06:26,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:28,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:28,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:06:28,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:06:28,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 11:06:28,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:30,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 11:06:32,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:34,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:36,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:36,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:06:38,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:38,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:38,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1244966.6666666667, ans=0.0 2023-10-03 11:06:42,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:06:44,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:06:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 11:06:45,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1244966.6666666667, ans=0.125 2023-10-03 11:06:46,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.23 vs. limit=15.0 2023-10-03 11:06:46,891 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 11:06:46,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 11:06:46,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:06:46,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:06:48,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:48,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:06:48,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1245033.3333333333, ans=0.125 2023-10-03 11:06:50,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.34 vs. limit=15.0 2023-10-03 11:06:54,433 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 11:06:54,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 11:06:55,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:06:57,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:07:00,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:07:03,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:07:04,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 11:07:04,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 11:07:07,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 11:07:09,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1245100.0, ans=0.125 2023-10-03 11:07:12,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:07:16,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:07:16,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 11:07:16,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:07:17,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:07:19,164 INFO [train.py:1046] (1/4) Epoch 36, batch 850, loss[loss=0.1574, simple_loss=0.2469, pruned_loss=0.03396, over 24438.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2378, pruned_loss=0.0402, over 4634902.40 frames. ], batch size: 63, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:07:19,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 11:07:19,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:07:19,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1245166.6666666667, ans=0.04949747468305833 2023-10-03 11:07:20,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:07:20,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:23,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:07:23,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:07:25,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 11:07:25,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 11:07:25,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 11:07:28,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:07:28,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:07:29,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:29,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:07:29,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:07:33,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:35,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:07:35,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 11:07:38,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.85 vs. limit=6.0 2023-10-03 11:07:39,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.931e+02 2.185e+02 2.563e+02 4.001e+02, threshold=4.370e+02, percent-clipped=1.0 2023-10-03 11:07:40,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 11:07:43,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:45,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 11:07:49,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 11:07:51,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 11:07:53,976 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-10-03 11:07:53,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:07:53,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:07:54,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:07:56,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:58,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:58,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 11:07:59,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:08:01,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:01,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:08:03,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:08:05,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:08:08,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 11:08:09,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 11:08:10,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:08:12,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:08:13,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:08:13,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:08:13,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:15,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:08:18,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:08:18,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:08:19,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:21,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 11:08:23,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1245433.3333333333, ans=0.2 2023-10-03 11:08:27,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:08:28,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:08:29,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 11:08:29,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:08:31,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:08:32,858 INFO [train.py:1046] (1/4) Epoch 36, batch 900, loss[loss=0.153, simple_loss=0.2269, pruned_loss=0.03955, over 23625.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2398, pruned_loss=0.04109, over 4643871.93 frames. ], batch size: 134, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:08:32,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 11:08:39,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:08:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:41,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-10-03 11:08:45,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:08:45,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 11:08:46,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:08:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:08:49,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:08:49,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:08:49,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:08:59,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:59,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:59,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 11:09:01,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1245633.3333333333, ans=0.125 2023-10-03 11:09:02,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:09:02,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1245633.3333333333, ans=0.2 2023-10-03 11:09:02,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-10-03 11:09:08,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 11:09:10,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:09:14,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:09:15,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:09:16,013 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 11:09:17,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 11:09:21,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1245700.0, ans=0.0 2023-10-03 11:09:23,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 11:09:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:09:24,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:09:31,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:31,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:09:33,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 11:09:33,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:09:35,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 11:09:36,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1245766.6666666667, ans=0.125 2023-10-03 11:09:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:09:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:40,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:09:40,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:09:44,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 11:09:44,291 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 11:09:45,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 11:09:45,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 11:09:46,951 INFO [train.py:1046] (1/4) Epoch 36, batch 950, loss[loss=0.1528, simple_loss=0.2349, pruned_loss=0.03534, over 22212.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2398, pruned_loss=0.04081, over 4665789.71 frames. ], batch size: 49, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:09:48,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:51,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 11:09:54,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:09:57,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:09:57,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:09:57,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 11:10:01,450 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 11:10:04,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:05,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:10:07,344 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.892e+02 2.032e+02 2.261e+02 4.305e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-03 11:10:07,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:10:07,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:10:07,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 11:10:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:10:10,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:10,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 11:10:11,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:10:14,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:10:15,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:10:15,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:10:16,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 11:10:18,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 11:10:22,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:10:24,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:10:27,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:10:28,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:10:31,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 11:10:32,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 11:10:32,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:10:33,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:10:33,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:33,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:10:37,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 11:10:38,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:10:41,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:10:41,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:41,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1246033.3333333333, ans=0.125 2023-10-03 11:10:42,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 11:10:42,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:42,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:10:42,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-10-03 11:10:44,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1246100.0, ans=0.125 2023-10-03 11:10:46,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:10:49,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:50,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1246100.0, ans=0.0 2023-10-03 11:10:51,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1246100.0, ans=0.1 2023-10-03 11:10:55,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:10:56,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 11:10:56,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 11:10:59,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:11:00,876 INFO [train.py:1046] (1/4) Epoch 36, batch 1000, loss[loss=0.1723, simple_loss=0.259, pruned_loss=0.04285, over 23280.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2391, pruned_loss=0.04072, over 4670795.07 frames. ], batch size: 93, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:11:03,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-03 11:11:05,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:09,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:11:09,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 11:11:09,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1246166.6666666667, ans=0.0 2023-10-03 11:11:09,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1246166.6666666667, ans=0.1 2023-10-03 11:11:10,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 11:11:14,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:15,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:11:15,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:19,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 11:11:21,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 11:11:22,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-10-03 11:11:23,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:11:25,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 11:11:26,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 11:11:26,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 11:11:27,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:27,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:27,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1246233.3333333333, ans=0.0 2023-10-03 11:11:31,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.30 vs. limit=15.0 2023-10-03 11:11:34,117 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:11:36,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:36,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:11:37,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:11:37,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:37,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 11:11:39,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:11:39,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:11:40,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:40,750 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 11:11:43,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 11:11:45,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 11:11:47,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 11:11:49,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:11:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:56,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 11:11:56,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:57,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1246366.6666666667, ans=0.125 2023-10-03 11:11:58,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:12:00,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 11:12:00,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1246433.3333333333, ans=0.0 2023-10-03 11:12:02,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:12:02,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 11:12:03,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 11:12:05,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:12:05,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:12:06,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:12:09,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 11:12:10,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:12:13,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:12:15,122 INFO [train.py:1046] (1/4) Epoch 36, batch 1050, loss[loss=0.1425, simple_loss=0.2224, pruned_loss=0.03136, over 24423.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2381, pruned_loss=0.04019, over 4681211.36 frames. ], batch size: 58, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:12:15,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:12:17,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 11:12:18,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:12:19,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:12:24,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:12:25,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:12:26,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1246500.0, ans=0.125 2023-10-03 11:12:28,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:12:28,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:12:28,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:12:30,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:12:31,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 11:12:32,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:12:32,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 11:12:36,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:12:36,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 11:12:36,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:12:37,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.909e+02 2.045e+02 2.275e+02 3.142e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 11:12:42,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:12:42,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 11:12:42,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:12:45,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 11:12:45,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 11:12:47,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:12:49,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 11:12:51,207 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.88 vs. limit=12.0 2023-10-03 11:12:52,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 11:12:54,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:12:54,985 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=12.0 2023-10-03 11:12:58,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:12:59,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:12:59,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:13:01,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:13:04,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:13:07,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 11:13:09,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 11:13:10,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 11:13:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:13:11,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:13:13,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 11:13:16,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:13:17,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:13:17,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:13:19,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 11:13:19,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:13:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:13:22,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 11:13:25,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:13:25,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 11:13:25,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 11:13:25,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:13:29,567 INFO [train.py:1046] (1/4) Epoch 36, batch 1100, loss[loss=0.1669, simple_loss=0.2415, pruned_loss=0.04617, over 23782.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2375, pruned_loss=0.03989, over 4701052.00 frames. ], batch size: 179, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:13:29,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:13:32,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1246833.3333333333, ans=0.0 2023-10-03 11:13:33,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:13:38,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:13:41,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:13:41,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:13:41,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 11:13:42,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:13:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:13:47,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:13:48,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:13:48,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 11:13:50,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:13:52,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:13:52,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 11:13:54,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1246900.0, ans=0.125 2023-10-03 11:13:55,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:13:56,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:14:02,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:14:03,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 11:14:05,426 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 11:14:06,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:09,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:09,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1246966.6666666667, ans=0.025 2023-10-03 11:14:11,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:14:11,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:14:14,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. 
Duration: 0.74 2023-10-03 11:14:15,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:14:15,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:14:15,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:14:15,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:15,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 11:14:23,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:14:23,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 11:14:24,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:14:30,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:14:31,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1247100.0, ans=0.09899494936611666 2023-10-03 11:14:32,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 11:14:32,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 11:14:33,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:36,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:14:36,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:14:39,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 11:14:39,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:14:39,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:14:40,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 11:14:40,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:14:42,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 11:14:43,470 INFO [train.py:1046] (1/4) Epoch 36, batch 1150, loss[loss=0.1686, simple_loss=0.2382, pruned_loss=0.04955, over 23757.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2381, pruned_loss=0.04013, over 4698257.37 frames. ], batch size: 179, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:14:43,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:14:43,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:14:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:14:50,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:14:51,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:14:52,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:14:54,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:14:54,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 11:14:54,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:14:57,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1247166.6666666667, ans=0.07 2023-10-03 11:14:57,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 11:14:59,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:14:59,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 11:14:59,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1247233.3333333333, ans=0.0 2023-10-03 11:15:04,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 11:15:06,092 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.838e+02 2.022e+02 2.263e+02 3.460e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 11:15:07,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:15:07,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1247233.3333333333, ans=0.125 2023-10-03 11:15:10,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:15:10,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:12,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 11:15:12,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:15:12,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:15:15,851 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.21 vs. limit=12.0 2023-10-03 11:15:17,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-10-03 11:15:19,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:15:19,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1247300.0, ans=0.05 2023-10-03 11:15:20,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:15:28,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:34,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:34,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 11:15:35,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:36,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:40,440 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 11:15:43,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:50,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 11:15:53,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:15:55,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:15:56,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:15:56,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:15:57,889 INFO [train.py:1046] (1/4) Epoch 36, batch 1200, loss[loss=0.2019, simple_loss=0.2733, pruned_loss=0.0652, over 19266.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.239, pruned_loss=0.04023, over 4706953.78 frames. ], batch size: 388, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:15:59,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:04,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:16:04,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:16:06,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:06,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:06,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:16:08,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:16:11,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:16:11,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:11,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:16:13,862 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 11:16:16,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 11:16:19,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:16:24,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:16:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:28,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:16:28,723 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 11:16:30,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:36,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 11:16:36,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:16:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 11:16:38,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:16:41,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 11:16:43,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 11:16:43,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:45,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:16:46,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:16:47,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:16:48,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:48,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:16:50,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 11:16:50,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 11:16:50,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:16:51,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:16:51,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:16:53,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:53,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:16:58,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:17:00,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:17:04,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 11:17:08,148 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 11:17:09,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:17:10,805 INFO [train.py:1046] (1/4) Epoch 36, batch 1250, loss[loss=0.1639, simple_loss=0.2547, pruned_loss=0.03657, over 24350.00 frames. 
], tot_loss[loss=0.1602, simple_loss=0.2394, pruned_loss=0.04047, over 4714019.26 frames. ], batch size: 74, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:17:12,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:17:13,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:17:14,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:17:16,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 11:17:20,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:17:21,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:23,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 11:17:23,440 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:17:25,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:17:26,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:17:30,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 11:17:31,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:31,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:17:32,977 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.893e+02 2.096e+02 2.315e+02 3.131e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-03 11:17:33,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:17:36,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:17:39,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 11:17:39,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:17:39,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:17:41,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:17:41,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:44,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:17:44,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:17:49,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 11:17:49,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:17:49,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1247966.6666666667, ans=0.125 2023-10-03 11:17:50,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:17:52,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 11:17:53,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:53,985 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 11:17:54,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:54,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:58,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:18:01,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:18:02,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:18:05,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 11:18:05,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 11:18:05,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 11:18:08,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:09,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 11:18:09,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:18:12,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 11:18:12,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:18:13,011 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.25 vs. limit=10.0 2023-10-03 11:18:13,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 11:18:13,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 11:18:13,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:18:13,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:18:14,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:18:16,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 11:18:19,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:18:20,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:18:21,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1248100.0, ans=0.0 2023-10-03 11:18:22,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:18:24,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:18:25,574 INFO [train.py:1046] (1/4) Epoch 36, batch 1300, loss[loss=0.1414, simple_loss=0.2183, pruned_loss=0.0323, over 23514.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.04078, over 4706048.70 frames. 
], batch size: 134, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:18:25,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1248166.6666666667, ans=0.0 2023-10-03 11:18:27,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:18:27,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 11:18:31,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.01 vs. limit=15.0 2023-10-03 11:18:31,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:33,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:18:34,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:18:34,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:18:37,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:18:37,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 11:18:44,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 11:18:44,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:18:47,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 11:18:50,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:18:52,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:18:54,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:18:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:56,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:18:58,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:18:58,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:18:58,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 11:19:04,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:19:04,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 11:19:06,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 11:19:06,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:19:07,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:19:10,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:19:11,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 11:19:11,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:19:11,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 11:19:12,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:19:15,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:19:15,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:19:21,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 11:19:22,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-10-03 11:19:23,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 11:19:29,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:19:31,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 11:19:33,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:19:40,019 INFO [train.py:1046] (1/4) Epoch 36, batch 1350, loss[loss=0.1753, simple_loss=0.258, pruned_loss=0.04635, over 24362.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.04059, over 4696478.90 frames. ], batch size: 77, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:19:41,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 11:19:44,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:19:45,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:19:48,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:19:48,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:19:48,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1248500.0, ans=0.125 2023-10-03 11:19:50,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:19:52,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:19:55,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:19:57,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 11:19:58,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:20:00,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:20:01,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1248566.6666666667, ans=0.0 2023-10-03 11:20:01,941 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.023e+02 2.236e+02 2.575e+02 4.914e+02, threshold=4.472e+02, percent-clipped=3.0 2023-10-03 11:20:02,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 11:20:03,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:20:03,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:20:03,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 11:20:06,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 11:20:09,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 11:20:11,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:11,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 11:20:11,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1248633.3333333333, ans=0.125 2023-10-03 11:20:21,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:26,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1248700.0, ans=0.125 2023-10-03 11:20:28,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1248700.0, ans=0.0 2023-10-03 11:20:29,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:29,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:20:31,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 11:20:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:34,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 11:20:34,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:20:35,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:20:37,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:20:39,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 11:20:40,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:20:43,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.04 vs. limit=22.5 2023-10-03 11:20:46,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 11:20:48,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 11:20:49,620 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.94 vs. 
limit=22.5 2023-10-03 11:20:52,871 INFO [train.py:1046] (1/4) Epoch 36, batch 1400, loss[loss=0.1512, simple_loss=0.2343, pruned_loss=0.03405, over 24450.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2378, pruned_loss=0.04035, over 4702155.32 frames. ], batch size: 63, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:20:54,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 11:20:54,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:58,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:21:00,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:21:04,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 11:21:05,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 11:21:15,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:21:15,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1248900.0, ans=0.0 2023-10-03 11:21:17,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:21:18,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:21:19,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:21:21,735 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:21:23,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:21:23,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 11:21:33,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:35,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:39,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1249033.3333333333, ans=0.125 2023-10-03 11:21:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 11:21:40,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:21:41,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:21:41,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:21:41,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:21:41,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1249033.3333333333, ans=0.09899494936611666 2023-10-03 11:21:44,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:21:44,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:21:44,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:21:44,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1249033.3333333333, ans=0.125 2023-10-03 11:21:45,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 11:21:47,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:21:50,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:52,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:21:58,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 11:22:00,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-03 11:22:01,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:22:02,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.95 vs. limit=22.5 2023-10-03 11:22:02,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.73 vs. limit=10.0 2023-10-03 11:22:05,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 11:22:05,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:06,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:22:08,014 INFO [train.py:1046] (1/4) Epoch 36, batch 1450, loss[loss=0.1606, simple_loss=0.2395, pruned_loss=0.04089, over 23730.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2377, pruned_loss=0.04013, over 4708046.60 frames. ], batch size: 149, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:22:08,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:22:10,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:22:10,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:22:10,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 11:22:15,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:15,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:22:16,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:22:16,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 11:22:17,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:22:19,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 11:22:20,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:23,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:23,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 11:22:23,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:22:23,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1249233.3333333333, ans=0.0 2023-10-03 11:22:25,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:22:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 11:22:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:26,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:22:26,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1249233.3333333333, ans=0.0 2023-10-03 11:22:28,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:29,487 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.817e+02 1.972e+02 2.237e+02 2.946e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-03 11:22:30,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:34,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:22:34,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:22:37,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:37,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:39,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:40,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:22:40,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:40,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:22:44,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 11:22:47,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:22:51,560 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 11:22:51,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:22:53,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:22:55,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:22:55,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 11:23:00,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:00,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 11:23:01,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 11:23:05,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:08,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:23:08,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:23:09,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1249433.3333333333, ans=0.125 2023-10-03 11:23:10,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 11:23:11,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 11:23:13,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 11:23:14,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:23:15,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:23:21,485 INFO [train.py:1046] (1/4) Epoch 36, batch 1500, loss[loss=0.1511, simple_loss=0.2311, pruned_loss=0.03561, over 23709.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2382, pruned_loss=0.03997, over 4722046.76 frames. ], batch size: 149, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:23:25,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 11:23:25,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:23:25,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:23:27,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:27,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:23:27,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1249500.0, ans=0.0 2023-10-03 11:23:29,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:23:30,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 11:23:31,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 11:23:31,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:23:33,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:23:33,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:23:34,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:23:34,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:23:40,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:23:40,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 11:23:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:23:41,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:23:43,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:45,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 11:23:51,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. 
Duration: 0.76 2023-10-03 11:23:52,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:54,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 11:23:55,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1249633.3333333333, ans=0.1 2023-10-03 11:23:56,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:23:58,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=1249633.3333333333, ans=12.0 2023-10-03 11:23:59,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:23:59,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1249633.3333333333, ans=0.1 2023-10-03 11:24:00,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:24:00,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:02,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 11:24:02,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:24:02,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:24:02,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 11:24:02,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1249633.3333333333, ans=0.1 2023-10-03 11:24:04,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:24:09,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:24:09,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 11:24:11,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1249700.0, ans=0.0 2023-10-03 11:24:15,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:24:16,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:24:19,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 11:24:19,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:19,433 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 11:24:19,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:24:20,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:24:22,243 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 11:24:22,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1249766.6666666667, ans=0.0 2023-10-03 11:24:23,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:24:26,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 11:24:27,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:30,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:24:30,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:31,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:24:31,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:33,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 11:24:34,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1249833.3333333333, ans=0.125 2023-10-03 11:24:35,217 INFO [train.py:1046] (1/4) Epoch 36, batch 1550, loss[loss=0.1645, simple_loss=0.2474, pruned_loss=0.04076, over 23285.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2387, pruned_loss=0.04039, over 4711840.28 frames. ], batch size: 93, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:24:35,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 11:24:35,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 11:24:36,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:24:36,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 11:24:36,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1249833.3333333333, ans=0.1 2023-10-03 11:24:38,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 11:24:40,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:41,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:24:41,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1249833.3333333333, ans=0.0 2023-10-03 11:24:42,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:24:42,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:24:44,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:44,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:47,213 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 11:24:47,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:47,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:24:48,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:24:50,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:24:50,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 11:24:51,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:24:51,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 11:24:52,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 11:24:52,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 11:24:54,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:55,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:24:56,773 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.814e+02 1.950e+02 2.198e+02 3.185e+02, threshold=3.899e+02, percent-clipped=0.0 2023-10-03 11:24:59,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:25:03,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 11:25:03,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 11:25:11,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:25:14,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:25:15,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 11:25:15,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:25:17,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 11:25:21,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:25:23,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:25,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:25:27,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:25:28,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:25:28,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 11:25:28,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:25:31,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:25:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:32,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. 
Duration: 0.57775 2023-10-03 11:25:32,691 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 11:25:35,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:25:38,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 11:25:45,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:25:46,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:46,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 11:25:49,037 INFO [train.py:1046] (1/4) Epoch 36, batch 1600, loss[loss=0.1499, simple_loss=0.2311, pruned_loss=0.03435, over 24588.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2391, pruned_loss=0.04043, over 4719279.20 frames. ], batch size: 60, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:25:49,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:25:50,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:25:50,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:25:50,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 11:25:50,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:25:50,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1250166.6666666667, ans=0.1 2023-10-03 11:25:53,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:25:53,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 11:25:53,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1250166.6666666667, ans=0.2 2023-10-03 11:25:54,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 11:25:56,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 11:25:57,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:26:00,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 11:26:00,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:26:00,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1250166.6666666667, ans=0.2 2023-10-03 11:26:03,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:26:08,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:26:09,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1250233.3333333333, ans=0.125 2023-10-03 11:26:12,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 11:26:15,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:26:15,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 11:26:15,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:17,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 11:26:17,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1250300.0, ans=0.125 2023-10-03 11:26:22,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 11:26:29,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:26:29,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. 
Duration: 0.56 2023-10-03 11:26:29,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1250300.0, ans=0.125 2023-10-03 11:26:30,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.88 vs. limit=15.0 2023-10-03 11:26:31,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:26:31,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:26:31,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:26:33,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 11:26:38,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 11:26:40,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:26:40,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1250366.6666666667, ans=0.07 2023-10-03 11:26:41,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:42,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:26:43,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:26:44,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:26:46,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:26:48,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:26:53,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.58 vs. limit=22.5 2023-10-03 11:26:54,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:54,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:26:56,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 11:26:56,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:26:58,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 11:27:02,347 INFO [train.py:1046] (1/4) Epoch 36, batch 1650, loss[loss=0.1633, simple_loss=0.2367, pruned_loss=0.04494, over 23752.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2397, pruned_loss=0.0407, over 4719919.36 frames. 
], batch size: 179, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:27:02,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:03,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:27:05,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:27:05,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 11:27:05,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 11:27:05,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 11:27:07,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 11:27:07,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1250500.0, ans=0.125 2023-10-03 11:27:11,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:27:11,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:27:12,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:27:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 11:27:14,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:16,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 11:27:17,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:27:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:27:17,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:27:17,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:27:19,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 11:27:20,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 11:27:24,863 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.852e+02 2.089e+02 2.353e+02 3.371e+02, threshold=4.177e+02, percent-clipped=0.0 2023-10-03 11:27:24,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:27:26,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:27:33,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. 
Duration: 0.93 2023-10-03 11:27:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:36,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 11:27:39,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:27:43,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:27:43,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:27:43,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:27:47,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:27:47,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:49,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:49,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:49,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:27:51,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 11:27:52,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:27:52,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:27:58,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:27:58,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 11:27:58,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:27:59,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 11:27:59,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 11:27:59,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 11:28:01,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:01,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:28:01,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:28:01,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:28:01,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 11:28:05,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:28:07,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:28:08,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:28:09,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 11:28:13,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1250766.6666666667, ans=0.015 2023-10-03 11:28:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:28:16,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:28:16,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 11:28:16,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:28:17,847 INFO [train.py:1046] (1/4) Epoch 36, batch 1700, loss[loss=0.1706, simple_loss=0.2411, pruned_loss=0.05003, over 23705.00 frames. ], tot_loss[loss=0.16, simple_loss=0.239, pruned_loss=0.04046, over 4723666.31 frames. 
], batch size: 232, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:28:17,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:28:17,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:28:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:28:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:28:20,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 11:28:21,032 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:28:23,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:28:31,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:28:33,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1250900.0, ans=0.1 2023-10-03 11:28:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:28:40,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 11:28:40,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:28:42,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:28:42,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:28:44,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=15.0 2023-10-03 11:28:44,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 11:28:45,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1250966.6666666667, ans=0.0 2023-10-03 11:28:47,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:28:47,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:28:51,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:28:54,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 11:28:54,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-10-03 11:28:55,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:55,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1250966.6666666667, ans=0.125 2023-10-03 11:28:57,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 11:28:58,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:29:01,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1251033.3333333333, ans=0.0 2023-10-03 11:29:05,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:05,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:07,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:29:10,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:29:10,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 11:29:10,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:29:11,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:29:11,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 11:29:13,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:29:13,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:13,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:13,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:13,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1251033.3333333333, ans=0.1 2023-10-03 11:29:15,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:15,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:29:15,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:16,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:29:16,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:29:23,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:29:24,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 11:29:25,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.37 vs. limit=15.0 2023-10-03 11:29:26,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:27,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:29:28,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 11:29:31,606 INFO [train.py:1046] (1/4) Epoch 36, batch 1750, loss[loss=0.1556, simple_loss=0.2514, pruned_loss=0.02993, over 24311.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2386, pruned_loss=0.04012, over 4722854.81 frames. ], batch size: 74, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:29:34,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:37,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:37,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:29:39,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-03 11:29:39,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:39,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.90 vs. limit=22.5 2023-10-03 11:29:41,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:29:41,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:46,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 11:29:46,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1251233.3333333333, ans=0.125 2023-10-03 11:29:48,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:51,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 11:29:51,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:52,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 11:29:55,391 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.875e+02 2.027e+02 2.400e+02 3.332e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 11:29:55,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:29:56,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 11:29:58,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:29:58,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 11:30:04,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1251300.0, ans=0.0 2023-10-03 11:30:06,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:30:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:08,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:30:08,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1251300.0, ans=0.125 2023-10-03 11:30:12,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:12,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:30:14,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:30:15,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:17,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1251366.6666666667, ans=0.0 2023-10-03 11:30:18,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:30:20,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:30:21,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 11:30:24,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:30:26,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 11:30:26,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:30:28,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:30:29,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:30:33,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 11:30:34,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 11:30:34,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:36,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:30:41,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:30:41,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1251433.3333333333, ans=0.0 2023-10-03 11:30:43,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:30:45,157 INFO [train.py:1046] (1/4) Epoch 36, batch 1800, loss[loss=0.1632, simple_loss=0.2367, pruned_loss=0.04483, over 23881.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2376, pruned_loss=0.03989, over 4722965.16 frames. ], batch size: 212, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:30:45,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:30:45,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 11:30:46,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:46,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 11:30:46,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:30:46,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:30:46,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:30:48,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:30:52,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:30:52,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:54,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:30:55,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:56,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1251500.0, ans=0.2 2023-10-03 11:31:00,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:31:01,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:31:03,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:31:05,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:06,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:07,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:31:07,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1251566.6666666667, ans=0.1 2023-10-03 11:31:10,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:31:10,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 11:31:12,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:14,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:18,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 11:31:20,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 11:31:20,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 11:31:22,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:31:23,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:23,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:31:25,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:31:31,595 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 11:31:31,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:31:33,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:34,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 11:31:34,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 11:31:35,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:31:37,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:31:37,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 11:31:40,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1251700.0, ans=0.0 2023-10-03 11:31:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 11:31:47,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:31:47,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 11:31:48,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:31:48,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:50,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:31:50,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 11:31:53,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:31:53,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:31:57,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 11:31:57,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:31:59,434 INFO [train.py:1046] (1/4) Epoch 36, batch 1850, loss[loss=0.1553, simple_loss=0.2482, pruned_loss=0.03116, over 24609.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2375, pruned_loss=0.03978, over 4715252.26 frames. ], batch size: 73, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:31:59,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:31:59,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:31:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:01,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:01,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:32:03,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:32:03,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:32:05,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:32:06,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 11:32:11,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1251833.3333333333, ans=0.0 2023-10-03 11:32:12,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:32:12,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 11:32:15,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 11:32:18,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 11:32:21,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:32:22,887 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.882e+02 2.066e+02 2.277e+02 3.527e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 11:32:22,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 11:32:22,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 11:32:23,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1251900.0, ans=0.125 2023-10-03 11:32:24,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.02 vs. 
limit=10.0 2023-10-03 11:32:32,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1251966.6666666667, ans=0.125 2023-10-03 11:32:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:32:34,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 11:32:36,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1251966.6666666667, ans=0.125 2023-10-03 11:32:37,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:32:37,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:32:43,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 11:32:43,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:32:43,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:32:44,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:32:46,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 11:32:49,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:32:49,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.93 vs. limit=22.5 2023-10-03 11:32:50,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-10-03 11:32:53,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:32:53,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:32:54,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:32:54,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:32:55,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:58,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:32:59,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 11:33:01,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:33:05,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1252100.0, ans=0.125 2023-10-03 11:33:06,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:33:06,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:33:06,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 11:33:06,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 11:33:08,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1252100.0, ans=0.125 2023-10-03 11:33:09,491 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 11:33:09,569 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 11:33:09,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1252100.0, ans=0.0 2023-10-03 11:33:12,680 INFO [train.py:1046] (1/4) Epoch 36, batch 1900, loss[loss=0.1675, simple_loss=0.2561, pruned_loss=0.03945, over 24292.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2379, pruned_loss=0.0398, over 4722703.83 frames. ], batch size: 74, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:33:12,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 11:33:12,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:33:12,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:33:12,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:12,838 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 11:33:12,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:33:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:14,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:33:14,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1252166.6666666667, ans=0.07 2023-10-03 11:33:15,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:33:17,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:33:17,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. 
Duration: 0.99 2023-10-03 11:33:19,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1252166.6666666667, ans=0.0 2023-10-03 11:33:20,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:20,903 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 11:33:20,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:33:21,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:33:26,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:33:28,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:33:28,496 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 11:33:30,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 11:33:31,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:33:33,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:33:33,421 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 11:33:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. 
Duration: 0.896 2023-10-03 11:33:36,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 11:33:37,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:33:39,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1252233.3333333333, ans=0.5 2023-10-03 11:33:40,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 11:33:41,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 11:33:53,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 11:33:57,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 11:33:57,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:57,396 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 11:33:58,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 11:33:58,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 11:33:58,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. 
Duration: 0.85 2023-10-03 11:33:58,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:03,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 11:34:05,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:34:06,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:34:06,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 11:34:09,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:34:11,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 11:34:11,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1252433.3333333333, ans=0.125 2023-10-03 11:34:12,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:34:17,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:34:17,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:34:17,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:34:19,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:34:21,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:34:21,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:34:22,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:34:24,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:34:24,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:34:25,336 INFO [train.py:1046] (1/4) Epoch 36, batch 1950, loss[loss=0.1645, simple_loss=0.252, pruned_loss=0.03847, over 24352.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.239, pruned_loss=0.04009, over 4727170.44 frames. ], batch size: 74, lr: 2.84e-03, grad_scale: 4.0 2023-10-03 11:34:27,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:34:27,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:34:28,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:34:28,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:34:31,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:34:34,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:34:35,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:35,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:34:36,447 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.88 vs. limit=15.0 2023-10-03 11:34:37,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 11:34:37,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:34:38,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:38,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:42,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:34:42,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:34:43,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:44,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:34:47,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:34:47,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:34:47,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:34:47,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:50,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:51,698 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.918e+02 2.114e+02 2.421e+02 3.439e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-03 11:34:53,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:34:53,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:34:53,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:34:53,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-10-03 11:34:54,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:34:54,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:34:54,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:56,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1252633.3333333333, ans=0.1 2023-10-03 11:34:58,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:35:01,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:35:04,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:35:07,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:35:07,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:35:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 11:35:07,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:35:11,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:35:13,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:35:13,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:35:20,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:22,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:25,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:25,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1252766.6666666667, ans=0.125 2023-10-03 11:35:26,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:35:27,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1252766.6666666667, ans=0.0 2023-10-03 11:35:30,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:35:30,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:35:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 11:35:31,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 11:35:31,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:35:31,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 11:35:34,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:35:37,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1252766.6666666667, ans=0.125 2023-10-03 11:35:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:35:39,759 INFO [train.py:1046] (1/4) Epoch 36, batch 2000, loss[loss=0.1523, simple_loss=0.232, pruned_loss=0.03628, over 24322.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2397, pruned_loss=0.04028, over 4721083.18 frames. ], batch size: 56, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:35:39,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:35:41,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:35:41,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1252833.3333333333, ans=0.125 2023-10-03 11:35:41,757 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.65 vs. 
limit=15.0 2023-10-03 11:35:42,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:35:43,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:47,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1252833.3333333333, ans=0.0 2023-10-03 11:35:48,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1252833.3333333333, ans=0.125 2023-10-03 11:35:49,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 11:35:49,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:35:52,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:35:54,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 11:35:56,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:35:56,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:35:59,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:36:00,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 11:36:02,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.07 vs. limit=15.0 2023-10-03 11:36:03,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:04,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:04,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:06,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 11:36:06,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:36:08,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 11:36:08,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:36:11,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:12,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:36:12,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:36:12,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:36:13,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:36:13,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 11:36:16,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 11:36:16,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:36:18,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:22,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:24,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:36:24,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:36:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:36:28,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:36:29,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:36:29,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:36:29,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:31,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:34,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:36:34,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 11:36:39,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:36:39,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:42,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:43,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:36:46,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:49,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:49,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:36:50,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:36:50,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:36:53,338 INFO [train.py:1046] (1/4) Epoch 36, batch 2050, loss[loss=0.1494, simple_loss=0.2324, pruned_loss=0.03318, over 24355.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2391, pruned_loss=0.04023, over 4712234.65 frames. ], batch size: 61, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:36:53,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:55,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:58,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:37:04,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:37:06,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:37:07,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:37:08,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 11:37:09,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 11:37:09,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:37:11,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:37:11,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:37:14,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1253233.3333333333, ans=0.125 2023-10-03 11:37:20,199 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.868e+02 2.079e+02 2.374e+02 3.501e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-03 11:37:20,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:37:20,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:37:21,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 11:37:24,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:37:26,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 11:37:26,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 11:37:29,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:37:34,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:37:35,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:37:36,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:37:37,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:37:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:37:37,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:37:38,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1253300.0, ans=0.0 2023-10-03 11:37:39,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1253366.6666666667, ans=0.125 2023-10-03 11:37:40,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:37:43,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:37:45,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 11:37:46,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:37:47,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.14 vs. limit=15.0 2023-10-03 11:37:50,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:37:55,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:37:55,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1253433.3333333333, ans=0.1 2023-10-03 11:37:58,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 11:38:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:38:04,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:38:06,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:38:07,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 11:38:10,446 INFO [train.py:1046] (1/4) Epoch 36, batch 2100, loss[loss=0.1561, simple_loss=0.2332, pruned_loss=0.03948, over 24626.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2372, pruned_loss=0.04003, over 4705840.07 frames. 
], batch size: 60, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:38:10,551 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 11:38:10,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:10,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:38:11,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:38:13,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1253500.0, ans=0.125 2023-10-03 11:38:14,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:38:14,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 11:38:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 11:38:16,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:38:19,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:38:20,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:38:21,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.62 vs. limit=15.0 2023-10-03 11:38:24,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:24,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:38:24,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 11:38:25,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:38:26,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 11:38:26,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 11:38:28,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:28,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:38:28,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 11:38:28,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 11:38:32,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1253566.6666666667, ans=0.125 2023-10-03 11:38:33,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 11:38:33,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:38:36,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:38:37,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:38:37,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1253566.6666666667, ans=0.125 2023-10-03 11:38:40,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:38:40,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 11:38:40,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-03 11:38:41,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1253633.3333333333, ans=0.125 2023-10-03 11:38:42,664 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=22.5 2023-10-03 11:38:43,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 11:38:43,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:44,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 11:38:44,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 11:38:44,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 11:38:47,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:38:49,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:38:52,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:38:53,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:38:55,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:38:55,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:55,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 11:38:55,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:55,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1253700.0, ans=0.125 2023-10-03 11:38:56,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:56,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:58,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 11:39:01,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 11:39:01,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 11:39:04,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:39:06,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1253700.0, ans=0.09899494936611666 2023-10-03 11:39:07,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 11:39:08,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 11:39:09,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1253766.6666666667, ans=0.125 2023-10-03 11:39:11,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:14,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:39:14,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:39:14,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:39:14,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 11:39:14,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:39:14,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1253766.6666666667, ans=0.125 2023-10-03 11:39:17,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:17,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 11:39:17,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:39:18,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1253766.6666666667, ans=0.125 2023-10-03 11:39:19,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:20,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 11:39:22,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 11:39:22,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:24,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:39:24,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:39:25,340 INFO [train.py:1046] (1/4) Epoch 36, batch 2150, loss[loss=0.1531, simple_loss=0.2446, pruned_loss=0.03085, over 24646.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2362, pruned_loss=0.03941, over 4710062.23 frames. ], batch size: 73, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:39:25,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:39:25,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 11:39:30,182 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:39:31,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:39:33,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:34,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:34,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1253833.3333333333, ans=0.125 2023-10-03 11:39:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:39:37,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:37,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:39:40,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:41,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:39:41,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:39:44,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:39:45,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 11:39:50,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:39:51,383 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.841e+02 2.012e+02 2.249e+02 3.397e+02, threshold=4.024e+02, percent-clipped=0.0 2023-10-03 11:39:51,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:39:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:52,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:39:54,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:54,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1253966.6666666667, ans=0.0 2023-10-03 11:39:55,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.82 vs. limit=15.0 2023-10-03 11:39:55,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:39:55,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:39:55,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:39:57,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:58,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 11:40:00,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:40:01,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:03,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:04,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.39 vs. limit=12.0 2023-10-03 11:40:05,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:40:05,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:40:06,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:06,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:40:09,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:40:09,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 11:40:09,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:40:11,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1254033.3333333333, ans=0.0 2023-10-03 11:40:11,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1254033.3333333333, ans=0.125 2023-10-03 11:40:12,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:40:13,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:13,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:40:14,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:40:16,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:16,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:17,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 11:40:19,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. 
Duration: 0.91 2023-10-03 11:40:19,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:40:20,361 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 11:40:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:21,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:40:21,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1254100.0, ans=0.125 2023-10-03 11:40:23,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 11:40:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:40:23,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 11:40:23,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 11:40:23,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 11:40:23,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 11:40:25,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:40:26,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:40:26,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:40:28,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:28,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:40:29,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:29,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:37,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:40:37,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 11:40:38,512 INFO [train.py:1046] (1/4) Epoch 36, batch 2200, loss[loss=0.1778, simple_loss=0.2495, pruned_loss=0.0531, over 23818.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2371, pruned_loss=0.03978, over 4713747.88 frames. ], batch size: 164, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:40:40,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:40:44,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:40:45,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:40:45,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:46,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:40:48,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:49,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:49,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 11:40:56,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 11:40:58,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:41:03,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 11:41:06,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:08,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:41:08,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 11:41:10,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:41:10,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 11:41:15,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:41:15,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:16,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 11:41:19,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:41:20,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1254300.0, ans=0.125 2023-10-03 11:41:22,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:41:22,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:41:24,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:26,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 11:41:26,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:41:27,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 11:41:28,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:28,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:41:28,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:31,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:41:32,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:41:32,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:32,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:33,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:41:33,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:41:36,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 11:41:39,527 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.04 vs. limit=6.0 2023-10-03 11:41:40,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:41:41,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:41:41,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1254433.3333333333, ans=0.0 2023-10-03 11:41:44,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:41:45,488 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 11:41:48,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:41:48,265 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 11:41:49,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:41:49,621 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 11:41:51,359 INFO [train.py:1046] (1/4) Epoch 36, batch 2250, loss[loss=0.1395, simple_loss=0.2172, pruned_loss=0.03095, over 24448.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2374, pruned_loss=0.03983, over 4714689.97 frames. 
], batch size: 58, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:41:51,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1254500.0, ans=0.125 2023-10-03 11:41:51,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1254500.0, ans=0.125 2023-10-03 11:41:52,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:52,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:41:54,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:56,773 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 11:41:56,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:42:00,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:42:05,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:42:05,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:42:09,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:11,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 11:42:11,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:42:12,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 11:42:13,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:42:13,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:42:16,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 11:42:16,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:42:16,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:17,886 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.895e+02 2.080e+02 2.389e+02 3.595e+02, threshold=4.160e+02, percent-clipped=0.0 2023-10-03 11:42:19,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:42:19,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1254633.3333333333, ans=0.0 2023-10-03 11:42:23,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:42:25,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 11:42:25,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:42:26,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 11:42:28,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:31,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:42:34,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:42:36,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:42:38,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:42:38,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:42:40,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:42:42,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:42:45,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1254700.0, ans=0.2 2023-10-03 11:42:46,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 11:42:46,863 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.67 vs. limit=10.0 2023-10-03 11:42:47,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:42:49,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1254766.6666666667, ans=0.0 2023-10-03 11:42:50,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1254766.6666666667, ans=0.1 2023-10-03 11:42:51,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:42:51,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:42:53,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:42:57,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:42:59,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:43:01,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 11:43:01,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:43:01,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:43:04,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 11:43:06,075 INFO [train.py:1046] (1/4) Epoch 36, batch 2300, loss[loss=0.1612, simple_loss=0.2501, pruned_loss=0.03615, over 24653.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2383, pruned_loss=0.0401, over 4721360.62 frames. ], batch size: 68, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:43:07,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:43:07,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:12,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1254833.3333333333, ans=0.1 2023-10-03 11:43:14,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:14,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:43:18,536 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 11:43:19,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:25,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:43:25,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:43:26,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:43:26,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:26,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 11:43:26,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:43:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:43:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:43:34,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:43:37,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:43:40,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:43:43,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:43:44,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:43:47,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:43:50,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:51,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:43:52,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1255033.3333333333, ans=0.0 2023-10-03 11:43:54,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:43:54,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:43:54,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 11:43:59,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:43:59,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:00,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:44:00,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:44:02,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 11:44:02,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:44:02,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 11:44:02,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:44:02,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:02,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 11:44:05,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.67 vs. limit=12.0 2023-10-03 11:44:08,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:44:10,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1255100.0, ans=0.125 2023-10-03 11:44:11,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:44:14,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:44:15,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 11:44:15,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:44:17,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:44:17,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:44:17,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:44:18,613 INFO [train.py:1046] (1/4) Epoch 36, batch 2350, loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03671, over 23593.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2391, pruned_loss=0.04027, over 4724101.35 frames. ], batch size: 149, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:44:18,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 11:44:24,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1255166.6666666667, ans=0.125 2023-10-03 11:44:26,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:44:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 11:44:31,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 11:44:32,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.30 vs. 
limit=15.0 2023-10-03 11:44:35,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:39,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:39,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:39,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:44:39,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:44:39,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1255233.3333333333, ans=0.125 2023-10-03 11:44:39,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1255233.3333333333, ans=0.125 2023-10-03 11:44:40,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 11:44:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:44:45,734 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.891e+02 2.082e+02 2.272e+02 3.750e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 11:44:47,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1255300.0, ans=0.125 2023-10-03 11:44:48,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 11:44:49,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:44:53,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:44:54,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:44:55,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:44:56,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 11:44:56,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:44:59,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:44:59,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:45:00,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:45:02,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:45:07,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 11:45:07,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:45:09,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1255366.6666666667, ans=0.0 2023-10-03 11:45:10,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:45:10,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:45:11,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 11:45:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:45:15,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 11:45:15,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 11:45:18,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1255433.3333333333, ans=0.2 2023-10-03 11:45:19,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 11:45:22,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 11:45:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:45:23,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:45:23,495 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 11:45:23,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 11:45:23,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1255433.3333333333, ans=0.125 2023-10-03 11:45:24,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 11:45:28,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:45:32,233 INFO [train.py:1046] (1/4) Epoch 36, batch 2400, loss[loss=0.1355, simple_loss=0.1947, pruned_loss=0.03819, over 19782.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2384, pruned_loss=0.04025, over 4721979.47 frames. 
], batch size: 388, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:45:32,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:45:37,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:45:39,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:45:39,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 11:45:40,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 11:45:40,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1255500.0, ans=0.125 2023-10-03 11:45:46,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1255566.6666666667, ans=0.0 2023-10-03 11:45:47,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:45:47,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:45:50,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 11:45:50,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 11:45:50,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:45:52,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 11:45:57,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:45:59,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 11:46:02,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:46:07,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 11:46:09,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:46:11,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:15,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:46:15,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 11:46:16,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:46:22,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:46:23,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:46:25,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1255700.0, ans=0.2 2023-10-03 11:46:28,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:46:29,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:46:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:46:29,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:46:29,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:29,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:46:29,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:46:31,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1255766.6666666667, ans=0.0 2023-10-03 11:46:31,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.22 vs. 
limit=15.0 2023-10-03 11:46:32,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:46:32,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:46:34,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 11:46:36,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 11:46:37,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:46:37,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:39,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 11:46:40,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 11:46:40,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 11:46:42,255 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 11:46:42,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 11:46:43,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 11:46:43,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:45,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:46:45,207 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 11:46:45,514 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:46:46,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:47,880 INFO [train.py:1046] (1/4) Epoch 36, batch 2450, loss[loss=0.1426, simple_loss=0.2243, pruned_loss=0.03047, over 24681.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2374, pruned_loss=0.04016, over 4719696.08 frames. ], batch size: 65, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:46:47,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:46:50,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:46:50,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:46:53,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:46:55,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 11:46:55,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 11:46:56,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1255833.3333333333, ans=0.2 2023-10-03 11:46:59,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:46:59,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:03,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:47:03,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:47:03,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:47:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 11:47:11,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:12,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:47:12,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:47:15,052 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.962e+02 2.115e+02 2.391e+02 5.447e+02, threshold=4.229e+02, percent-clipped=1.0 2023-10-03 11:47:16,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:47:16,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:17,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:17,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:47:20,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 11:47:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:47:23,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1255966.6666666667, ans=0.0 2023-10-03 11:47:28,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.00 vs. limit=15.0 2023-10-03 11:47:30,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:30,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:47:30,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:47:31,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:47:31,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:31,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:47:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 11:47:36,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:38,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:47:41,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:47:41,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:47:45,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:47:45,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 11:47:46,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 11:47:48,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:47:48,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 11:47:49,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:47:50,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:47:54,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:47:57,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:57,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:47:58,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.60 vs. limit=15.0 2023-10-03 11:48:00,573 INFO [train.py:1046] (1/4) Epoch 36, batch 2500, loss[loss=0.1534, simple_loss=0.2371, pruned_loss=0.03489, over 24478.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2369, pruned_loss=0.03961, over 4719885.17 frames. ], batch size: 63, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:48:00,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 11:48:02,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 11:48:06,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1256166.6666666667, ans=0.2 2023-10-03 11:48:08,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:48:18,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:48:18,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:48:20,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:48:20,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 11:48:25,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:48:25,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:48:27,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:48:27,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 11:48:28,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 11:48:29,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:48:29,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:48:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 11:48:29,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:31,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 11:48:31,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:36,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:48:36,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1256300.0, ans=0.125 2023-10-03 11:48:37,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:48:40,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:48:40,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 11:48:42,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:48:44,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:48:48,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:51,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:54,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:48:59,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:49:02,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 11:49:02,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:49:02,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:49:05,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:49:05,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:49:05,864 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 11:49:05,864 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 11:49:05,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. 
Duration: 0.7 2023-10-03 11:49:09,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:49:10,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 11:49:10,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 11:49:10,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:49:10,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 11:49:11,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.01 vs. limit=12.0 2023-10-03 11:49:15,080 INFO [train.py:1046] (1/4) Epoch 36, batch 2550, loss[loss=0.1669, simple_loss=0.2407, pruned_loss=0.0466, over 23785.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2373, pruned_loss=0.03947, over 4720693.40 frames. ], batch size: 212, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:49:15,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 11:49:18,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:49:19,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:49:21,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 11:49:22,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:49:24,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 11:49:24,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:49:24,570 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.92 vs. limit=12.0 2023-10-03 11:49:26,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 11:49:28,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:49:30,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:34,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:49:34,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 11:49:34,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:49:34,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:49:34,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:49:37,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:49:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 11:49:38,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:49:38,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:38,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 11:49:43,473 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.881e+02 2.105e+02 2.315e+02 3.420e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-03 11:49:49,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:49:51,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1256633.3333333333, ans=0.125 2023-10-03 11:49:54,338 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 11:49:56,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:49:56,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:49:56,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:49:56,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1256633.3333333333, ans=0.025 2023-10-03 11:49:57,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:50:03,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:50:03,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1256700.0, ans=0.125 2023-10-03 11:50:06,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:50:06,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:50:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:50:07,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:50:07,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:50:13,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:50:13,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:50:16,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1256766.6666666667, ans=0.125 2023-10-03 11:50:17,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:50:17,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 11:50:17,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:50:19,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:50:19,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:50:20,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:50:22,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:50:22,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1256766.6666666667, ans=0.125 2023-10-03 11:50:26,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1256766.6666666667, ans=0.2 2023-10-03 11:50:27,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:50:29,094 INFO [train.py:1046] (1/4) Epoch 36, batch 2600, loss[loss=0.1602, simple_loss=0.2538, pruned_loss=0.03332, over 24312.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2379, pruned_loss=0.03968, over 4715486.42 frames. ], batch size: 74, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:50:30,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:31,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 11:50:33,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1256833.3333333333, ans=0.125 2023-10-03 11:50:36,035 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 11:50:36,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:50:36,092 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 11:50:37,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 11:50:37,946 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. 
Duration: 0.768 2023-10-03 11:50:39,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:50:39,582 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 11:50:40,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 11:50:42,865 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 11:50:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:50:44,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 11:50:47,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 11:50:48,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:50:48,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 11:50:51,259 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 11:50:52,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 11:50:58,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:50:58,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:50:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:50:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 11:50:59,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:51:01,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1256966.6666666667, ans=0.0 2023-10-03 11:51:02,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1256966.6666666667, ans=0.0 2023-10-03 11:51:03,794 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 11:51:09,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:51:10,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:10,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 11:51:10,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:51:10,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:51:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. 
Duration: 0.8 2023-10-03 11:51:16,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:51:17,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:51:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:21,646 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 11:51:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:22,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:51:25,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:51:27,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:51:27,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 11:51:27,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:51:28,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:51:30,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 11:51:34,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 11:51:35,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:37,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:51:40,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 11:51:42,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:43,354 INFO [train.py:1046] (1/4) Epoch 36, batch 2650, loss[loss=0.1724, simple_loss=0.2447, pruned_loss=0.05007, over 23544.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04006, over 4721049.49 frames. ], batch size: 256, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:51:43,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:51:43,463 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 11:51:43,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:51:46,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:48,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-03 11:51:49,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:51:51,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:52,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 11:51:53,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:51:53,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:51:57,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 11:51:57,826 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 11:52:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:02,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 11:52:02,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:03,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 11:52:08,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:52:08,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 11:52:09,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:09,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:11,329 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.883e+02 2.080e+02 2.401e+02 3.232e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-03 11:52:14,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 11:52:14,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 11:52:16,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:52:20,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1257300.0, ans=0.09899494936611666 2023-10-03 11:52:22,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 11:52:22,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:23,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:23,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 11:52:23,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:52:24,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:26,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:52:27,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:52:27,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1257366.6666666667, ans=0.0 2023-10-03 11:52:28,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:52:30,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:52:31,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:52:33,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:33,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1257366.6666666667, ans=15.0 2023-10-03 11:52:35,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 11:52:35,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1257366.6666666667, ans=0.125 2023-10-03 11:52:36,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:38,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:52:39,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:52:42,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:42,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1257433.3333333333, ans=0.125 2023-10-03 11:52:43,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:52:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:43,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 11:52:43,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1257433.3333333333, ans=0.05 2023-10-03 11:52:48,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:52:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:49,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:51,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:52:52,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:52:53,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:52:55,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:52:55,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 11:52:56,981 INFO [train.py:1046] (1/4) Epoch 36, batch 2700, loss[loss=0.1575, simple_loss=0.2274, pruned_loss=0.04378, over 22726.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2394, pruned_loss=0.04008, over 4735689.21 frames. ], batch size: 322, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:52:58,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:53:01,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 11:53:02,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.46 vs. 
limit=15.0 2023-10-03 11:53:02,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:53:03,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:03,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:04,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1257500.0, ans=0.09899494936611666 2023-10-03 11:53:05,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:53:05,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:53:05,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:53:05,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:53:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 11:53:07,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:53:10,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:53:10,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 11:53:10,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:53:12,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:53:13,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 11:53:14,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:53:20,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:53:20,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:53:26,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:53:26,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:53:28,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:53:28,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:53:30,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:53:32,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 11:53:32,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:53:32,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:53:37,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:37,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:53:45,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.10 vs. limit=15.0 2023-10-03 11:53:46,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:53:46,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:53:52,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:53:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:53:55,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 11:53:55,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1257766.6666666667, ans=0.125 2023-10-03 11:53:56,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:53:57,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:53:59,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:53:59,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:59,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:54:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:54:03,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:54:03,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 11:54:03,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1257766.6666666667, ans=0.0 2023-10-03 11:54:05,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1257766.6666666667, ans=0.125 2023-10-03 11:54:06,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 11:54:07,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:10,710 INFO [train.py:1046] (1/4) Epoch 36, batch 2750, loss[loss=0.151, simple_loss=0.2116, pruned_loss=0.04516, over 22733.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2385, pruned_loss=0.04005, over 4735354.78 frames. ], batch size: 322, lr: 2.83e-03, grad_scale: 4.0 2023-10-03 11:54:10,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:54:10,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 11:54:13,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 11:54:13,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:13,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1257833.3333333333, ans=0.1 2023-10-03 11:54:15,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:54:16,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:54:17,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:17,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:54:18,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:23,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:54:23,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:54:23,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:54:23,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:23,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 11:54:23,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:54:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:30,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. 
Duration: 0.46 2023-10-03 11:54:30,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:54:31,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:31,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:54:31,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:54:33,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:54:35,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:54:35,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:36,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:39,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:54:39,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 11:54:40,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.983e+02 2.239e+02 2.641e+02 5.389e+02, threshold=4.478e+02, percent-clipped=1.0 2023-10-03 11:54:40,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:54:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:42,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:54:43,951 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 11:54:47,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:51,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:54:51,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:54:54,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:54,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:54:56,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 11:55:00,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:55:00,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:55:00,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 11:55:01,378 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.89 vs. limit=15.0 2023-10-03 11:55:06,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 11:55:12,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1258100.0, ans=0.125 2023-10-03 11:55:14,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:55:16,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:55:16,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 11:55:17,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:55:18,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 11:55:18,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 11:55:18,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:55:24,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 11:55:25,394 INFO [train.py:1046] (1/4) Epoch 36, batch 2800, loss[loss=0.1682, simple_loss=0.2515, pruned_loss=0.0424, over 24049.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2375, pruned_loss=0.03969, over 4727556.46 frames. ], batch size: 80, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:55:25,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:25,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:55:26,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 11:55:26,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:55:26,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1258166.6666666667, ans=0.125 2023-10-03 11:55:27,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:28,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 11:55:28,795 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 11:55:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 11:55:29,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.02 vs. limit=12.0 2023-10-03 11:55:31,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:34,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:55:35,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:55:38,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:55:41,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 11:55:43,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 11:55:44,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 11:55:46,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:46,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 11:55:46,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:55:50,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:55:51,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:51,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:55:51,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:55:54,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.44 vs. limit=15.0 2023-10-03 11:56:01,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:56:02,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:56:05,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:05,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:56:05,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 11:56:11,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:56:11,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 11:56:11,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:12,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:56:12,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:56:15,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:17,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:21,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:56:22,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:56:24,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:24,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:56:24,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 11:56:24,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:56:24,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:56:24,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 11:56:26,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:56:28,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:56:28,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:56:28,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 11:56:29,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:29,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:56:30,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:56:32,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-10-03 11:56:32,635 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:56:36,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:56:36,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:56:37,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:56:39,667 INFO [train.py:1046] (1/4) Epoch 36, batch 2850, loss[loss=0.1457, simple_loss=0.2184, pruned_loss=0.03652, over 23482.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2368, pruned_loss=0.03978, over 4716964.10 frames. ], batch size: 285, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:56:41,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:56:42,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1258500.0, ans=0.0 2023-10-03 11:56:43,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1258500.0, ans=0.05 2023-10-03 11:56:45,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:56:46,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:56:46,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 11:56:48,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:49,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:50,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:56:52,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 11:56:58,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 11:56:58,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:01,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 11:57:02,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:04,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 11:57:04,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 11:57:05,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 11:57:08,330 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.821e+02 1.991e+02 2.148e+02 3.259e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-03 11:57:13,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1258633.3333333333, ans=0.1 2023-10-03 11:57:17,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:57:17,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:57:17,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:57:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:57:19,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:57:19,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:57:21,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:57:21,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 11:57:24,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 11:57:24,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:57:25,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:57:25,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:26,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1258700.0, ans=0.1 2023-10-03 11:57:29,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:57:29,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:57:31,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:32,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:57:34,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:57:35,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:35,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:38,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 11:57:44,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:57:45,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 11:57:45,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 11:57:46,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:57:47,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:57:47,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 11:57:48,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:57:49,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:57:49,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:57:49,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:57:49,739 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 11:57:51,093 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. 
Duration: 0.896 2023-10-03 11:57:51,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:57:51,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:52,537 INFO [train.py:1046] (1/4) Epoch 36, batch 2900, loss[loss=0.1613, simple_loss=0.2543, pruned_loss=0.03415, over 24305.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2375, pruned_loss=0.0397, over 4723063.52 frames. ], batch size: 74, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:57:53,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:57:53,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:57:54,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:57:55,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 11:57:57,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1258833.3333333333, ans=0.1 2023-10-03 11:57:59,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:59,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 11:58:01,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. 
Duration: 0.98 2023-10-03 11:58:01,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1258833.3333333333, ans=0.09899494936611666 2023-10-03 11:58:02,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:58:02,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:58:02,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1258833.3333333333, ans=0.125 2023-10-03 11:58:03,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:58:05,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:58:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:58:09,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:58:12,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:58:12,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 11:58:12,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 11:58:14,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:14,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1258900.0, ans=0.125 2023-10-03 11:58:15,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 11:58:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 11:58:19,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:58:19,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 11:58:19,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:58:21,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.43 vs. limit=15.0 2023-10-03 11:58:22,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:58:22,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:58:25,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:58:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 11:58:30,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:58:31,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:58:32,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1258966.6666666667, ans=0.125 2023-10-03 11:58:33,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 11:58:33,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 11:58:33,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:58:36,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:58:39,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 11:58:40,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:58:41,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1259033.3333333333, ans=0.1 2023-10-03 11:58:46,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:52,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 11:58:53,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:58:53,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 11:58:58,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:58:58,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 11:58:58,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:58:59,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.11 vs. limit=15.0 2023-10-03 11:59:00,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:59:04,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:59:05,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 11:59:06,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1259166.6666666667, ans=0.0 2023-10-03 11:59:07,056 INFO [train.py:1046] (1/4) Epoch 36, batch 2950, loss[loss=0.1692, simple_loss=0.2593, pruned_loss=0.03956, over 24540.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2385, pruned_loss=0.03989, over 4721401.59 frames. 
], batch size: 71, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:59:07,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:59:07,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:08,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:08,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1259166.6666666667, ans=0.025 2023-10-03 11:59:10,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:59:13,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 11:59:13,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 11:59:14,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:59:14,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:59:20,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:59:21,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 11:59:23,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:59:23,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:59:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:59:26,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:59:29,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:29,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:29,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:59:32,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 11:59:34,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1259233.3333333333, ans=0.0 2023-10-03 11:59:36,561 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.912e+02 2.122e+02 2.362e+02 3.535e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-03 11:59:36,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 11:59:36,703 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. 
Duration: 0.512 2023-10-03 11:59:37,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:59:39,380 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 11:59:41,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 11:59:41,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:59:42,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:59:42,824 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 11:59:42,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:59:45,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 11:59:45,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:59:46,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:59:49,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:51,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 11:59:51,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1259366.6666666667, ans=0.125 2023-10-03 11:59:52,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:59:52,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 11:59:53,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:53,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 11:59:59,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:00:01,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:00:03,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 12:00:03,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:00:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 12:00:07,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:00:08,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:00:10,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:00:11,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:00:12,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:00:14,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:00:15,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:15,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:00:15,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:00:17,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:00:17,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:00:18,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 12:00:20,330 INFO [train.py:1046] (1/4) Epoch 36, batch 3000, loss[loss=0.1655, simple_loss=0.2453, pruned_loss=0.04279, over 23299.00 frames. 
], tot_loss[loss=0.1602, simple_loss=0.2394, pruned_loss=0.04045, over 4726799.87 frames. ], batch size: 105, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:00:20,330 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 12:00:31,799 INFO [train.py:1078] (1/4) Epoch 36, validation: loss=0.3578, simple_loss=0.2691, pruned_loss=0.2232, over 1125622.00 frames. 2023-10-03 12:00:31,800 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 12:00:31,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:33,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:00:34,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:00:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 12:00:38,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 12:00:41,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:00:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:00:42,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 12:00:42,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:00:49,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:00:50,647 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:00:55,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1259566.6666666667, ans=0.125 2023-10-03 12:00:56,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1259566.6666666667, ans=0.2 2023-10-03 12:00:58,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:01:06,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 12:01:08,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:01:09,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:01:11,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:01:11,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:01:12,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:01:12,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. 
Duration: 0.85 2023-10-03 12:01:14,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 12:01:15,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:01:17,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:01:18,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:01:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:01:20,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:20,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:01:22,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:01:22,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:01:22,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:01:24,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:01:27,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. 
Duration: 0.71 2023-10-03 12:01:29,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:01:29,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:29,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:01:33,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:35,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:36,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 12:01:37,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 12:01:37,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:01:37,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 12:01:38,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:01:41,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 12:01:42,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 12:01:43,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:01:44,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 12:01:45,288 INFO [train.py:1046] (1/4) Epoch 36, batch 3050, loss[loss=0.1708, simple_loss=0.2504, pruned_loss=0.04566, over 23743.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2402, pruned_loss=0.04065, over 4734908.53 frames. ], batch size: 85, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:01:45,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 12:01:45,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:01:45,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:01:47,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:47,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:01:47,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:47,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:01:51,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. 
Duration: 0.81 2023-10-03 12:01:52,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:01:54,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:01:55,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:01:58,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:59,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 12:02:06,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 12:02:06,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 12:02:06,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:09,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:02:12,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:12,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:02:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 12:02:14,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1259966.6666666667, ans=0.0 2023-10-03 12:02:15,670 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.931e+02 2.120e+02 2.392e+02 4.197e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-03 12:02:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:02:15,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:02:17,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:17,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:02:17,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:17,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1259966.6666666667, ans=0.0 2023-10-03 12:02:19,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:21,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:23,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:24,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. 
Duration: 0.96 2023-10-03 12:02:24,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:24,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:02:24,717 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:02:26,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1259966.6666666667, ans=0.2 2023-10-03 12:02:28,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:02:28,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:02:30,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:02:30,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:35,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:35,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:42,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:43,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:02:43,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:45,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:02:47,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:02:48,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:02:48,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 12:02:49,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:02:49,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:51,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 12:02:52,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:58,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:59,598 INFO [train.py:1046] (1/4) Epoch 36, batch 3100, loss[loss=0.143, simple_loss=0.205, pruned_loss=0.04048, over 19737.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2393, pruned_loss=0.04081, over 4714493.64 frames. 
], batch size: 388, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:02:59,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:03:02,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:03:05,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 12:03:07,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 12:03:08,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 12:03:08,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:03:10,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1260166.6666666667, ans=0.07 2023-10-03 12:03:11,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:03:11,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:13,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 12:03:15,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1260233.3333333333, ans=0.125 2023-10-03 12:03:16,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:22,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 12:03:22,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1260233.3333333333, ans=0.125 2023-10-03 12:03:26,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:03:27,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:29,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:03:29,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:03:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 12:03:32,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:03:32,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 12:03:32,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 12:03:34,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:35,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 12:03:35,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:03:39,155 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.88 vs. limit=10.0 2023-10-03 12:03:39,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:03:41,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 12:03:43,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 12:03:44,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:45,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:47,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:03:47,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:03:49,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:03:49,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:03:49,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:03:52,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:03:52,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:03:52,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:52,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:03:57,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:03:57,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 12:03:58,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1260433.3333333333, ans=0.125 2023-10-03 12:04:00,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 12:04:01,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 12:04:01,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:01,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:01,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 12:04:09,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.74 vs. limit=15.0 2023-10-03 12:04:11,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 12:04:11,277 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:04:14,242 INFO [train.py:1046] (1/4) Epoch 36, batch 3150, loss[loss=0.1443, simple_loss=0.2221, pruned_loss=0.0333, over 24511.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2388, pruned_loss=0.04003, over 4717448.02 frames. ], batch size: 58, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:04:14,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:14,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:04:14,657 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:04:17,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:04:17,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:04:17,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 12:04:17,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:19,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:04:19,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 12:04:21,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:24,652 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 12:04:27,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 12:04:28,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:04:28,841 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. 
Duration: 0.992 2023-10-03 12:04:30,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 12:04:31,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 12:04:31,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 12:04:31,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 12:04:31,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:31,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:04:33,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1260566.6666666667, ans=0.125 2023-10-03 12:04:34,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:35,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 12:04:36,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:37,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:38,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 12:04:41,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:04:42,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 12:04:44,296 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.839e+02 2.057e+02 2.261e+02 3.088e+02, threshold=4.113e+02, percent-clipped=0.0 2023-10-03 12:04:44,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:04:45,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:04:45,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:04:47,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 12:04:49,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 12:04:50,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:04:50,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:04:50,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:04:51,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:04:51,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:04:53,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:04:53,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:04:54,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 12:04:55,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:04:55,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:04:58,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:04:58,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:05:00,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 12:05:01,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:02,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 12:05:04,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:05:04,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 12:05:05,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 12:05:07,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:05:08,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:10,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 12:05:12,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 12:05:12,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:05:16,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:05:16,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:16,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:05:21,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:05:22,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:05:24,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 12:05:28,489 INFO [train.py:1046] (1/4) Epoch 36, batch 3200, loss[loss=0.137, simple_loss=0.2166, pruned_loss=0.02868, over 24435.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2373, pruned_loss=0.03968, over 4712149.12 frames. ], batch size: 58, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 12:05:28,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:05:28,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 12:05:30,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.60 vs. limit=15.0 2023-10-03 12:05:32,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:34,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:05:34,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 12:05:37,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:39,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1260833.3333333333, ans=0.5 2023-10-03 12:05:41,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 12:05:44,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:52,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:06:00,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 12:06:02,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:06:03,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 12:06:04,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:06:08,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:06:08,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:06:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:06:13,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 12:06:16,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 12:06:18,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. 
Duration: 0.82 2023-10-03 12:06:19,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 12:06:23,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:06:28,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:06:28,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:06:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:06:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 12:06:29,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:06:34,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:06:34,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 12:06:36,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 12:06:36,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 12:06:37,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. 
Duration: 0.9 2023-10-03 12:06:39,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:06:42,439 INFO [train.py:1046] (1/4) Epoch 36, batch 3250, loss[loss=0.1779, simple_loss=0.2495, pruned_loss=0.05311, over 23823.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2369, pruned_loss=0.03993, over 4708012.30 frames. ], batch size: 179, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 12:06:42,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:06:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 12:06:42,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:06:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:06:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 12:06:49,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:06:51,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:06:52,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1261166.6666666667, ans=0.1 2023-10-03 12:07:00,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:07:01,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 12:07:01,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:03,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:07:03,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:07:03,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:07:03,947 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.31 vs. limit=22.5 2023-10-03 12:07:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:07:06,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:06,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:07:06,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:06,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:07:06,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:08,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:07:11,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:12,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:07:13,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1261300.0, ans=0.125 2023-10-03 12:07:13,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1261300.0, ans=0.1 2023-10-03 12:07:14,172 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.976e+02 2.164e+02 2.550e+02 4.020e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 12:07:14,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:14,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:17,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:17,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:07:17,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:07:21,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 12:07:23,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:07:23,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:07:24,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:26,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:07:26,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1261366.6666666667, ans=0.2 2023-10-03 12:07:31,564 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.02 vs. limit=12.0 2023-10-03 12:07:31,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:07:37,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:07:39,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:39,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. 
Duration: 0.96 2023-10-03 12:07:39,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:07:39,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:07:39,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:42,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 12:07:42,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 12:07:43,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:07:43,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:44,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:45,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:07:45,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:46,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1261433.3333333333, ans=0.09899494936611666 2023-10-03 12:07:48,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:07:49,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:07:49,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 12:07:51,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:07:53,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:07:53,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 12:07:57,711 INFO [train.py:1046] (1/4) Epoch 36, batch 3300, loss[loss=0.1983, simple_loss=0.2637, pruned_loss=0.06645, over 19462.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2379, pruned_loss=0.04003, over 4703410.10 frames. ], batch size: 389, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:07:57,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:07:57,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 12:08:00,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 12:08:01,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 12:08:01,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:08:04,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:08:05,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:08:07,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:08,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 12:08:08,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:08:08,888 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:08:11,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:08:16,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 12:08:17,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:08:17,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:17,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:08:19,157 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 12:08:19,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:08:20,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:08:21,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:08:21,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:08:21,824 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 12:08:26,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:26,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:08:27,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:27,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 12:08:27,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 12:08:29,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:08:30,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:08:32,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1261633.3333333333, ans=0.2 2023-10-03 12:08:33,441 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 12:08:34,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 12:08:34,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:08:35,853 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.18 vs. limit=10.0 2023-10-03 12:08:36,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 12:08:37,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:08:38,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1261633.3333333333, ans=0.0 2023-10-03 12:08:41,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:08:42,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:08:44,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:08:44,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:44,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:44,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:08:44,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1261700.0, ans=0.125 2023-10-03 12:08:47,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:08:47,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:48,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:08:48,621 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 12:08:51,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 12:08:52,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:08:52,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:08:52,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:08:54,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:08:55,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:08:55,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:08:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:08:57,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:58,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:09:01,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 12:09:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:02,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:04,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1261766.6666666667, ans=0.0 2023-10-03 12:09:05,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 12:09:05,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:09:07,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:09,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:09:09,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:10,286 INFO [train.py:1046] (1/4) Epoch 36, batch 3350, loss[loss=0.1465, simple_loss=0.2375, pruned_loss=0.02779, over 24450.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2387, pruned_loss=0.04011, over 4697534.80 frames. ], batch size: 63, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:09:12,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:09:13,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:14,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1261833.3333333333, ans=0.0 2023-10-03 12:09:15,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:09:15,321 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:09:19,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:09:20,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:09:23,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:23,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:09:24,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 12:09:25,089 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 12:09:25,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:28,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 12:09:28,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 12:09:30,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:09:30,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:09:31,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:31,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. 
Duration: 0.81 2023-10-03 12:09:33,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:33,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:09:34,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1261900.0, ans=0.125 2023-10-03 12:09:35,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:35,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:35,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:37,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:09:37,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1261900.0, ans=0.1 2023-10-03 12:09:40,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:41,973 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.875e+02 2.056e+02 2.286e+02 3.351e+02, threshold=4.113e+02, percent-clipped=0.0 2023-10-03 12:09:42,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:09:42,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:46,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:09:48,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:50,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:50,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:53,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:55,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 12:09:55,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:09:55,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 12:09:55,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:09:55,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 12:09:58,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:09:58,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:10:04,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1262033.3333333333, ans=0.0 2023-10-03 12:10:05,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:10:07,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 12:10:07,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:10:08,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:10:08,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:10:15,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:10:16,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 12:10:17,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:10:17,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:10:19,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:10:19,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 12:10:20,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:10:20,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 12:10:23,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:10:24,774 INFO [train.py:1046] (1/4) Epoch 36, batch 3400, loss[loss=0.1523, simple_loss=0.232, pruned_loss=0.0363, over 23401.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2396, pruned_loss=0.04029, over 4704510.23 frames. ], batch size: 119, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:10:24,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:10:26,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:10:27,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:10:27,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 12:10:33,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 12:10:34,895 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. 
Duration: 0.953 2023-10-03 12:10:34,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:10:35,653 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.38 vs. limit=15.0 2023-10-03 12:10:37,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.09 vs. limit=15.0 2023-10-03 12:10:38,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:10:38,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:10:39,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:10:39,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:10:47,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:10:48,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 12:10:48,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1262233.3333333333, ans=0.0 2023-10-03 12:10:52,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 12:10:53,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:10:55,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:10:55,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:11:00,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:11:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 12:11:08,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1262366.6666666667, ans=0.1 2023-10-03 12:11:11,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:11:13,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:11:13,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 12:11:14,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:11:14,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:11:15,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:11:16,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:11:19,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:11:22,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:11:22,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:11:26,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:11:29,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 12:11:32,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1262433.3333333333, ans=0.125 2023-10-03 12:11:33,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:11:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 12:11:39,317 INFO [train.py:1046] (1/4) Epoch 36, batch 3450, loss[loss=0.1542, simple_loss=0.229, pruned_loss=0.0397, over 24278.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2398, pruned_loss=0.04062, over 4697683.18 frames. 
], batch size: 56, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:11:39,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1262500.0, ans=0.1 2023-10-03 12:11:40,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 12:11:40,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:11:42,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:11:42,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 12:11:44,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:11:47,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:11:47,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1262500.0, ans=0.125 2023-10-03 12:11:49,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1262500.0, ans=0.1 2023-10-03 12:11:53,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:11:53,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:11:54,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 12:11:54,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:11:55,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.35 vs. limit=6.0 2023-10-03 12:11:56,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:12:01,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 12:12:06,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1262566.6666666667, ans=0.1 2023-10-03 12:12:08,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 12:12:08,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:12:09,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:12:10,548 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.861e+02 2.014e+02 2.167e+02 2.671e+02, threshold=4.028e+02, percent-clipped=0.0 2023-10-03 12:12:10,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. 
Duration: 0.99 2023-10-03 12:12:16,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:12:20,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:12:20,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:12:21,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1262633.3333333333, ans=0.125 2023-10-03 12:12:22,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:12:24,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:12:25,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 12:12:25,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:12:25,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1262700.0, ans=0.04949747468305833 2023-10-03 12:12:26,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:12:27,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1262700.0, ans=0.125 2023-10-03 12:12:29,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:12:32,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 12:12:35,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:12:41,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:12:41,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:46,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:12:49,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:49,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:12:51,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:12:51,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:12:53,916 INFO [train.py:1046] (1/4) Epoch 36, batch 3500, loss[loss=0.1583, simple_loss=0.2166, pruned_loss=0.05001, over 22873.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2385, pruned_loss=0.04038, over 4688861.45 frames. ], batch size: 322, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:12:55,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:12:58,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:12:59,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 12:13:02,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:13:04,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:13:08,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:13:08,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 12:13:13,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:13:14,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:13:14,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 12:13:16,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:13:16,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:13:17,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:17,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:13:17,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 12:13:20,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:20,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:13:23,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:13:24,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.38 vs. limit=10.0 2023-10-03 12:13:26,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:26,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 12:13:26,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 12:13:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:13:30,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:13:31,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:33,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:13:33,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:13:34,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 12:13:36,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 12:13:36,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 12:13:38,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:13:39,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:39,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:13:40,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 12:13:44,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:13:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:13:47,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1263033.3333333333, ans=0.125 2023-10-03 12:13:50,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:13:51,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 12:13:51,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 12:13:51,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:13:53,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:13:53,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:13:56,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:57,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. 
Duration: 0.96 2023-10-03 12:13:58,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1263100.0, ans=0.125 2023-10-03 12:13:59,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:14:00,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:14:01,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 12:14:03,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 12:14:06,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:07,929 INFO [train.py:1046] (1/4) Epoch 36, batch 3550, loss[loss=0.1676, simple_loss=0.2335, pruned_loss=0.05082, over 22814.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2378, pruned_loss=0.04023, over 4688483.11 frames. ], batch size: 322, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:14:08,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:14:08,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:10,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 12:14:18,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:20,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 12:14:23,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:14:23,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:14:26,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:27,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:14:27,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:14:30,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:14:30,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:14:32,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:32,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:14:32,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 12:14:37,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:14:37,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:14:39,161 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.892e+02 2.093e+02 2.381e+02 3.257e+02, threshold=4.186e+02, percent-clipped=0.0 2023-10-03 12:14:39,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:14:39,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:39,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:14:40,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 12:14:40,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:42,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:43,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1263300.0, ans=0.125 2023-10-03 12:14:45,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:14:48,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:14:49,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:14:50,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1263300.0, ans=10.0 2023-10-03 12:14:51,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:52,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 12:14:52,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:14:54,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 12:14:54,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:14:57,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:14:57,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:15:01,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 12:15:01,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1263366.6666666667, ans=0.125 2023-10-03 12:15:03,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:15:03,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1263366.6666666667, ans=0.025 2023-10-03 12:15:08,581 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.46 vs. limit=6.0 2023-10-03 12:15:09,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:10,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 12:15:10,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:12,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1263433.3333333333, ans=0.125 2023-10-03 12:15:14,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:15:16,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 12:15:22,073 INFO [train.py:1046] (1/4) Epoch 36, batch 3600, loss[loss=0.162, simple_loss=0.2505, pruned_loss=0.03669, over 24314.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2376, pruned_loss=0.03998, over 4704160.48 frames. ], batch size: 74, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:15:23,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 12:15:23,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 12:15:23,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:15:23,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1263500.0, ans=0.95 2023-10-03 12:15:25,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:26,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:15:29,040 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.99 vs. limit=15.0 2023-10-03 12:15:29,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:15:31,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:34,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:15:34,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:15:34,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:15:34,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 12:15:35,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1263566.6666666667, ans=0.125 2023-10-03 12:15:35,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1263566.6666666667, ans=0.5 2023-10-03 12:15:37,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:15:37,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:41,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:15:42,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:15:44,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:15:45,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:15:45,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. 
Duration: 0.69 2023-10-03 12:15:45,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1263566.6666666667, ans=0.2 2023-10-03 12:15:46,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:15:48,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:51,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:15:53,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:54,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:15:55,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:15:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 12:16:03,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:04,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:16:05,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 12:16:09,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 12:16:14,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:19,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:25,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:16:25,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:16:25,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 12:16:26,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 12:16:28,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 12:16:29,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:16:29,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:16:31,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 12:16:32,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:16:32,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 12:16:32,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:33,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 12:16:34,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 12:16:36,405 INFO [train.py:1046] (1/4) Epoch 36, batch 3650, loss[loss=0.1587, simple_loss=0.2364, pruned_loss=0.04054, over 19484.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2381, pruned_loss=0.03975, over 4706653.47 frames. ], batch size: 42, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:16:37,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:39,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 12:16:43,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 12:16:44,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:16:44,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1263833.3333333333, ans=0.125 2023-10-03 12:16:49,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 12:16:52,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. 
Duration: 0.97 2023-10-03 12:16:55,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:16:55,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:16:56,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:16:58,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:16:58,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:59,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 12:17:01,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:17:01,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:01,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 12:17:04,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:17:04,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:17:04,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:17:07,637 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.919e+02 2.181e+02 2.473e+02 3.454e+02, threshold=4.361e+02, percent-clipped=0.0 2023-10-03 12:17:07,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:17:09,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 12:17:10,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 12:17:10,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:17:11,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.29 vs. limit=22.5 2023-10-03 12:17:13,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 12:17:15,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:17:15,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:17:19,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 12:17:19,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1264033.3333333333, ans=0.0 2023-10-03 12:17:20,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:20,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:17:22,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:17:25,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:17:28,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:17:29,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:31,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:31,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:17:31,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:17:32,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:34,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:17:40,269 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 12:17:44,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:17:44,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:17:44,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:17:45,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:45,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:17:47,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 12:17:48,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:50,163 INFO [train.py:1046] (1/4) Epoch 36, batch 3700, loss[loss=0.1537, simple_loss=0.2283, pruned_loss=0.03952, over 23672.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2382, pruned_loss=0.03989, over 4715860.70 frames. ], batch size: 149, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:17:50,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-03 12:17:51,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:51,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1264166.6666666667, ans=0.125 2023-10-03 12:17:52,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:17:56,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:56,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 12:17:56,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:58,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:17:58,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:18:00,173 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.40 vs. limit=6.0 2023-10-03 12:18:04,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:18:06,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:18:07,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1264233.3333333333, ans=0.0 2023-10-03 12:18:08,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:09,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:18:09,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:18:09,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:18:12,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1264233.3333333333, ans=22.5 2023-10-03 12:18:13,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:13,216 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 12:18:20,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:18:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:18:21,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:18:21,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. 
Duration: 0.96 2023-10-03 12:18:21,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1264300.0, ans=0.05 2023-10-03 12:18:22,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:18:24,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:25,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 12:18:27,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:27,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:18:31,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:31,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:18:31,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1264300.0, ans=0.2 2023-10-03 12:18:33,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1264300.0, ans=0.125 2023-10-03 12:18:34,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 12:18:34,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1264366.6666666667, ans=0.125 2023-10-03 12:18:38,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:18:38,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 12:18:38,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1264366.6666666667, ans=0.125 2023-10-03 12:18:40,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:40,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 12:18:44,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:18:44,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:18:44,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1264366.6666666667, ans=0.0 2023-10-03 12:18:46,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:47,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 12:18:49,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:18:50,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:18:50,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:18:50,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:55,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:18:55,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 12:18:57,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 12:18:57,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:18:57,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:18:59,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1264433.3333333333, ans=0.1 2023-10-03 12:19:00,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:19:02,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 12:19:02,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1264433.3333333333, ans=0.1 2023-10-03 12:19:05,619 INFO [train.py:1046] (1/4) Epoch 36, batch 3750, loss[loss=0.2069, simple_loss=0.2763, pruned_loss=0.06873, over 19240.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2397, pruned_loss=0.04047, over 4709486.64 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:19:05,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:19:07,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:19:07,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:19:08,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 12:19:10,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 12:19:11,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:19:13,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 12:19:13,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:19:14,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:19:16,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:16,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1264500.0, ans=0.0 2023-10-03 12:19:16,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1264500.0, ans=0.125 2023-10-03 12:19:17,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:19:20,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:19:24,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:19:24,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:19:27,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:19:30,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:19:30,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 12:19:32,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:19:34,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:19:35,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:19:37,203 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.905e+02 2.182e+02 2.625e+02 6.484e+02, threshold=4.364e+02, percent-clipped=1.0 2023-10-03 12:19:38,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 12:19:41,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 12:19:44,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:19:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:19:46,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:19:50,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:19:51,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 12:19:55,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 12:19:56,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1264700.0, ans=0.125 2023-10-03 12:19:58,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:20:02,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:20:02,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:20:03,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.90 vs. limit=15.0 2023-10-03 12:20:05,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:20:07,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:20:08,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:20:11,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:20:12,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:20:15,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:20:19,912 INFO [train.py:1046] (1/4) Epoch 36, batch 3800, loss[loss=0.1516, simple_loss=0.2372, pruned_loss=0.03303, over 24677.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2393, pruned_loss=0.0405, over 4718384.34 frames. ], batch size: 65, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:20:24,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 12:20:24,656 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.11 vs. limit=15.0 2023-10-03 12:20:28,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:28,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:20:29,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 12:20:31,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:32,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:20:34,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:20:36,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 12:20:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:37,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:20:39,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:20:39,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:20:40,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:20:41,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 12:20:44,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 12:20:44,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:20:46,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1264900.0, ans=0.0 2023-10-03 12:20:47,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:20:48,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1264966.6666666667, ans=0.1 2023-10-03 12:20:49,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:20:49,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:20:52,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:20:52,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:20:54,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:56,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:20:57,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=15.0 2023-10-03 12:21:00,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:21:00,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 12:21:03,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:21:07,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:21:11,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:21:14,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 12:21:16,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1265033.3333333333, ans=0.125 2023-10-03 12:21:17,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 12:21:18,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:21:20,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:21:20,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:21,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1265100.0, ans=0.125 2023-10-03 12:21:22,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 12:21:25,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 12:21:25,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 12:21:25,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:27,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:21:32,801 INFO [train.py:1046] (1/4) Epoch 36, batch 3850, loss[loss=0.1472, simple_loss=0.2217, pruned_loss=0.03632, over 23620.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2381, pruned_loss=0.04018, over 4716462.05 frames. ], batch size: 149, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:21:32,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:21:33,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 12:21:39,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:21:41,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 12:21:41,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:21:42,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:46,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:21:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:21:51,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:21:51,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1265233.3333333333, ans=0.04949747468305833 2023-10-03 12:21:52,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 12:21:58,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:21:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:22:01,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:22:02,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:22:04,230 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.990e+02 2.175e+02 2.458e+02 3.348e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-03 12:22:05,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:07,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:22:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:07,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:22:08,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:10,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1265300.0, ans=0.125 2023-10-03 12:22:11,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.21 vs. limit=22.5 2023-10-03 12:22:12,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:12,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:22:12,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:22:12,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 12:22:12,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 12:22:13,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:13,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:16,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:16,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:17,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 12:22:19,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 12:22:21,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:23,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 12:22:24,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-10-03 12:22:27,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=1265366.6666666667, ans=0.02 2023-10-03 12:22:28,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:30,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:32,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1265433.3333333333, ans=0.125 2023-10-03 12:22:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:33,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 12:22:37,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 12:22:39,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:41,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:43,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:22:43,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 12:22:45,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:45,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:45,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:22:45,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 12:22:46,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:48,162 INFO [train.py:1046] (1/4) Epoch 36, batch 3900, loss[loss=0.1628, simple_loss=0.2539, pruned_loss=0.03585, over 24438.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2364, pruned_loss=0.03995, over 4704506.83 frames. ], batch size: 69, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:22:48,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 12:22:48,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:48,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:49,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:22:49,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:22:52,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:22:54,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:54,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:22:55,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 12:22:55,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:58,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:22:59,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:22:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:23:01,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:23:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:23:04,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:23:04,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:23:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 12:23:05,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:23:07,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1265566.6666666667, ans=0.125 2023-10-03 12:23:09,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 12:23:09,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:23:10,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 12:23:12,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 12:23:13,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1265566.6666666667, ans=0.125 2023-10-03 12:23:15,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:23:16,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:23:16,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 12:23:16,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:19,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:23:19,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1265633.3333333333, ans=0.125 2023-10-03 12:23:22,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:23:25,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:23:25,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:23:25,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:23:31,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:23:31,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:23:39,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:23:40,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 12:23:40,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1265700.0, ans=0.05 2023-10-03 12:23:49,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:23:50,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:50,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 12:23:51,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 12:23:51,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:52,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 12:23:54,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:23:54,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 12:23:56,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1265766.6666666667, ans=0.0 2023-10-03 12:24:01,820 INFO [train.py:1046] (1/4) Epoch 36, batch 3950, loss[loss=0.1311, simple_loss=0.2124, pruned_loss=0.02489, over 24458.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2355, pruned_loss=0.03937, over 4712110.87 frames. 
], batch size: 58, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:24:03,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:24:03,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 12:24:03,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1265833.3333333333, ans=0.125 2023-10-03 12:24:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:24:06,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:24:08,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:24:10,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1265833.3333333333, ans=0.1 2023-10-03 12:24:14,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 12:24:15,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:24:16,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 12:24:16,995 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 12:24:17,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:24:17,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.79 vs. limit=15.0 2023-10-03 12:24:19,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:24:19,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:24:19,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:24:23,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 12:24:25,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:24:25,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:24:25,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:24:27,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:24:28,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 12:24:32,501 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.840e+02 2.075e+02 2.335e+02 2.837e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 12:24:37,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:24:37,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:24:42,952 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.60 vs. limit=15.0 2023-10-03 12:24:45,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 12:24:50,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 12:24:50,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 12:24:50,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:24:53,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:25:01,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:25:01,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:25:02,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:25:02,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:25:02,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 12:25:07,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1266100.0, ans=0.0 2023-10-03 12:25:08,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:25:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:25:13,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 12:25:16,620 INFO [train.py:1046] (1/4) Epoch 36, batch 4000, loss[loss=0.1459, simple_loss=0.2266, pruned_loss=0.03255, over 20252.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2365, pruned_loss=0.0392, over 4720874.34 frames. ], batch size: 44, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:25:22,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:25,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1266166.6666666667, ans=0.125 2023-10-03 12:25:29,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:33,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.94 vs. 
limit=15.0 2023-10-03 12:25:33,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:25:33,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:25:35,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:35,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 12:25:35,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:25:36,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 12:25:36,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:25:36,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 12:25:38,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:25:40,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1266233.3333333333, ans=0.2 2023-10-03 12:25:41,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:25:41,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:25:41,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:25:42,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:42,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:25:42,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.61 vs. limit=12.0 2023-10-03 12:25:44,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:25:46,016 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 12:25:47,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:25:47,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:25:50,726 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 12:25:52,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:25:52,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:25:58,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-10-03 12:25:58,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:59,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:26:01,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 12:26:02,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:26:03,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 12:26:03,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:26:05,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:26:06,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:26:06,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:26:07,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:26:09,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:26:11,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. 
Duration: 0.84 2023-10-03 12:26:11,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:26:12,957 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 12:26:18,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:26:20,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1266433.3333333333, ans=0.2 2023-10-03 12:26:21,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 12:26:24,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1266433.3333333333, ans=0.125 2023-10-03 12:26:25,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:26:25,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:26:26,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:26:27,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:26:30,046 INFO [train.py:1046] (1/4) Epoch 36, batch 4050, loss[loss=0.1593, simple_loss=0.2479, pruned_loss=0.03533, over 24428.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.237, pruned_loss=0.039, over 4720993.79 frames. 
], batch size: 69, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:26:33,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:26:34,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:26:34,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1266500.0, ans=0.0 2023-10-03 12:26:35,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 12:26:38,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:26:38,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:26:40,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:26:41,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:26:42,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:26:47,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:26:49,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 12:26:49,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:26:51,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:26:51,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:26:55,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:26:57,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:27:00,596 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.848e+02 2.001e+02 2.141e+02 3.089e+02, threshold=4.003e+02, percent-clipped=0.0 2023-10-03 12:27:02,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 12:27:03,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 12:27:03,539 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 12:27:03,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1266633.3333333333, ans=0.125 2023-10-03 12:27:05,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:27:11,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-03 12:27:11,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1266633.3333333333, ans=0.125 2023-10-03 12:27:12,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:27:16,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:27:19,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:27:19,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:27:19,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:27:22,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:27:25,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 12:27:25,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:27:27,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:27:28,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 12:27:32,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:27:32,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1266766.6666666667, ans=0.5 2023-10-03 12:27:38,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 12:27:40,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:27:40,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:27:42,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 12:27:42,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 12:27:42,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:27:43,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:27:44,933 INFO [train.py:1046] (1/4) Epoch 36, batch 4100, loss[loss=0.2149, simple_loss=0.2815, pruned_loss=0.07421, over 19381.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2391, pruned_loss=0.03993, over 4712898.97 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:27:45,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:45,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 12:27:52,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 12:27:53,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 12:27:56,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 12:27:56,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 12:27:56,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:27:58,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:58,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:58,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:27:59,880 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 12:28:02,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:28:02,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:28:02,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:28:02,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1266900.0, ans=0.1 2023-10-03 12:28:04,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:28:05,222 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=12.0 2023-10-03 12:28:07,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:28:08,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:28:10,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:28:10,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 12:28:10,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:28:10,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:28:11,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1266900.0, ans=0.1 2023-10-03 12:28:11,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 12:28:11,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:28:11,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 12:28:14,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:16,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 12:28:18,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:28:19,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:28:19,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 12:28:21,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:28:21,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:28:21,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1266966.6666666667, ans=0.0 2023-10-03 12:28:22,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:28:23,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.19 vs. 
limit=12.0 2023-10-03 12:28:23,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 12:28:25,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:28:25,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:28:28,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 12:28:29,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:28:29,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:28:34,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:38,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:28:40,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1267033.3333333333, ans=0.125 2023-10-03 12:28:42,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:28:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:28:51,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:28:51,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:53,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1267100.0, ans=0.0 2023-10-03 12:28:54,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:28:56,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:28:58,824 INFO [train.py:1046] (1/4) Epoch 36, batch 4150, loss[loss=0.1593, simple_loss=0.2476, pruned_loss=0.03547, over 24482.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2395, pruned_loss=0.04, over 4723123.78 frames. ], batch size: 69, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:28:58,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:29:00,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:29:00,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:29:00,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:29:03,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 12:29:03,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:29:03,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 12:29:05,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 12:29:06,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 12:29:06,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:29:12,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:29:12,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:29:14,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1267233.3333333333, ans=0.125 2023-10-03 12:29:15,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:29:17,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:29:17,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1267233.3333333333, ans=0.125 2023-10-03 12:29:19,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 12:29:20,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:29:20,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:29:21,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:29:24,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:29:25,162 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.17 vs. limit=22.5 2023-10-03 12:29:29,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:29:29,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 12:29:32,465 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.928e+02 2.046e+02 2.330e+02 3.418e+02, threshold=4.092e+02, percent-clipped=0.0 2023-10-03 12:29:32,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 12:29:32,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:29:34,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 12:29:34,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 12:29:35,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:29:39,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:29:39,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:29:40,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1267300.0, ans=0.125 2023-10-03 12:29:42,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1267366.6666666667, ans=0.0 2023-10-03 12:29:43,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 12:29:46,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:29:48,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:29:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 12:29:51,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:29:51,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 12:29:52,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 12:29:54,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:29:55,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:29:57,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 12:29:57,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:29:57,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:29:58,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:30:03,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 12:30:03,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:30:03,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:30:03,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:30:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 12:30:05,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:30:05,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:30:06,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:30:07,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:30:07,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 12:30:08,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:30:13,249 INFO [train.py:1046] (1/4) Epoch 36, batch 4200, loss[loss=0.1588, simple_loss=0.2294, pruned_loss=0.0441, over 23759.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2384, pruned_loss=0.03972, over 4726208.44 frames. ], batch size: 164, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:30:13,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:30:13,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1267500.0, ans=0.0 2023-10-03 12:30:14,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 12:30:16,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:30:18,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:30:19,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:30:20,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:30:20,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:30:21,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 12:30:21,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1267500.0, ans=0.0 2023-10-03 12:30:25,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 12:30:25,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:26,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:30:28,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:30:30,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1267566.6666666667, ans=0.125 2023-10-03 12:30:32,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:30:33,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 12:30:33,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:33,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1267566.6666666667, ans=0.09899494936611666 2023-10-03 12:30:34,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 12:30:34,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:30:34,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:34,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:30:36,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:30:36,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1267566.6666666667, ans=0.0 2023-10-03 12:30:37,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:30:40,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 12:30:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:43,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 12:30:46,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:30:47,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:30:49,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:30:50,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:30:50,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 12:30:50,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:30:53,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:30:58,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:31:01,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:31:05,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:31:08,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 12:31:09,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:31:13,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:31:15,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:17,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 12:31:17,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1267766.6666666667, ans=0.0 2023-10-03 12:31:19,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-10-03 12:31:22,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:31:22,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1267766.6666666667, ans=0.125 2023-10-03 12:31:26,599 INFO [train.py:1046] (1/4) Epoch 36, batch 4250, loss[loss=0.1758, simple_loss=0.2614, pruned_loss=0.04509, over 24558.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2374, pruned_loss=0.03927, over 4732564.42 frames. ], batch size: 71, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:31:28,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:31:28,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:31:31,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:31:37,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:31:37,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 12:31:37,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:31:37,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1267833.3333333333, ans=10.0 2023-10-03 12:31:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:44,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:31:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:48,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:31:49,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1267900.0, ans=0.0 2023-10-03 12:31:52,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:31:52,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:31:52,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 12:31:53,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:31:53,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:57,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:31:59,354 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.900e+02 2.182e+02 2.555e+02 3.786e+02, threshold=4.364e+02, percent-clipped=0.0 2023-10-03 12:31:59,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:31:59,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 12:32:05,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 12:32:05,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:32:06,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:32:06,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:32:07,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:32:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:32:07,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:32:10,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:32:11,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:32:16,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:32:18,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:20,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 12:32:20,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:32:21,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 12:32:22,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:32:24,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:32:26,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:26,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:32:27,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1268100.0, ans=0.0 2023-10-03 12:32:28,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 12:32:29,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:32:29,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:32:34,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:37,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:38,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:32:38,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:32:41,283 INFO [train.py:1046] (1/4) Epoch 36, batch 4300, loss[loss=0.1512, simple_loss=0.223, pruned_loss=0.03967, over 23892.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2373, pruned_loss=0.03923, over 4727347.98 frames. ], batch size: 195, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:32:41,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:32:42,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:32:42,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:32:42,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 12:32:44,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:32:48,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:32:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:32:53,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:33:00,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:33:00,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 12:33:02,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:33:04,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:33:04,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:33:04,235 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-10-03 12:33:07,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:33:08,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:33:11,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 12:33:11,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:33:11,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 12:33:12,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1268300.0, ans=0.0 2023-10-03 12:33:14,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:33:15,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:33:18,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:33:18,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:33:20,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:33:22,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 12:33:23,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:33:23,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 12:33:24,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 12:33:26,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:33:26,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1268366.6666666667, ans=0.0 2023-10-03 12:33:28,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:28,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:33:28,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:30,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:33:30,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 12:33:30,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 12:33:30,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-10-03 12:33:31,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:33:32,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 12:33:32,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 12:33:37,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:33:38,759 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 12:33:38,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:33:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:33:40,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:33:43,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 12:33:44,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:33:44,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:44,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:33:44,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:33:45,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.48 vs. limit=15.0 2023-10-03 12:33:45,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:33:47,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:33:49,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:33:51,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:51,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:33:55,777 INFO [train.py:1046] (1/4) Epoch 36, batch 4350, loss[loss=0.1641, simple_loss=0.2575, pruned_loss=0.03535, over 24672.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2379, pruned_loss=0.03961, over 4726401.74 frames. ], batch size: 73, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:33:55,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 12:33:57,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 12:34:01,250 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.94 vs. limit=22.5 2023-10-03 12:34:01,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:02,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1268500.0, ans=0.0 2023-10-03 12:34:04,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:34:07,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:34:07,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:34:09,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1268566.6666666667, ans=0.2 2023-10-03 12:34:10,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:34:12,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1268566.6666666667, ans=0.5 2023-10-03 12:34:12,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.83 vs. limit=15.0 2023-10-03 12:34:16,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:34:17,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:34:17,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:34:21,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:34:23,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:34:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:34:27,499 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.937e+02 2.130e+02 2.314e+02 3.607e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 12:34:30,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 12:34:30,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:32,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:36,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:38,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. 
Duration: 0.99 2023-10-03 12:34:40,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1268700.0, ans=0.125 2023-10-03 12:34:41,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:34:41,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1268700.0, ans=0.1 2023-10-03 12:34:42,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:34:45,717 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 12:34:48,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:34:48,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:34:48,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1268700.0, ans=0.125 2023-10-03 12:34:49,711 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 12:34:49,766 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 12:34:49,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:34:49,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:34:51,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:34:51,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:34:52,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:34:52,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:34:52,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1268766.6666666667, ans=0.125 2023-10-03 12:34:55,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 12:34:55,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:55,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:34:56,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:57,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 12:34:58,866 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 12:34:58,878 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-10-03 12:34:58,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 12:35:02,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:35:02,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:35:02,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:03,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:35:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 12:35:07,699 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 12:35:07,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:07,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1268833.3333333333, ans=0.0 2023-10-03 12:35:09,586 INFO [train.py:1046] (1/4) Epoch 36, batch 4400, loss[loss=0.194, simple_loss=0.2639, pruned_loss=0.062, over 19190.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2386, pruned_loss=0.03976, over 4714150.34 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:35:12,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:35:12,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:13,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:35:15,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 12:35:15,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 12:35:15,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 12:35:16,622 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 12:35:16,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:35:16,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:35:18,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1268833.3333333333, ans=0.125 2023-10-03 12:35:19,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 12:35:20,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:22,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:35:22,321 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 12:35:25,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:25,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 12:35:27,070 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 12:35:29,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 12:35:29,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 12:35:30,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 12:35:31,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:31,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:35:33,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:35:33,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:35:36,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. 
Duration: 0.71 2023-10-03 12:35:36,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 12:35:36,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=15.0 2023-10-03 12:35:37,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:40,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:35:40,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:43,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:43,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:43,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 12:35:44,970 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 12:35:49,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:56,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:35:58,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. 
Duration: 0.78 2023-10-03 12:36:02,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:36:02,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.37 vs. limit=12.0 2023-10-03 12:36:05,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:36:06,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:36:08,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 12:36:08,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:36:08,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:36:08,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:36:10,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:36:14,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1269100.0, ans=10.0 2023-10-03 12:36:15,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 12:36:17,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-10-03 12:36:20,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 12:36:20,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:20,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 12:36:20,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:36:23,136 INFO [train.py:1046] (1/4) Epoch 36, batch 4450, loss[loss=0.1938, simple_loss=0.2595, pruned_loss=0.06409, over 19368.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2385, pruned_loss=0.03983, over 4718100.87 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:36:23,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:36:24,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 12:36:28,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:36:31,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:32,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:36:38,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:36:38,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:36:41,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:43,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:36:44,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:36:44,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:46,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 12:36:46,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:36:47,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:47,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:36:47,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:36:50,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 12:36:56,046 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.913e+02 2.111e+02 2.301e+02 3.200e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 12:36:56,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:36:56,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:36:57,000 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.43 vs. limit=15.0 2023-10-03 12:36:57,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:36:57,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:59,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:37:02,551 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:37:03,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 12:37:05,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 12:37:06,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 12:37:06,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 12:37:08,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:37:09,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 12:37:13,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:37:15,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.92 vs. limit=22.5 2023-10-03 12:37:16,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:37:16,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 12:37:16,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:16,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:37:16,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:37:16,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:37:17,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:37:20,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 12:37:22,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 12:37:24,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:37:26,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:37:27,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:37:28,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:28,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:37:30,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:37:32,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 12:37:34,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:37:38,735 INFO [train.py:1046] (1/4) Epoch 36, batch 4500, loss[loss=0.1632, simple_loss=0.243, pruned_loss=0.04168, over 23420.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2394, pruned_loss=0.04052, over 4697022.23 frames. ], batch size: 93, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:37:40,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 12:37:41,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 12:37:41,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 12:37:43,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:37:46,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1269500.0, ans=0.125 2023-10-03 12:37:47,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1269500.0, ans=0.125 2023-10-03 12:37:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:49,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:37:50,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:37:52,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:37:52,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:37:53,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:37:55,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1269566.6666666667, ans=0.0 2023-10-03 12:37:55,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.61 vs. limit=22.5 2023-10-03 12:37:58,451 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.56 vs. limit=15.0 2023-10-03 12:38:02,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:38:03,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:38:05,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:38:06,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:38:06,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1269633.3333333333, ans=0.125 2023-10-03 12:38:08,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:38:13,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:38:18,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 12:38:23,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:38:25,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:38:25,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 12:38:26,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:27,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:38:29,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:38:30,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:38:30,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1269700.0, ans=15.0 2023-10-03 12:38:32,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:38:32,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 12:38:32,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:38:32,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:38:36,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:38:36,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:38:40,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:42,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:38:42,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:38:44,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 12:38:45,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 12:38:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 12:38:48,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1269766.6666666667, ans=0.125 2023-10-03 12:38:49,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 12:38:52,190 INFO [train.py:1046] (1/4) Epoch 36, batch 4550, loss[loss=0.1442, simple_loss=0.2188, pruned_loss=0.03478, over 24312.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.238, pruned_loss=0.04048, over 4691429.93 frames. 
], batch size: 56, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:38:52,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 12:38:52,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:38:56,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:38:57,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:39:00,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:04,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:39:07,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:39:08,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:08,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:39:08,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:13,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:39:13,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:39:16,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:39:17,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 12:39:19,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 12:39:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:39:21,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 12:39:24,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 12:39:25,973 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.879e+02 2.066e+02 2.299e+02 3.391e+02, threshold=4.132e+02, percent-clipped=0.0 2023-10-03 12:39:26,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:39:28,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 12:39:28,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:39:31,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:39:31,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:33,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:39:33,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.15 vs. limit=15.0 2023-10-03 12:39:36,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 12:39:37,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:39:40,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:40,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:39:43,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 12:39:44,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 12:39:44,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:39:46,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. 
Duration: 0.94 2023-10-03 12:39:48,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1270033.3333333333, ans=0.125 2023-10-03 12:39:49,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 12:39:49,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:50,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:50,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:39:50,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:50,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:39:52,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:39:52,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.66 vs. limit=12.0 2023-10-03 12:39:53,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 12:39:54,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:39:54,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-03 12:39:54,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 12:39:54,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:39:56,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 12:39:59,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:39:59,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:40:01,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:40:01,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:40:01,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1270100.0, ans=0.2 2023-10-03 12:40:02,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:40:02,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:40:03,726 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.21 vs. limit=22.5 2023-10-03 12:40:05,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 12:40:05,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1270166.6666666667, ans=0.0 2023-10-03 12:40:06,948 INFO [train.py:1046] (1/4) Epoch 36, batch 4600, loss[loss=0.1596, simple_loss=0.2357, pruned_loss=0.04173, over 23390.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2375, pruned_loss=0.03998, over 4693907.32 frames. ], batch size: 285, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:40:07,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:08,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:40:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:40:13,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:40:13,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:15,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 12:40:15,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:40:20,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:40:20,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:40:22,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:24,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1270233.3333333333, ans=0.07 2023-10-03 12:40:30,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 12:40:31,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:33,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1270233.3333333333, ans=0.0 2023-10-03 12:40:35,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1270300.0, ans=0.125 2023-10-03 12:40:36,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:40:37,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:41,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 12:40:41,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:40:43,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:40:48,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:48,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:40:49,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:40:51,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1270366.6666666667, ans=0.125 2023-10-03 12:40:52,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 12:40:54,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:40:58,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:00,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:02,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:02,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 12:41:04,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:04,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. 
Duration: 0.89 2023-10-03 12:41:06,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:06,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:07,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:08,659 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.42 vs. limit=6.0 2023-10-03 12:41:09,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:41:09,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:09,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 12:41:10,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 12:41:10,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 12:41:10,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:13,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:41:13,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:41:13,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:20,680 INFO [train.py:1046] (1/4) Epoch 36, batch 4650, loss[loss=0.1756, simple_loss=0.2307, pruned_loss=0.06024, over 18976.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2376, pruned_loss=0.03986, over 4711355.04 frames. ], batch size: 388, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:41:20,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1270500.0, ans=0.2 2023-10-03 12:41:24,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:41:26,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:41:27,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:27,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:41:27,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:27,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:41:30,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:33,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-10-03 12:41:34,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1270566.6666666667, ans=0.125 2023-10-03 12:41:38,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:41:39,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 12:41:39,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:41:41,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 12:41:41,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:41:41,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 12:41:42,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 12:41:42,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:42,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:41:46,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:41:47,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 12:41:47,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 12:41:51,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:52,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 12:41:54,102 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.890e+02 2.118e+02 2.487e+02 4.002e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 12:41:55,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:55,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:41:57,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 12:41:57,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:41:59,368 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=15.0 2023-10-03 12:41:59,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:42:02,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:07,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:42:09,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:42:10,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:42:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:42:10,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1270700.0, ans=0.2 2023-10-03 12:42:13,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 12:42:14,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 12:42:16,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 12:42:16,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 12:42:17,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:24,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:42:24,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:42:24,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. 
Duration: 0.92 2023-10-03 12:42:24,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:26,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:42:26,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:42:28,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.31 vs. limit=22.5 2023-10-03 12:42:28,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:42:30,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:42:30,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:42:31,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:42:33,006 INFO [train.py:1046] (1/4) Epoch 36, batch 4700, loss[loss=0.1664, simple_loss=0.2557, pruned_loss=0.03856, over 24473.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.238, pruned_loss=0.0395, over 4709150.95 frames. ], batch size: 69, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:42:35,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:35,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 12:42:35,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:42:37,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 12:42:38,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1270833.3333333333, ans=0.125 2023-10-03 12:42:39,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:42:39,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 12:42:41,187 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:42:46,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:48,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:48,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:42:49,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:42:51,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:42:54,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. 
Duration: 0.56 2023-10-03 12:42:55,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 12:42:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:58,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:42:58,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:42:58,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1270900.0, ans=0.0 2023-10-03 12:43:02,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:43:05,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1270966.6666666667, ans=0.1 2023-10-03 12:43:08,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:43:10,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:43:11,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:43:17,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. 
Duration: 0.68 2023-10-03 12:43:18,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1271033.3333333333, ans=0.125 2023-10-03 12:43:19,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:43:20,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:23,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 12:43:23,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1271033.3333333333, ans=0.0 2023-10-03 12:43:24,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:43:25,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.05 vs. limit=15.0 2023-10-03 12:43:28,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:43:30,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 12:43:31,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:31,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:43:34,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:43:35,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:43:35,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 12:43:37,359 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 12:43:39,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:43:42,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:42,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:42,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 12:43:42,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:46,293 INFO [train.py:1046] (1/4) Epoch 36, batch 4750, loss[loss=0.1703, simple_loss=0.2641, pruned_loss=0.03818, over 24563.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2394, pruned_loss=0.03981, over 4723092.19 frames. ], batch size: 71, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:43:47,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 12:43:50,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:43:52,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:43:54,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1271166.6666666667, ans=0.125 2023-10-03 12:43:55,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:43:56,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:43:58,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 12:43:58,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:43:58,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=1271166.6666666667, ans=22.5 2023-10-03 12:44:00,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 12:44:02,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:44:02,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:44:03,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:08,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-10-03 12:44:14,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:44:15,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 12:44:16,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:20,882 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.859e+02 2.065e+02 2.331e+02 3.483e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-03 12:44:21,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:44:21,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:44:21,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:44:22,308 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 12:44:22,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 12:44:27,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 12:44:28,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:31,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:44:32,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:44:32,705 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 12:44:32,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:44:35,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:44:38,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:44:40,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-10-03 12:44:41,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 12:44:41,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 12:44:41,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:44:41,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:44:41,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:44,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 12:44:44,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 12:44:47,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 12:44:47,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1271433.3333333333, ans=0.125 2023-10-03 12:44:48,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:44:52,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:44:52,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 12:44:52,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:52,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1271433.3333333333, ans=0.125 2023-10-03 12:44:53,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:44:55,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:44:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:44:56,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 12:44:58,861 INFO [train.py:1046] (1/4) Epoch 36, batch 4800, loss[loss=0.1676, simple_loss=0.2442, pruned_loss=0.04556, over 23890.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2393, pruned_loss=0.03988, over 4723128.68 frames. ], batch size: 196, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:45:00,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:00,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 12:45:01,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 12:45:01,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 12:45:04,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:45:04,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:04,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 12:45:07,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1271500.0, ans=0.95 2023-10-03 12:45:09,875 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=10.48 vs. limit=22.5 2023-10-03 12:45:09,881 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.34 vs. 
limit=15.0 2023-10-03 12:45:10,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:10,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:15,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:45:18,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:18,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:18,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 12:45:19,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1271566.6666666667, ans=0.125 2023-10-03 12:45:20,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:45:20,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:45:20,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:45:26,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:45:27,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:27,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:45:27,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:27,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 12:45:27,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:30,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:31,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:34,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:36,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:36,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:45:37,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-03 12:45:38,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:40,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 12:45:40,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 12:45:42,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:42,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:45:42,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:45:42,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:45:42,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:45:44,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:45:45,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:45:50,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:52,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:45:53,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:45:57,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 12:45:59,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:59,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:59,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:46:00,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:46:02,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1271766.6666666667, ans=0.0 2023-10-03 12:46:06,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:46:07,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:46:07,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:07,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:46:07,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 12:46:08,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:46:12,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:12,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:12,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:46:12,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 12:46:13,508 INFO [train.py:1046] (1/4) Epoch 36, batch 4850, loss[loss=0.1505, simple_loss=0.2151, pruned_loss=0.04299, over 22830.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2393, pruned_loss=0.04019, over 4724389.15 frames. ], batch size: 323, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:46:15,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 12:46:15,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:46:15,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:46:15,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1271833.3333333333, ans=0.015 2023-10-03 12:46:17,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 12:46:17,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:19,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.79 vs. limit=15.0 2023-10-03 12:46:20,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:46:26,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 12:46:27,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:30,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:46:32,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:46:32,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:34,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:34,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1271900.0, ans=0.1 2023-10-03 12:46:37,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 12:46:38,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:46:38,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 12:46:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:46:44,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:46:44,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:46:44,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:46:44,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1271966.6666666667, ans=0.1 2023-10-03 12:46:45,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 12:46:47,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1271966.6666666667, ans=0.1 2023-10-03 12:46:48,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:46:48,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:46:50,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1271966.6666666667, ans=0.09899494936611666 2023-10-03 12:46:52,112 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.942e+02 2.103e+02 2.430e+02 3.861e+02, threshold=4.206e+02, percent-clipped=0.0 2023-10-03 12:46:52,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:46:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 12:46:53,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 12:46:55,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:47:01,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:47:03,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 12:47:04,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:47:04,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 12:47:04,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1272033.3333333333, ans=0.05 2023-10-03 12:47:05,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:47:08,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 12:47:08,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:47:10,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 12:47:10,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:11,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:47:13,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 12:47:19,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:47:24,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1272100.0, ans=0.0 2023-10-03 12:47:25,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:47:25,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 12:47:25,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1272100.0, ans=0.125 2023-10-03 12:47:28,156 INFO [train.py:1046] (1/4) Epoch 36, batch 4900, loss[loss=0.1724, simple_loss=0.2591, pruned_loss=0.04287, over 24334.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2379, pruned_loss=0.04013, over 4723220.06 frames. ], batch size: 77, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:47:30,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 12:47:30,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:47:33,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:47:35,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:35,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:47:39,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 12:47:45,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 12:47:48,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 12:47:48,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. 
Duration: 0.91 2023-10-03 12:47:50,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:47:50,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:50,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:47:50,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:47:50,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:47:50,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1272233.3333333333, ans=0.1 2023-10-03 12:47:50,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1272233.3333333333, ans=0.0 2023-10-03 12:47:52,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 12:47:55,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 12:47:55,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:47:56,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:47:56,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 12:47:57,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:47:59,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:00,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:00,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 12:48:02,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:48:03,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:48:03,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 12:48:03,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 12:48:06,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 12:48:09,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:48:09,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:48:09,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 12:48:10,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:10,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 12:48:10,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:48:12,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 12:48:13,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:16,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:48:16,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1272366.6666666667, ans=0.0 2023-10-03 12:48:18,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:48:21,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 12:48:23,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:48:23,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 12:48:24,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-10-03 12:48:27,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1272433.3333333333, ans=0.0 2023-10-03 12:48:30,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:48:31,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:48:32,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 12:48:34,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:48:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:48:35,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:40,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:48:40,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:48:40,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:48:40,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. 
Duration: 0.42725 2023-10-03 12:48:41,769 INFO [train.py:1046] (1/4) Epoch 36, batch 4950, loss[loss=0.1611, simple_loss=0.2395, pruned_loss=0.04135, over 23467.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2367, pruned_loss=0.03969, over 4713950.77 frames. ], batch size: 93, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:48:41,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:48:43,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.91 vs. limit=15.0 2023-10-03 12:48:44,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:48:44,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:48:47,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 12:48:47,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 12:48:47,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:48:49,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 12:48:49,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:49,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 12:48:51,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:48:51,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:48:52,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1272500.0, ans=0.0 2023-10-03 12:48:53,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:53,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:48:55,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:48:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:48:59,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:59,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:48:59,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1272566.6666666667, ans=0.125 2023-10-03 12:49:03,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:49:07,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 12:49:09,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:49:11,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:12,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:13,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:49:13,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1272633.3333333333, ans=0.125 2023-10-03 12:49:13,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff2.min_abs, batch_count=1272633.3333333333, ans=0.1 2023-10-03 12:49:15,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 12:49:16,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 12:49:17,642 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.932e+02 2.177e+02 2.793e+02 4.403e+02, threshold=4.354e+02, percent-clipped=2.0 2023-10-03 12:49:17,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:19,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 12:49:19,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:49:21,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:49:21,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:49:21,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:49:21,851 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:49:23,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:49:25,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1272700.0, ans=0.125 2023-10-03 12:49:26,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:49:28,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:49:30,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:30,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:30,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-10-03 12:49:30,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:49:31,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:49:34,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:49:35,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:49:35,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:49:36,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:37,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:49:39,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:49:39,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:49:40,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:49:40,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:49:41,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-10-03 12:49:45,070 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:49:46,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:49:51,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 12:49:51,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:49:55,800 INFO [train.py:1046] (1/4) Epoch 36, batch 5000, loss[loss=0.1309, simple_loss=0.2115, pruned_loss=0.02511, over 24345.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2365, pruned_loss=0.03956, over 4719421.74 frames. ], batch size: 56, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:49:58,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:58,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:50:00,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 12:50:01,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 12:50:02,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:50:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-03 12:50:04,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:50:04,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:50:05,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 12:50:05,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:07,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:50:07,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 12:50:07,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:50:08,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:50:10,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 12:50:11,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 12:50:11,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:50:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-03 12:50:13,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:50:13,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:14,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:50:14,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 12:50:14,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 12:50:15,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 12:50:15,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:17,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:19,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 12:50:19,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:50:21,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:22,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:50:23,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1272900.0, ans=0.125 2023-10-03 12:50:24,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 12:50:25,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 12:50:25,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:50:28,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:50:31,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 12:50:34,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:50:36,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:36,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:50:40,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 12:50:41,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:43,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:50:43,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:50:44,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 12:50:44,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:50:46,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.45 vs. limit=15.0 2023-10-03 12:50:48,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:50:50,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:50:56,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 12:50:57,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.99 vs. limit=15.0 2023-10-03 12:50:59,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:02,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1273100.0, ans=0.0 2023-10-03 12:51:06,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:51:08,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:08,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:51:08,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:51:08,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:51:09,434 INFO [train.py:1046] (1/4) Epoch 36, batch 5050, loss[loss=0.165, simple_loss=0.2351, pruned_loss=0.0474, over 23638.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2375, pruned_loss=0.03968, over 4726847.41 frames. ], batch size: 256, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:51:09,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:51:09,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:12,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:12,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 12:51:12,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:51:15,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:51:16,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:51:16,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 12:51:18,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:51:18,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:51:21,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:51:21,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:51:21,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:51:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 12:51:33,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:51:33,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1273233.3333333333, ans=0.2 2023-10-03 12:51:34,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:51:36,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-03 12:51:36,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:51:37,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:38,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:51:40,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:51:40,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 12:51:40,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 12:51:43,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:43,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:51:46,456 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 1.880e+02 2.077e+02 2.440e+02 4.009e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 12:51:46,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:47,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 12:51:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 12:51:50,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 12:51:52,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:51:52,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1273366.6666666667, ans=0.07 2023-10-03 12:51:53,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:51:54,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:51:55,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:51:57,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:51:59,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:52:00,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:01,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:52:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:52:01,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-03 12:52:02,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:52:04,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:52:07,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:52:07,041 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 12:52:07,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:52:08,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:52:09,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:11,124 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 12:52:13,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:52:13,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 12:52:13,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:19,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 12:52:19,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:19,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 12:52:21,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 12:52:22,591 INFO [train.py:1046] (1/4) Epoch 36, batch 5100, loss[loss=0.1632, simple_loss=0.234, pruned_loss=0.04615, over 23762.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2382, pruned_loss=0.03981, over 4736353.16 frames. ], batch size: 150, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:52:22,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:22,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:52:24,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:52:24,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1273500.0, ans=0.0 2023-10-03 12:52:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 12:52:28,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:52:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-10-03 12:52:32,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 12:52:33,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:36,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:52:38,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1273566.6666666667, ans=0.125 2023-10-03 12:52:39,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:52:39,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 12:52:39,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 12:52:43,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:52:43,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:52:47,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:50,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 12:52:51,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:52:53,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:53,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:52:56,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:52:56,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:52:56,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 12:52:59,533 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 12:53:01,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:53:01,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 12:53:01,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 12:53:05,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:53:06,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1273700.0, ans=0.125 2023-10-03 12:53:08,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1273700.0, ans=0.1 2023-10-03 12:53:12,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:15,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 12:53:15,575 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 12:53:15,582 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 12:53:17,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 12:53:17,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:53:20,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 12:53:20,631 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:53:24,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 12:53:26,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 12:53:27,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:53:32,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 12:53:34,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:53:35,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 12:53:36,864 INFO [train.py:1046] (1/4) Epoch 36, batch 5150, loss[loss=0.155, simple_loss=0.2444, pruned_loss=0.03279, over 24540.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2389, pruned_loss=0.03996, over 4738163.54 frames. ], batch size: 71, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:53:39,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:53:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:53:39,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:53:41,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:53:42,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:53:42,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:53:43,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 12:53:43,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 12:53:43,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 12:53:43,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:53:43,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 12:53:45,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:46,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 12:53:48,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:53:49,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:53:53,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:53:53,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 12:53:55,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:53:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:53:57,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:53:57,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:53:57,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:53:57,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1273900.0, ans=0.1 2023-10-03 12:53:58,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:53:58,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:53:58,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 12:53:59,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1273900.0, ans=0.125 2023-10-03 12:54:00,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:54:01,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 12:54:04,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:54:06,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 12:54:08,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:54:11,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:54:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 12:54:13,777 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.870e+02 2.045e+02 2.275e+02 3.519e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 12:54:15,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:54:21,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:54:22,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:54:27,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:54:27,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:54:30,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-10-03 12:54:33,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:54:34,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:54:34,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:54:34,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1274100.0, ans=0.09899494936611666 2023-10-03 12:54:37,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:54:38,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:54:39,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 12:54:43,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:54:45,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:54:46,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:54:47,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:54:47,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 12:54:47,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:54:47,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:54:49,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:54:50,977 INFO [train.py:1046] (1/4) Epoch 36, batch 5200, loss[loss=0.1567, simple_loss=0.2445, pruned_loss=0.0345, over 24679.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2401, pruned_loss=0.04051, over 4718104.98 frames. ], batch size: 68, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:54:52,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:54:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:54:57,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:01,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 12:55:03,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:55:03,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 12:55:03,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1274166.6666666667, ans=0.0 2023-10-03 12:55:05,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:05,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:55:05,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:07,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 12:55:10,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:55:10,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:13,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 12:55:15,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:55:16,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:55:17,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 12:55:17,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. 
Duration: 0.91 2023-10-03 12:55:20,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 12:55:20,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:20,768 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 12:55:20,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:24,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:55:25,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 12:55:25,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:55:28,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:28,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 12:55:30,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 12:55:30,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. 
Duration: 0.82 2023-10-03 12:55:34,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 12:55:35,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:55:41,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:55:43,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:55:44,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 12:55:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:44,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:55:44,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:46,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:55:50,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:55:50,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:55:55,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 12:55:56,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:55:56,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:57,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.80 vs. limit=15.0 2023-10-03 12:56:02,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:56:03,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 12:56:05,243 INFO [train.py:1046] (1/4) Epoch 36, batch 5250, loss[loss=0.147, simple_loss=0.2131, pruned_loss=0.04044, over 23393.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2394, pruned_loss=0.04056, over 4716796.85 frames. ], batch size: 285, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:56:05,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:56:05,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:56:06,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:06,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:56:06,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 12:56:08,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:56:11,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:56:11,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1274500.0, ans=0.125 2023-10-03 12:56:12,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:56:14,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:56:18,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:56:19,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:56:22,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:56:24,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:56:25,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 12:56:25,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:56:27,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:56:38,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1274633.3333333333, ans=0.125 2023-10-03 12:56:39,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1274633.3333333333, ans=0.5 2023-10-03 12:56:40,461 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 1.965e+02 2.175e+02 2.587e+02 3.682e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-03 12:56:42,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1274633.3333333333, ans=0.1 2023-10-03 12:57:04,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1274766.6666666667, ans=0.0 2023-10-03 12:57:08,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1274766.6666666667, ans=0.0 2023-10-03 12:57:13,153 INFO [train.py:1046] (1/4) Epoch 36, batch 5300, loss[loss=0.1371, simple_loss=0.2177, pruned_loss=0.02821, over 21136.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2373, pruned_loss=0.03983, over 4718797.93 frames. ], batch size: 46, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:57:13,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1274833.3333333333, ans=0.2 2023-10-03 12:57:27,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:57:27,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-10-03 12:57:27,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 12:57:27,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:27,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:27,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:27,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:27,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:27,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:57:27,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:28,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:57:28,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:57:28,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 12:57:28,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-10-03 12:57:28,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 12:57:28,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:57:28,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 12:57:28,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 12:57:28,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:29,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:29,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:57:29,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:57:29,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:57:29,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:57:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:30,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 12:57:30,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:57:30,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:30,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:57:30,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:30,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:57:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 12:57:30,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:57:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:31,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 12:57:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 12:57:31,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:57:31,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:57:31,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 12:57:31,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 12:57:31,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:57:32,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:57:32,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:57:32,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 12:57:32,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 12:57:32,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:57:32,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:32,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 12:57:32,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 12:57:32,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-10-03 12:57:33,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:57:39,480 INFO [train.py:1046] (1/4) Epoch 37, batch 0, loss[loss=0.1622, simple_loss=0.2343, pruned_loss=0.04505, over 22869.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2343, pruned_loss=0.04505, over 22869.00 frames. ], batch size: 322, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 12:57:39,480 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 12:57:48,733 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([6.0005, 5.8517, 5.5372, 5.1633], device='cuda:1') 2023-10-03 12:57:51,190 INFO [train.py:1078] (1/4) Epoch 37, validation: loss=0.3206, simple_loss=0.2712, pruned_loss=0.185, over 1125622.00 frames. 2023-10-03 12:57:51,191 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 12:57:52,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 12:57:54,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:57:54,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1274913.3333333333, ans=0.0 2023-10-03 12:57:55,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:57:55,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1274913.3333333333, ans=0.2 2023-10-03 12:58:00,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 12:58:01,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:58:02,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:02,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 12:58:03,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 12:58:06,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:07,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:10,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:10,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1274980.0, ans=0.125 2023-10-03 12:58:11,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:11,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:58:12,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 12:58:12,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1274980.0, ans=0.125 2023-10-03 12:58:13,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 12:58:14,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:58:17,019 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.42 vs. limit=6.0 2023-10-03 12:58:22,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:58:22,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:23,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.14 vs. limit=15.0 2023-10-03 12:58:26,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 12:58:26,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1275046.6666666667, ans=0.0 2023-10-03 12:58:29,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:58:29,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:58:30,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 12:58:33,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1275113.3333333333, ans=0.0 2023-10-03 12:58:34,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:58:37,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:58:42,535 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.49 vs. limit=10.0 2023-10-03 12:58:42,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 12:58:45,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 12:58:47,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:58:47,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:48,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:58:49,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:52,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 12:58:53,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 12:58:55,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:55,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1275180.0, ans=22.5 2023-10-03 12:58:59,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:58:59,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1275180.0, ans=0.2 2023-10-03 12:59:02,440 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 12:59:03,743 INFO [train.py:1046] (1/4) Epoch 37, batch 50, loss[loss=0.1597, simple_loss=0.229, pruned_loss=0.04522, over 23465.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2388, pruned_loss=0.0401, over 1062176.91 frames. ], batch size: 285, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 12:59:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:59:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:59:09,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:59:09,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 12:59:10,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 12:59:10,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:59:12,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:59:13,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:59:16,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:59:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 12:59:19,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:22,973 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.909e+02 2.063e+02 2.312e+02 4.693e+02, threshold=4.126e+02, percent-clipped=2.0 2023-10-03 12:59:25,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:59:25,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 12:59:27,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1275313.3333333333, ans=0.0 2023-10-03 12:59:28,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-10-03 12:59:30,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:59:31,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:59:31,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:33,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:59:34,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:59:34,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:59:34,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:41,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:59:44,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:59:44,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:59:45,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 12:59:47,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 12:59:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:59:48,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 12:59:48,533 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:59:49,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:59:51,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 12:59:59,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:59:59,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:00:01,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:02,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:00:02,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:00:05,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 13:00:07,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. 
Duration: 0.56 2023-10-03 13:00:07,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:08,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:00:08,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:00:08,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:00:08,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 13:00:10,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 13:00:11,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 13:00:12,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:12,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:00:14,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 13:00:14,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 13:00:15,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 13:00:15,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:00:15,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1275580.0, ans=0.0 2023-10-03 13:00:17,019 INFO [train.py:1046] (1/4) Epoch 37, batch 100, loss[loss=0.1574, simple_loss=0.2339, pruned_loss=0.04043, over 23875.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2403, pruned_loss=0.04045, over 1884133.93 frames. ], batch size: 195, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:00:17,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:00:17,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:00:19,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:00:23,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:00:26,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:00:27,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 13:00:27,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:29,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 13:00:30,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:00:30,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:00:30,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:00:30,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:00:32,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 13:00:34,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:00:34,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:35,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:00:35,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:00:38,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 13:00:39,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:40,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:00:42,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:00:43,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:00:45,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1275713.3333333333, ans=0.0 2023-10-03 13:00:48,047 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 13:00:48,069 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 13:00:49,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:00:49,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:00:55,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:00:57,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:58,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:00:58,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1275713.3333333333, ans=0.1 2023-10-03 13:01:03,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:01:04,827 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 13:01:06,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:01:09,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:01:10,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:01:13,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:15,675 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.00 vs. limit=15.0 2023-10-03 13:01:17,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:18,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:01:21,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:01:23,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:23,863 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.18 vs. limit=15.0 2023-10-03 13:01:25,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:01:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:25,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:01:26,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:28,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 13:01:28,272 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 13:01:28,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:28,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:01:29,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:29,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:29,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:01:31,006 INFO [train.py:1046] (1/4) Epoch 37, batch 150, loss[loss=0.2099, simple_loss=0.2794, pruned_loss=0.07021, over 19160.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2409, pruned_loss=0.04076, over 2511532.72 frames. 
], batch size: 388, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:01:31,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:01:31,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:01:31,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:31,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1275913.3333333333, ans=0.1 2023-10-03 13:01:32,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:33,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:33,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:01:35,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:01:35,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1275913.3333333333, ans=0.1 2023-10-03 13:01:37,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:40,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 13:01:40,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:01:41,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:43,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:43,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:46,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:01:46,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:48,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 13:01:48,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 13:01:48,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 13:01:50,627 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.933e+02 2.120e+02 2.360e+02 3.157e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-03 13:01:52,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:01:52,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 13:01:53,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:01:55,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:55,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:57,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:57,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:59,948 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 13:02:01,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:02:05,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:02:08,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:02:09,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 13:02:11,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:02:13,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 13:02:13,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:02:15,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:02:17,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:02:18,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:02:19,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:19,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 13:02:24,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:26,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:26,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:02:26,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:02:29,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:31,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-03 13:02:33,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1276180.0, ans=0.0 2023-10-03 13:02:33,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1276180.0, ans=0.5 2023-10-03 13:02:34,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:02:35,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:02:35,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:02:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:02:37,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 13:02:39,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:02:39,179 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 13:02:42,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:02:45,196 INFO [train.py:1046] (1/4) Epoch 37, batch 200, loss[loss=0.1439, simple_loss=0.2274, pruned_loss=0.03025, over 24318.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2418, pruned_loss=0.04132, over 3003624.08 frames. 
], batch size: 61, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:02:45,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:02:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:02:49,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 13:02:49,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1276246.6666666667, ans=0.0 2023-10-03 13:02:50,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:02:50,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:53,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 13:02:53,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1276246.6666666667, ans=0.125 2023-10-03 13:02:55,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:02:56,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:58,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:02:59,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.87 vs. limit=10.0 2023-10-03 13:03:01,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:03:01,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:03:01,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:07,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1276313.3333333333, ans=0.125 2023-10-03 13:03:20,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:03:21,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:03:21,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:03:22,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:03:24,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:03:24,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 13:03:24,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:25,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:03:25,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:03:25,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:03:28,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 13:03:28,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:03:29,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:33,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:03:37,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:03:44,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:44,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:03:50,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:03:52,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 13:03:54,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:54,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:03:54,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:03:54,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:03:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 13:03:56,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:03:57,746 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 13:03:59,124 INFO [train.py:1046] (1/4) Epoch 37, batch 250, loss[loss=0.1561, simple_loss=0.2341, pruned_loss=0.03901, over 23758.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2401, pruned_loss=0.04077, over 3367041.81 frames. ], batch size: 179, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:03:59,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:00,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 13:04:02,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:03,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:04:05,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.16 vs. limit=15.0 2023-10-03 13:04:06,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:04:06,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:07,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:04:07,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1276580.0, ans=0.125 2023-10-03 13:04:09,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:04:19,365 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.868e+02 1.988e+02 2.171e+02 2.742e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 13:04:20,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:04:23,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:04:23,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:04:30,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:04:31,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:04:31,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:04:32,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:04:34,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:04:34,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:04:35,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:04:37,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:04:39,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 13:04:39,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:04:39,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 13:04:41,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:04:41,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:04:41,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:04:41,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:04:41,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:04:44,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:04:44,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:04:44,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:04:48,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:04:51,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1276780.0, ans=0.125 2023-10-03 13:04:52,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 13:04:56,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:05:00,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:05:02,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:05:06,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 13:05:08,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:05:08,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:05:09,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 13:05:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:05:11,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:05:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 13:05:12,715 INFO [train.py:1046] (1/4) Epoch 37, batch 300, loss[loss=0.1606, simple_loss=0.2524, pruned_loss=0.03442, over 24440.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2383, pruned_loss=0.04024, over 3668305.83 frames. 
], batch size: 69, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:05:16,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:05:16,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:05:20,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:05:20,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 13:05:22,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:05:23,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:05:23,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 13:05:23,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:05:26,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:05:29,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:05:29,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 13:05:32,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. 
Duration: 0.81 2023-10-03 13:05:34,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:36,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:05:38,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:38,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 13:05:38,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:05:39,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:05:43,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:05:43,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:05:47,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1277046.6666666667, ans=0.125 2023-10-03 13:05:48,549 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-10-03 13:05:49,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:05:49,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-03 13:05:50,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:05:51,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:54,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 13:05:54,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:05:59,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:05:59,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1277113.3333333333, ans=0.125 2023-10-03 13:06:01,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:06:01,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 13:06:06,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:06,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:06:09,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:10,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 13:06:10,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 13:06:10,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:06:12,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:13,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 13:06:15,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:15,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:16,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:06:18,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:18,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:25,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:06:25,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 13:06:26,766 INFO [train.py:1046] (1/4) Epoch 37, batch 350, loss[loss=0.1439, simple_loss=0.2236, pruned_loss=0.03216, over 24308.00 frames. 
], tot_loss[loss=0.1581, simple_loss=0.2365, pruned_loss=0.0398, over 3900013.50 frames. ], batch size: 61, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:06:26,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:30,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1277246.6666666667, ans=0.125 2023-10-03 13:06:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:06:36,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:36,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:38,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-10-03 13:06:39,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 13:06:41,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:06:41,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 13:06:44,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:46,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. 
Duration: 0.59 2023-10-03 13:06:48,107 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.877e+02 2.152e+02 2.421e+02 3.801e+02, threshold=4.304e+02, percent-clipped=0.0 2023-10-03 13:06:48,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:49,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 13:06:49,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1277313.3333333333, ans=0.0 2023-10-03 13:06:51,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:06:52,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:53,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:06:55,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:06:55,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:06:55,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:06:55,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:56,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 13:06:56,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1277380.0, ans=0.125 2023-10-03 13:06:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:06:57,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:57,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1277380.0, ans=0.1 2023-10-03 13:07:06,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:06,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:07:06,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:07:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:11,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 13:07:11,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:07:12,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1277446.6666666667, ans=0.125 2023-10-03 13:07:14,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:14,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:14,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:07:16,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 13:07:18,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:19,701 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 13:07:21,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 13:07:21,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:21,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1277446.6666666667, ans=0.125 2023-10-03 13:07:24,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:07:24,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. 
Duration: 0.45 2023-10-03 13:07:28,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:30,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:07:32,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:32,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:32,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:35,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:38,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1277513.3333333333, ans=0.0 2023-10-03 13:07:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:07:40,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:07:41,934 INFO [train.py:1046] (1/4) Epoch 37, batch 400, loss[loss=0.1523, simple_loss=0.2409, pruned_loss=0.03185, over 24634.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2363, pruned_loss=0.03949, over 4088279.18 frames. ], batch size: 68, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:07:42,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. 
Duration: 0.64 2023-10-03 13:07:42,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:42,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:43,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:07:44,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:47,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:49,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:50,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 13:07:52,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 13:07:52,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:53,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 13:07:53,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 13:07:59,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:59,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 13:07:59,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:07:59,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:59,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:08:00,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1277646.6666666667, ans=0.125 2023-10-03 13:08:01,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:08:03,874 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 13:08:03,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 13:08:08,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:08:09,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:08:09,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. 
Duration: 0.81 2023-10-03 13:08:09,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 13:08:12,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:08:12,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1277713.3333333333, ans=0.2 2023-10-03 13:08:13,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:08:22,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 13:08:25,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:08:25,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 13:08:27,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:08:30,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:08:30,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 13:08:33,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1277780.0, ans=0.125 2023-10-03 13:08:34,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 13:08:37,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:08:39,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:08:41,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:08:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 13:08:44,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 13:08:45,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 13:08:47,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:08:47,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:08:48,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=12.0 2023-10-03 13:08:51,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 13:08:52,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:08:54,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 13:08:54,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:08:55,622 INFO [train.py:1046] (1/4) Epoch 37, batch 450, loss[loss=0.1345, simple_loss=0.2186, pruned_loss=0.02523, over 21956.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2369, pruned_loss=0.03941, over 4239041.08 frames. ], batch size: 48, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:08:55,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 13:08:55,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:08:57,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:08:57,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:08:57,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 13:08:58,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:09:00,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:09:01,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 13:09:05,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1277913.3333333333, ans=0.025 2023-10-03 13:09:06,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1277913.3333333333, ans=0.125 2023-10-03 13:09:09,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:10,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:09:11,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 13:09:11,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 13:09:15,842 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.823e+02 2.027e+02 2.290e+02 3.468e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 13:09:17,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:09:20,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:21,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:09:23,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1278046.6666666667, ans=0.0 2023-10-03 13:09:27,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:09:27,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:09:28,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 13:09:30,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 13:09:31,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 13:09:31,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:09:33,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:09:33,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:09:34,293 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.65 vs. limit=10.0 2023-10-03 13:09:34,925 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 13:09:34,941 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-10-03 13:09:36,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:37,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:09:39,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 13:09:42,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:09:42,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:09:43,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:09:43,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1278113.3333333333, ans=0.0 2023-10-03 13:09:44,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 13:09:47,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:09:49,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:09:50,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:09:52,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. 
Duration: 0.83 2023-10-03 13:09:52,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1278180.0, ans=0.0 2023-10-03 13:09:55,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:09:56,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 13:09:57,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 13:09:59,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:10:05,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:10:06,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:10:08,436 INFO [train.py:1046] (1/4) Epoch 37, batch 500, loss[loss=0.1578, simple_loss=0.2481, pruned_loss=0.0338, over 24453.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2375, pruned_loss=0.03956, over 4344565.05 frames. ], batch size: 69, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:10:08,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:10:08,571 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 13:10:12,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:10:12,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:10:12,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:10:12,573 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 13:10:15,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 13:10:15,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:10:18,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:10:22,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 13:10:23,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:10:27,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:10:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:10:27,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:35,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:10:35,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:10:37,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:10:37,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:37,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 13:10:37,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:10:41,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:10:42,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:10:42,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:10:42,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:43,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 13:10:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 13:10:47,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:10:49,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:10:53,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 13:10:53,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1278446.6666666667, ans=0.125 2023-10-03 13:10:56,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:10:56,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:00,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:03,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:11:07,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1278513.3333333333, ans=0.125 2023-10-03 13:11:08,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:11:11,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 13:11:11,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:11,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:11:15,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 13:11:15,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:11:17,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:21,324 INFO [train.py:1046] (1/4) Epoch 37, batch 550, loss[loss=0.1368, simple_loss=0.2067, pruned_loss=0.03339, over 14593.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2382, pruned_loss=0.04, over 4405972.47 frames. ], batch size: 31, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:11:22,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 13:11:24,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 13:11:24,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-10-03 13:11:26,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:11:26,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:26,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:26,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:26,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:11:28,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:11:29,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:29,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1278580.0, ans=0.0 2023-10-03 13:11:30,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 13:11:30,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:11:35,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:11:36,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:11:41,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:11:41,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:41,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1278646.6666666667, ans=0.0 2023-10-03 13:11:42,925 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.845e+02 2.021e+02 2.216e+02 2.763e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 13:11:45,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 13:11:47,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 13:11:48,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:11:51,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:11:52,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:11:54,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 13:11:54,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1278713.3333333333, ans=0.1 2023-10-03 13:11:54,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1278713.3333333333, ans=0.125 2023-10-03 13:11:57,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:11:57,569 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 13:11:58,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:58,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:12:01,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:12:02,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:12:03,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:12:04,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:05,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 13:12:06,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-10-03 13:12:07,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:07,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:12:09,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:12:09,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:12:10,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:12:12,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:12:15,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:12:15,889 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.43 vs. limit=15.0 2023-10-03 13:12:16,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:16,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 13:12:18,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 13:12:19,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:19,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:12:21,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:21,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:12:21,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:12:28,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 13:12:32,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 13:12:34,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:12:34,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:12:35,428 INFO [train.py:1046] (1/4) Epoch 37, batch 600, loss[loss=0.1447, simple_loss=0.2072, pruned_loss=0.04106, over 23610.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2392, pruned_loss=0.04032, over 4469619.07 frames. ], batch size: 256, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:12:35,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:12:39,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1278913.3333333333, ans=0.0 2023-10-03 13:12:42,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:12:43,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:12:45,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 13:12:47,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:12:47,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:12:49,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1278980.0, ans=0.0 2023-10-03 13:12:50,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:51,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 13:12:52,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:12:58,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 13:13:00,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:13:00,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:13:02,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:13:07,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:13:07,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:13:07,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:13:16,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:13:21,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:13:23,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:13:23,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:13:30,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 13:13:30,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1279113.3333333333, ans=0.0 2023-10-03 13:13:34,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 13:13:34,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:13:34,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1279180.0, ans=0.125 2023-10-03 13:13:38,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 13:13:38,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:13:42,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 13:13:42,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:13:44,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:13:46,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.60 vs. limit=15.0 2023-10-03 13:13:49,930 INFO [train.py:1046] (1/4) Epoch 37, batch 650, loss[loss=0.1439, simple_loss=0.2213, pruned_loss=0.03328, over 20672.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03994, over 4528332.21 frames. ], batch size: 45, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:13:50,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 13:13:52,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.40 vs. 
limit=22.5 2023-10-03 13:13:52,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:13:54,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:13:54,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1279246.6666666667, ans=10.0 2023-10-03 13:13:54,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-10-03 13:13:55,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:13:56,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:00,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 13:14:00,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:14:04,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:14:04,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:08,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:12,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-10-03 13:14:13,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.868e+02 2.019e+02 2.279e+02 4.165e+02, threshold=4.037e+02, percent-clipped=1.0 2023-10-03 13:14:13,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:14:15,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:15,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1279313.3333333333, ans=6.0 2023-10-03 13:14:18,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:14:19,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:14:21,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:22,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:22,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:14:23,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:25,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 13:14:26,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:14:26,570 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 13:14:26,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:28,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:14:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:31,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:14:32,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:14:32,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:14:33,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 13:14:35,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:14:35,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:14:35,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 13:14:35,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:14:36,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:14:38,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 13:14:41,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 13:14:41,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:41,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:14:41,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:14:42,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:14:44,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:44,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1279446.6666666667, ans=0.125 2023-10-03 13:14:48,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:14:50,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:14:51,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:53,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1279513.3333333333, ans=0.0 2023-10-03 13:14:54,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:14:54,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:14:55,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:15:02,711 INFO [train.py:1046] (1/4) Epoch 37, batch 700, loss[loss=0.1539, simple_loss=0.2299, pruned_loss=0.03895, over 23897.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2379, pruned_loss=0.03989, over 4562827.66 frames. ], batch size: 195, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:15:02,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:15:02,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:04,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:15:07,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 13:15:07,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 13:15:09,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 13:15:12,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:13,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:15:14,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 13:15:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:20,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:15:23,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:23,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:15:25,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:15:28,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:15:31,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 13:15:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:15:33,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 13:15:34,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 13:15:35,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1279713.3333333333, ans=0.125 2023-10-03 13:15:38,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:15:38,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:15:38,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1279713.3333333333, ans=0.125 2023-10-03 13:15:39,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:15:44,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:15:45,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 13:15:51,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:15:51,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:15:51,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 13:15:53,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1279780.0, ans=0.125 2023-10-03 13:15:55,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:57,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:00,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:01,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.18 vs. limit=15.0 2023-10-03 13:16:06,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:16:06,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 13:16:10,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 13:16:10,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 13:16:11,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 13:16:15,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:16:15,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:16:16,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1279913.3333333333, ans=0.125 2023-10-03 13:16:16,835 INFO [train.py:1046] (1/4) Epoch 37, batch 750, loss[loss=0.165, simple_loss=0.2406, pruned_loss=0.04472, over 23614.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03959, over 4606944.41 frames. ], batch size: 256, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:16:16,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:16,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 13:16:21,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 13:16:21,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 13:16:21,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 13:16:22,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 13:16:22,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. 
Duration: 0.64 2023-10-03 13:16:24,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:16:24,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 13:16:25,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:26,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:16:28,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:16:28,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1279913.3333333333, ans=0.0 2023-10-03 13:16:31,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:31,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:16:32,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:16:37,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:16:38,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 13:16:38,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1279980.0, ans=0.125 2023-10-03 13:16:40,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:16:42,003 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.819e+02 1.975e+02 2.168e+02 3.086e+02, threshold=3.950e+02, percent-clipped=0.0 2023-10-03 13:16:42,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:16:42,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:43,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 13:16:44,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:16:44,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:48,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:50,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:16:50,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. 
Duration: 0.65 2023-10-03 13:16:50,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:16:51,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 13:16:51,616 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 13:16:53,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 13:16:53,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:16:53,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:16:54,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:16:59,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1280046.6666666667, ans=0.1 2023-10-03 13:17:01,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:17:01,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:01,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:17:05,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:17:06,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:07,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 13:17:08,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:17:08,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 13:17:09,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:17:13,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:17:13,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 13:17:14,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:19,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:20,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1280180.0, ans=0.0 2023-10-03 13:17:21,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:17:21,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:17:22,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:17:26,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 13:17:26,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:17:27,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:17:30,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:17:30,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:30,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.40 vs. limit=15.0 2023-10-03 13:17:30,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1280180.0, ans=6.0 2023-10-03 13:17:32,785 INFO [train.py:1046] (1/4) Epoch 37, batch 800, loss[loss=0.1689, simple_loss=0.2532, pruned_loss=0.04234, over 24387.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2378, pruned_loss=0.0396, over 4635576.30 frames. ], batch size: 77, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:17:32,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:32,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 13:17:33,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.26 vs. limit=15.0 2023-10-03 13:17:41,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:41,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:43,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:17:43,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:44,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:44,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:47,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:48,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.72 vs. limit=15.0 2023-10-03 13:17:50,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:50,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 13:17:51,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.74 vs. limit=5.0 2023-10-03 13:17:55,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 13:17:55,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:56,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:56,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:17:56,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:17:56,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 13:17:58,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:58,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 13:18:01,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:03,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:18:05,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 13:18:06,502 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.19 vs. limit=15.0 2023-10-03 13:18:06,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:18:08,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:08,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:12,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:18:12,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:18:13,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 13:18:14,569 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.57 vs. limit=6.0 2023-10-03 13:18:15,487 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 13:18:15,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 13:18:15,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 13:18:15,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:18:15,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1280446.6666666667, ans=0.2 2023-10-03 13:18:17,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:18,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:18:24,170 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.11 vs. limit=15.0 2023-10-03 13:18:24,821 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 13:18:24,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 13:18:26,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:18:27,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:18:30,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:18:33,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:35,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. 
Duration: 0.91 2023-10-03 13:18:36,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:18:38,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1280513.3333333333, ans=0.1 2023-10-03 13:18:39,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 13:18:41,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1280513.3333333333, ans=0.2 2023-10-03 13:18:41,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-10-03 13:18:46,227 INFO [train.py:1046] (1/4) Epoch 37, batch 850, loss[loss=0.1605, simple_loss=0.2522, pruned_loss=0.03435, over 24479.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2388, pruned_loss=0.03974, over 4660068.35 frames. ], batch size: 69, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:18:46,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:18:46,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1280580.0, ans=0.125 2023-10-03 13:18:48,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:18:49,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 13:18:49,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 13:18:52,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:53,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 13:18:53,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:18:54,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:18:55,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:18:57,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:18:58,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:18:59,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 13:19:01,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 13:19:01,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 13:19:02,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:19:02,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:19:05,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:19:07,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:19:07,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.37 vs. limit=15.0 2023-10-03 13:19:10,505 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.795e+02 1.931e+02 2.229e+02 3.193e+02, threshold=3.862e+02, percent-clipped=0.0 2023-10-03 13:19:10,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:19:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:12,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 13:19:16,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 13:19:20,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:19:20,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-03 13:19:23,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 13:19:25,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 13:19:27,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 13:19:27,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:19:27,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:19:27,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:19:31,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:31,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 13:19:32,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:19:33,077 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:19:34,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:19:35,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:19:35,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:19:37,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:19:39,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:19:40,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 13:19:42,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:19:42,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:19:43,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:19:43,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:19:44,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:46,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:47,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 13:19:49,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:19:51,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:19:52,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:19:59,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:20:01,122 INFO [train.py:1046] (1/4) Epoch 37, batch 900, loss[loss=0.1511, simple_loss=0.2422, pruned_loss=0.03002, over 24649.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2394, pruned_loss=0.03991, over 4674932.72 frames. ], batch size: 65, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:20:01,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:20:01,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 13:20:01,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:20:01,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:20:02,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. 
Duration: 0.77 2023-10-03 13:20:04,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1280913.3333333333, ans=0.125 2023-10-03 13:20:07,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1280913.3333333333, ans=0.125 2023-10-03 13:20:09,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:20:09,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1280913.3333333333, ans=0.0 2023-10-03 13:20:13,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:20:13,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 13:20:16,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:20:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 13:20:17,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 13:20:17,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:20:17,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:20:19,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 13:20:19,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:20:24,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1280980.0, ans=0.125 2023-10-03 13:20:31,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:20:31,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:20:31,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:20:32,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:20:37,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 13:20:40,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:20:43,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1281046.6666666667, ans=0.0 2023-10-03 13:20:44,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:20:45,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:20:46,048 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. 
Duration: 0.958 2023-10-03 13:20:47,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 13:20:53,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:20:53,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:20:53,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:21:00,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:00,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:02,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 13:21:02,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:21:03,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 13:21:05,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:21:05,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:06,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:21:06,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:10,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1281180.0, ans=0.125 2023-10-03 13:21:11,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 13:21:11,617 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 13:21:11,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:21:11,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 13:21:15,632 INFO [train.py:1046] (1/4) Epoch 37, batch 950, loss[loss=0.1491, simple_loss=0.2264, pruned_loss=0.03589, over 23529.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2392, pruned_loss=0.04002, over 4687747.72 frames. ], batch size: 134, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:21:15,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:18,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1281246.6666666667, ans=0.125 2023-10-03 13:21:19,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 13:21:22,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:21:26,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:26,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:26,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:21:28,964 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 13:21:31,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:31,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:21:33,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:21:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:21:34,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 13:21:36,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:21:36,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:38,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-10-03 13:21:38,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:39,980 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.894e+02 2.023e+02 2.225e+02 2.799e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-03 13:21:44,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:44,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:44,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:44,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1281380.0, ans=10.0 2023-10-03 13:21:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 13:21:47,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 13:21:48,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:21:50,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.64 vs. limit=15.0 2023-10-03 13:21:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 13:21:56,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:21:56,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:22:00,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 13:22:02,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 13:22:02,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:22:03,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:03,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:03,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:22:06,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 13:22:06,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:22:10,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:10,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:22:10,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 13:22:10,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:22:10,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:22:11,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 13:22:14,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:22:17,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:22:18,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1281513.3333333333, ans=0.125 2023-10-03 13:22:21,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:22:22,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 13:22:22,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 13:22:27,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:30,584 INFO [train.py:1046] (1/4) Epoch 37, batch 1000, loss[loss=0.1517, simple_loss=0.2165, pruned_loss=0.04345, over 22847.00 frames. 
], tot_loss[loss=0.1591, simple_loss=0.2383, pruned_loss=0.03994, over 4684897.06 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:22:30,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 13:22:31,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:22:38,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:22:38,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1281580.0, ans=0.0 2023-10-03 13:22:38,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1281580.0, ans=0.0 2023-10-03 13:22:39,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 13:22:39,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 13:22:44,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:22:44,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:22:45,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:46,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. 
Duration: 0.78 2023-10-03 13:22:51,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 13:22:52,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 13:22:52,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:22:54,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 13:22:56,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 13:22:57,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 13:22:59,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:22:59,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:08,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:23:09,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:23:10,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:11,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:23:11,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 13:23:11,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:23:11,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:23:11,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:23:13,206 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 13:23:16,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1281780.0, ans=0.125 2023-10-03 13:23:18,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 13:23:18,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 13:23:19,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 13:23:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:23:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:28,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 13:23:30,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:31,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:23:31,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 13:23:34,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:23:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 13:23:35,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 13:23:35,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:23:37,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:23:38,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:23:41,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:23:44,298 INFO [train.py:1046] (1/4) Epoch 37, batch 1050, loss[loss=0.1598, simple_loss=0.2491, pruned_loss=0.03526, over 24628.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2375, pruned_loss=0.03954, over 4686903.79 frames. 
], batch size: 73, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:23:44,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:23:47,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:23:47,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:23:48,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1281913.3333333333, ans=0.2 2023-10-03 13:23:49,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:23:49,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:50,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:23:53,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:23:56,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:23:58,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:23:58,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 13:23:59,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:23:59,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:24:00,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1281980.0, ans=0.125 2023-10-03 13:24:01,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 13:24:02,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:24:02,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 13:24:03,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1281980.0, ans=0.0 2023-10-03 13:24:05,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:24:05,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 13:24:05,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:24:08,990 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.790e+02 1.943e+02 2.142e+02 2.975e+02, threshold=3.887e+02, percent-clipped=0.0 2023-10-03 13:24:13,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:24:14,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:24:14,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:24:16,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 13:24:16,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 13:24:16,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:24:16,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1282046.6666666667, ans=0.07 2023-10-03 13:24:19,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 13:24:22,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 13:24:22,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:25,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:24:27,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:24:28,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 13:24:28,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:24:32,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:24:37,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 13:24:38,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 13:24:38,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 13:24:38,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:24:38,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:24:40,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 13:24:44,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:24:45,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:24:45,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:24:47,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 13:24:47,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:50,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:50,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 13:24:50,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1282180.0, ans=0.125 2023-10-03 13:24:53,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:24:53,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 13:24:53,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 13:24:55,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:24:55,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.74 vs. limit=12.0 2023-10-03 13:24:58,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:24:59,840 INFO [train.py:1046] (1/4) Epoch 37, batch 1100, loss[loss=0.1608, simple_loss=0.2347, pruned_loss=0.04339, over 23728.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2369, pruned_loss=0.03954, over 4692120.63 frames. 
], batch size: 164, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:25:02,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:25:07,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:25:08,108 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.15 vs. limit=22.5 2023-10-03 13:25:08,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:25:09,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:25:09,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 13:25:11,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:25:14,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:25:15,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:25:18,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:25:18,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. 
Duration: 0.75 2023-10-03 13:25:18,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:25:20,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:25:20,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:25:23,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:25:24,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:25:30,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:25:33,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 13:25:35,250 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 13:25:35,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:37,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:39,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:25:39,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:25:41,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 13:25:42,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:25:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:25:42,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:25:42,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:42,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1282446.6666666667, ans=0.125 2023-10-03 13:25:43,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 13:25:49,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:25:49,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 13:25:52,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:25:56,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:25:59,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. 
Duration: 0.99 2023-10-03 13:25:59,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 13:26:02,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:03,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:03,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:26:05,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 13:26:06,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:26:06,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:26:07,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 13:26:09,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:26:09,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 13:26:10,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:26:11,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 13:26:12,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:26:13,799 INFO [train.py:1046] (1/4) Epoch 37, batch 1150, loss[loss=0.1376, simple_loss=0.2162, pruned_loss=0.02955, over 24312.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2377, pruned_loss=0.03959, over 4705816.95 frames. ], batch size: 56, lr: 2.76e-03, grad_scale: 8.0 2023-10-03 13:26:15,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:16,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:26:18,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:18,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:26:19,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 13:26:19,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:26:21,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 13:26:22,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:22,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 13:26:27,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 13:26:30,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:34,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:34,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:26:36,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 13:26:36,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:26:36,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:26:39,415 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.859e+02 2.016e+02 2.195e+02 3.712e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-03 13:26:42,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 13:26:43,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:43,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:26:58,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:27:00,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 13:27:00,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:00,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:07,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 13:27:10,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:13,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1282846.6666666667, ans=0.0 2023-10-03 13:27:15,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.38 vs. limit=22.5 2023-10-03 13:27:16,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 13:27:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:19,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:27:19,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 13:27:20,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:27:21,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:27:26,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:27:26,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:27:28,279 INFO [train.py:1046] (1/4) Epoch 37, batch 1200, loss[loss=0.1817, simple_loss=0.2536, pruned_loss=0.05493, over 23826.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2384, pruned_loss=0.03996, over 4700820.77 frames. ], batch size: 179, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:27:29,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:27:29,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:29,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:27:33,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:27:34,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:27:35,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 13:27:35,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:38,749 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 13:27:42,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 13:27:42,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1282980.0, ans=0.125 2023-10-03 13:27:44,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:27:47,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:27:50,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:27:51,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:27:51,722 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 13:27:51,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:58,101 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.10 vs. limit=22.5 2023-10-03 13:28:01,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 13:28:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:28:01,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 13:28:03,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:28:05,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 13:28:09,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 13:28:09,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:28:10,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:28:10,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:28:12,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:28:15,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:28:15,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:28:16,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 13:28:16,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 13:28:16,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:28:18,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:28:18,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:28:20,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1283113.3333333333, ans=0.05 2023-10-03 13:28:21,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:28:21,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:28:23,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:28:26,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:28:29,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 13:28:31,748 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 13:28:35,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 13:28:36,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:28:37,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:28:39,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:28:42,544 INFO [train.py:1046] (1/4) Epoch 37, batch 1250, loss[loss=0.1388, simple_loss=0.2175, pruned_loss=0.03007, over 24410.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2396, pruned_loss=0.04029, over 4705540.31 frames. ], batch size: 58, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:28:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 13:28:45,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:28:46,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:28:46,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 13:28:49,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:28:49,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:28:52,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 13:28:53,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:28:53,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:28:54,759 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.74 vs. limit=10.0 2023-10-03 13:28:55,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:28:56,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:29:00,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 13:29:00,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:29:01,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:01,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:29:03,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:06,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 13:29:06,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:29:07,849 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.893e+02 2.075e+02 2.273e+02 3.186e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-03 13:29:11,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 13:29:11,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1283380.0, ans=0.1 2023-10-03 13:29:12,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:29:15,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:29:16,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=12.0 2023-10-03 13:29:16,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 13:29:16,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:29:16,903 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 13:29:18,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:18,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:29:21,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:23,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:23,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:29:25,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 13:29:25,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 13:29:25,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 13:29:26,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1283446.6666666667, ans=0.0 2023-10-03 13:29:28,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:29:29,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 13:29:29,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:31,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 13:29:31,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 13:29:33,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 13:29:33,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:29:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:29:34,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:29:34,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:29:36,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 13:29:39,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:40,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:29:42,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:29:43,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1283513.3333333333, ans=0.125 2023-10-03 13:29:46,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-10-03 13:29:49,239 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:29:50,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 13:29:55,657 INFO [train.py:1046] (1/4) Epoch 37, batch 1300, loss[loss=0.1519, simple_loss=0.2401, pruned_loss=0.03182, over 24432.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2399, pruned_loss=0.04032, over 4705243.44 frames. ], batch size: 63, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:29:55,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:29:55,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:29:57,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:29:58,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:30:01,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:30:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 13:30:05,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 13:30:05,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:30:07,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 13:30:12,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:30:14,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:15,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:30:17,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:30:19,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:19,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1283646.6666666667, ans=0.0 2023-10-03 13:30:20,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=15.0 2023-10-03 13:30:20,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:30:20,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 13:30:21,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 13:30:28,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:30:28,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:30:29,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 13:30:29,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:30:32,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:30:34,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:30:36,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 13:30:36,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:30:36,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 13:30:36,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1283713.3333333333, ans=0.125 2023-10-03 13:30:37,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 13:30:42,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:30:42,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:30:46,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 13:30:46,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 13:30:48,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 13:30:48,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1283780.0, ans=0.125 2023-10-03 13:30:52,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:30:55,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 13:30:56,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:31:03,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 13:31:07,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:31:08,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1283913.3333333333, ans=0.125 2023-10-03 13:31:10,524 INFO [train.py:1046] (1/4) Epoch 37, batch 1350, loss[loss=0.1526, simple_loss=0.2393, pruned_loss=0.03291, over 24469.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2389, pruned_loss=0.03982, over 4712357.46 frames. ], batch size: 66, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:31:11,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:13,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:31:13,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:31:17,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:31:17,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:31:22,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:31:23,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 13:31:23,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:31:23,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 13:31:27,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 13:31:27,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:31:29,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:31:29,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 13:31:30,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 13:31:32,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1283980.0, ans=0.2 2023-10-03 13:31:34,549 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.514e+02 1.947e+02 2.174e+02 2.484e+02 3.426e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-03 13:31:34,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 13:31:35,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:35,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 13:31:48,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:31:54,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1284113.3333333333, ans=0.1 2023-10-03 13:31:58,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:58,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:31:58,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 13:32:02,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:32:02,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 13:32:03,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:32:03,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:32:06,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:32:09,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 13:32:10,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 13:32:11,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1284180.0, ans=10.0 2023-10-03 13:32:18,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 13:32:19,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 13:32:23,885 INFO [train.py:1046] (1/4) Epoch 37, batch 1400, loss[loss=0.1596, simple_loss=0.2459, pruned_loss=0.03663, over 24046.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.03948, over 4720729.06 frames. ], batch size: 80, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:32:24,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1284246.6666666667, ans=0.125 2023-10-03 13:32:25,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 13:32:26,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:32:28,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1284246.6666666667, ans=0.125 2023-10-03 13:32:29,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:32:29,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:32:33,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-10-03 13:32:34,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.84 vs. limit=15.0 2023-10-03 13:32:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 13:32:36,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1284313.3333333333, ans=0.125 2023-10-03 13:32:44,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:32:44,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1284313.3333333333, ans=0.0 2023-10-03 13:32:46,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:32:47,520 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=15.0 2023-10-03 13:32:49,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1284313.3333333333, ans=0.125 2023-10-03 13:32:50,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:32:50,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:32:53,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 13:32:53,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1284380.0, ans=0.0 2023-10-03 13:32:54,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 13:33:01,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:01,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:05,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 13:33:05,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:33:07,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:33:07,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:33:07,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:33:08,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:33:08,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:33:08,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 13:33:10,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 13:33:11,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:33:15,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:18,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:33:21,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1284513.3333333333, ans=0.0 2023-10-03 13:33:25,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 13:33:27,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:33:27,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:33:31,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 13:33:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:33:32,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:33:35,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 13:33:35,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1284580.0, ans=0.2 2023-10-03 13:33:36,839 INFO [train.py:1046] (1/4) Epoch 37, batch 1450, loss[loss=0.1624, simple_loss=0.2481, pruned_loss=0.03838, over 24466.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2374, pruned_loss=0.03904, over 4719834.59 frames. ], batch size: 66, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:33:38,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:33:38,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:38,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 13:33:44,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:33:44,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1284580.0, ans=0.125 2023-10-03 13:33:45,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:33:48,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:33:48,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 13:33:50,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 13:33:51,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 13:33:51,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:51,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1284646.6666666667, ans=0.125 2023-10-03 13:33:53,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:33:53,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 13:33:54,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:33:54,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:33:55,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 13:33:55,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:33:57,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:33:58,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:59,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 13:34:02,485 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.928e+02 2.172e+02 2.508e+02 3.657e+02, threshold=4.343e+02, percent-clipped=0.0 2023-10-03 13:34:03,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:34:03,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:34:05,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:34:05,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:34:08,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.90 vs. limit=15.0 2023-10-03 13:34:09,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:34:09,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:34:09,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:34:09,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:34:13,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1284713.3333333333, ans=22.5 2023-10-03 13:34:13,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 13:34:17,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:34:20,521 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 13:34:21,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:34:23,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:34:24,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:26,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 13:34:28,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 13:34:32,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 13:34:34,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:34:38,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:34:38,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:34:38,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 13:34:40,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1284846.6666666667, ans=0.125 2023-10-03 13:34:41,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 13:34:41,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 13:34:42,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:44,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:34:44,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1284846.6666666667, ans=0.0 2023-10-03 13:34:51,418 INFO [train.py:1046] (1/4) Epoch 37, batch 1500, loss[loss=0.1622, simple_loss=0.252, pruned_loss=0.03619, over 24578.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2381, pruned_loss=0.03911, over 4729713.24 frames. ], batch size: 71, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:34:57,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. 
Duration: 0.77 2023-10-03 13:34:57,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:34:57,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:34:57,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1284913.3333333333, ans=0.0 2023-10-03 13:34:58,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:59,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:34:59,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:35:01,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 13:35:02,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:35:02,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:35:02,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:35:03,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:35:04,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:35:05,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:10,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:10,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 13:35:12,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:35:12,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:35:12,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:35:15,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 13:35:20,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 13:35:21,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:35:21,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 13:35:23,395 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:35:24,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 13:35:25,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:35:25,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1285046.6666666667, ans=0.0 2023-10-03 13:35:27,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:35:27,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:35:29,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 13:35:29,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:35:29,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:35:29,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 13:35:31,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:35:35,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:35:35,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. 
Duration: 0.94 2023-10-03 13:35:37,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1285113.3333333333, ans=0.125 2023-10-03 13:35:39,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:35:41,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:35:45,600 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 13:35:48,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:48,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 13:35:49,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:35:49,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:35:49,541 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 13:35:50,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.03 vs. limit=6.0 2023-10-03 13:35:51,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:35:52,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. 
Duration: 0.65 2023-10-03 13:35:54,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:54,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1285180.0, ans=0.0 2023-10-03 13:35:57,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:57,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:58,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:58,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:58,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:35:59,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 13:35:59,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 13:36:01,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:36:01,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 13:36:02,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-10-03 13:36:02,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1285246.6666666667, ans=0.07 2023-10-03 13:36:03,873 INFO [train.py:1046] (1/4) Epoch 37, batch 1550, loss[loss=0.16, simple_loss=0.2411, pruned_loss=0.0395, over 23396.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2387, pruned_loss=0.03938, over 4720594.06 frames. ], batch size: 93, lr: 2.76e-03, grad_scale: 8.0 2023-10-03 13:36:04,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:36:05,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:06,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:36:06,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:36:08,462 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.72 vs. limit=10.0 2023-10-03 13:36:09,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:10,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:11,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.70 vs. limit=22.5 2023-10-03 13:36:12,055 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. 
Duration: 0.896 2023-10-03 13:36:13,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:13,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:36:13,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:36:15,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1285246.6666666667, ans=0.0 2023-10-03 13:36:18,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:36:18,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 13:36:20,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1285313.3333333333, ans=0.0 2023-10-03 13:36:21,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:36:21,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 13:36:21,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 13:36:21,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 13:36:23,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:36:25,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:27,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:36:28,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1285313.3333333333, ans=0.2 2023-10-03 13:36:29,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 13:36:29,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 13:36:30,488 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.843e+02 2.005e+02 2.170e+02 3.085e+02, threshold=4.010e+02, percent-clipped=0.0 2023-10-03 13:36:30,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1285313.3333333333, ans=0.2 2023-10-03 13:36:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:40,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:36:40,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:36:40,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:36:41,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. 
Duration: 0.41 2023-10-03 13:36:46,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:36:48,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:48,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1285446.6666666667, ans=0.0 2023-10-03 13:36:51,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:36:54,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:36:54,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:54,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 13:36:55,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:36:56,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:36:56,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:57,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 13:36:57,770 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-10-03 13:37:00,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:04,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 13:37:11,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:37:11,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:37:12,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 13:37:14,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:37:14,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:37:14,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:37:15,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:37:16,943 INFO [train.py:1046] (1/4) Epoch 37, batch 1600, loss[loss=0.1597, simple_loss=0.2461, pruned_loss=0.03665, over 24407.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2392, pruned_loss=0.04001, over 4715903.93 frames. ], batch size: 77, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:37:17,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 13:37:20,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:22,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 13:37:23,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 13:37:25,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 13:37:28,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:37:30,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 13:37:31,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:37:32,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:37:38,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:37:40,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 13:37:42,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:37:44,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. 
Duration: 0.83 2023-10-03 13:37:44,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:44,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 13:37:50,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 13:37:58,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:38:00,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 13:38:00,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:38:00,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:38:00,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:38:03,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 13:38:03,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1285780.0, ans=0.125 2023-10-03 13:38:06,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1285780.0, ans=0.0 2023-10-03 13:38:07,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-03 13:38:10,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:38:10,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:10,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:10,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1285780.0, ans=0.0 2023-10-03 13:38:11,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:38:13,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:38:13,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:38:16,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:38:22,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:22,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:38:23,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 13:38:23,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 13:38:24,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.13 vs. limit=10.0 2023-10-03 13:38:25,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 13:38:30,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:38:31,963 INFO [train.py:1046] (1/4) Epoch 37, batch 1650, loss[loss=0.1709, simple_loss=0.2406, pruned_loss=0.05054, over 23840.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2395, pruned_loss=0.04009, over 4720418.63 frames. ], batch size: 212, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:38:32,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:38:33,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:38:33,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 13:38:33,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 13:38:33,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 13:38:34,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 13:38:39,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 13:38:40,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:38:40,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:38:41,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:38:43,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:38:44,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 13:38:47,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:38:47,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:38:47,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:38:47,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:38:48,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 13:38:48,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 13:38:56,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 13:38:56,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:38:58,189 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.064e+02 2.320e+02 3.499e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-03 13:38:58,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1285980.0, ans=0.125 2023-10-03 13:39:06,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 13:39:08,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:11,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 13:39:12,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:14,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:39:15,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:39:15,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:18,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 13:39:18,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:19,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:39:19,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:21,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:39:22,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:39:22,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:39:22,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:39:26,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:39:27,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 13:39:29,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:39:30,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 13:39:32,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. 
Duration: 0.69 2023-10-03 13:39:34,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 13:39:34,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:39:34,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:39:34,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:34,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:34,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 13:39:37,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:38,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:39:39,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:40,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1286180.0, ans=0.0 2023-10-03 13:39:42,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 13:39:45,111 INFO [train.py:1046] (1/4) Epoch 37, batch 1700, loss[loss=0.1648, simple_loss=0.2515, pruned_loss=0.03903, over 23939.00 frames. 
], tot_loss[loss=0.1592, simple_loss=0.2392, pruned_loss=0.03961, over 4729958.69 frames. ], batch size: 86, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:39:46,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:46,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:39:47,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 13:39:49,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:39:49,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:39:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:39:50,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:39:50,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:39:52,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 13:39:53,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 13:39:53,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1286246.6666666667, ans=0.0 2023-10-03 13:40:02,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:04,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:40:09,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:40:10,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:40:10,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:40:11,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:40:13,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 13:40:15,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:40:15,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:17,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 13:40:17,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:40:20,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 13:40:20,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 13:40:21,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:22,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 13:40:23,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:40:31,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:31,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1286446.6666666667, ans=0.0 2023-10-03 13:40:33,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:40:34,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:40:35,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:40:35,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. 
Duration: 0.363625 2023-10-03 13:40:35,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:40:37,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:37,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 13:40:38,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:40:38,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:40:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:38,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:40:41,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:40:41,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:40:42,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:40:42,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:40:44,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 13:40:44,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1286513.3333333333, ans=0.125 2023-10-03 13:40:48,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:49,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 13:40:51,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:51,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:54,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 13:40:58,886 INFO [train.py:1046] (1/4) Epoch 37, batch 1750, loss[loss=0.1639, simple_loss=0.2415, pruned_loss=0.04322, over 23672.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2377, pruned_loss=0.03956, over 4718704.15 frames. ], batch size: 232, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:41:00,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:02,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:02,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:41:03,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-03 13:41:03,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:41:06,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:41:06,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:11,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 13:41:13,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1286646.6666666667, ans=0.125 2023-10-03 13:41:14,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:15,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 13:41:17,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:41:17,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:41:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:41:21,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1286646.6666666667, ans=0.0 2023-10-03 13:41:23,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. 
Duration: 0.43 2023-10-03 13:41:24,679 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.864e+02 2.125e+02 2.477e+02 3.687e+02, threshold=4.251e+02, percent-clipped=0.0 2023-10-03 13:41:24,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:41:24,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 13:41:34,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:41:35,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:41:37,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:41:40,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:40,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:41:41,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:41:44,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:45,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 13:41:46,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:41:48,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 13:41:49,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:41:52,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 13:41:52,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:41:54,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:55,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:41:59,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:42:00,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:42:01,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:42:03,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 13:42:04,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1286846.6666666667, ans=0.5 2023-10-03 13:42:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:42:09,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1286846.6666666667, ans=0.0 2023-10-03 13:42:10,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:42:12,022 INFO [train.py:1046] (1/4) Epoch 37, batch 1800, loss[loss=0.1546, simple_loss=0.2433, pruned_loss=0.03297, over 24450.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03919, over 4722087.83 frames. ], batch size: 66, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:42:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:42:12,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 13:42:12,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:42:14,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:42:14,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:14,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 13:42:14,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:42:16,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:42:16,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1286913.3333333333, ans=0.0 2023-10-03 13:42:19,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:42:19,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:42:20,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:42:23,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:42:24,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 13:42:26,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:42:29,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:42:31,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1286980.0, ans=0.125 2023-10-03 13:42:31,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1286980.0, ans=0.125 2023-10-03 13:42:32,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:32,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:34,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:42:35,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:42:37,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 13:42:37,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:37,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1286980.0, ans=0.125 2023-10-03 13:42:40,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:42,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 13:42:45,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-10-03 13:42:45,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 13:42:45,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:42:48,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:48,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:42:48,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:42:55,181 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 13:42:57,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:42:58,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:59,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 13:42:59,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 13:42:59,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1287113.3333333333, ans=0.125 2023-10-03 13:43:00,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 13:43:00,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:43:02,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:43:05,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 13:43:11,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:43:11,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 13:43:12,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:43:12,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:43:12,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:43:14,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 13:43:15,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:43:15,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:43:18,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-03 13:43:18,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:43:21,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:43:21,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:43:23,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:43:24,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:43:24,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:43:25,739 INFO [train.py:1046] (1/4) Epoch 37, batch 1850, loss[loss=0.1552, simple_loss=0.2488, pruned_loss=0.03076, over 24513.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2379, pruned_loss=0.03959, over 4715504.83 frames. ], batch size: 71, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:43:25,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:43:25,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:43:29,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:43:30,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 13:43:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:43:38,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 13:43:40,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 13:43:43,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 13:43:46,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:43:46,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 13:43:46,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 13:43:47,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1287313.3333333333, ans=0.1 2023-10-03 13:43:52,037 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.939e+02 2.095e+02 2.336e+02 4.113e+02, threshold=4.190e+02, percent-clipped=0.0 2023-10-03 13:43:52,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1287313.3333333333, ans=0.0 2023-10-03 13:43:56,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:43:57,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. 
Duration: 0.98 2023-10-03 13:44:01,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:44:01,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:44:05,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1287380.0, ans=0.0 2023-10-03 13:44:07,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 13:44:07,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:07,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:44:08,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:44:10,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:44:14,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:44:14,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1287446.6666666667, ans=0.09899494936611666 2023-10-03 13:44:17,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:44:17,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:44:17,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:44:18,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:44:21,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:44:23,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 13:44:24,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:44:28,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:44:30,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:44:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 13:44:30,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 13:44:32,313 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 13:44:33,666 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-03 13:44:33,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1287513.3333333333, ans=0.125 2023-10-03 13:44:35,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:44:35,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:44:35,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:44:35,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:36,919 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 13:44:36,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:44:36,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:37,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-10-03 13:44:38,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:44:39,619 INFO [train.py:1046] (1/4) Epoch 37, batch 1900, loss[loss=0.1603, simple_loss=0.2508, pruned_loss=0.03489, over 24671.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2379, pruned_loss=0.03927, over 4731300.54 frames. 
], batch size: 73, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:44:39,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:44:39,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:44:39,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 13:44:41,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:41,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 13:44:41,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:44:41,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1287580.0, ans=0.2 2023-10-03 13:44:43,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:48,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:51,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:44:53,077 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 13:44:53,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-10-03 13:44:54,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:44:55,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:44:55,952 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 13:44:57,278 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 13:45:00,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 13:45:01,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:45:05,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 13:45:06,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 13:45:15,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 13:45:18,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 13:45:18,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:45:18,643 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 13:45:18,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-10-03 13:45:19,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 13:45:19,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 13:45:20,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:45:20,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1287713.3333333333, ans=0.125 2023-10-03 13:45:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 13:45:27,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:45:28,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1287780.0, ans=0.125 2023-10-03 13:45:29,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:45:29,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 13:45:32,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:45:33,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1287780.0, ans=0.0 2023-10-03 13:45:34,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. 
Duration: 0.82 2023-10-03 13:45:34,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:45:39,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:45:39,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:45:39,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:45:41,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:45:41,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:45:42,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 13:45:44,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:45:45,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:45:45,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:45:49,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:45:49,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 13:45:49,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:45:50,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:45:54,085 INFO [train.py:1046] (1/4) Epoch 37, batch 1950, loss[loss=0.1678, simple_loss=0.2579, pruned_loss=0.03889, over 24592.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.239, pruned_loss=0.03984, over 4718368.70 frames. ], batch size: 71, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:45:54,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:45:56,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:45:56,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:45:56,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:46:00,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 13:46:00,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:46:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:01,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:46:05,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:46:05,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:05,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:07,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:46:10,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:46:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:46:11,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:46:11,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:16,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:18,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:46:18,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:18,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 13:46:18,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 13:46:19,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:46:19,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:46:19,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1287980.0, ans=0.125 2023-10-03 13:46:20,727 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.916e+02 2.137e+02 2.276e+02 3.150e+02, threshold=4.275e+02, percent-clipped=0.0 2023-10-03 13:46:20,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:24,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:26,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:46:31,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:46:35,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:46:36,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:46:36,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-10-03 13:46:37,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:46:38,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1288113.3333333333, ans=0.125 2023-10-03 13:46:43,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:46:44,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:46:44,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:46:50,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1288113.3333333333, ans=0.125 2023-10-03 13:46:52,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:53,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:56,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:57,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:59,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 13:46:59,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:47:00,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 13:47:00,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:47:02,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:47:03,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 13:47:04,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1288180.0, ans=0.125 2023-10-03 13:47:05,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:47:08,072 INFO [train.py:1046] (1/4) Epoch 37, batch 2000, loss[loss=0.1622, simple_loss=0.2488, pruned_loss=0.0378, over 24364.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2389, pruned_loss=0.03973, over 4726920.73 frames. ], batch size: 77, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:47:09,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:47:12,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:47:12,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 13:47:15,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:47:15,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:19,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 13:47:21,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:47:22,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:47:25,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 13:47:26,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1288313.3333333333, ans=0.0 2023-10-03 13:47:27,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:47:27,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:47:29,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:47:30,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 13:47:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:47:34,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:34,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:35,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 13:47:35,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:47:37,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 13:47:37,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:47:40,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:47:40,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:47:40,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:42,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:47:42,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 13:47:42,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1288380.0, ans=0.2 2023-10-03 13:47:44,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 13:47:46,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 13:47:46,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:47:46,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:47:52,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:54,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:47:54,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:47:54,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:47:57,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:47:57,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:58,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 13:47:58,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:59,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:00,352 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.49 vs. limit=15.0 2023-10-03 13:48:01,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:48:03,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 13:48:09,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:48:09,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1288513.3333333333, ans=0.125 2023-10-03 13:48:10,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:13,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:13,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:48:16,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:48:19,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:48:19,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:20,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:48:21,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:48:22,346 INFO [train.py:1046] (1/4) Epoch 37, batch 2050, loss[loss=0.1526, simple_loss=0.2142, pruned_loss=0.04552, over 19452.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03992, over 4716276.98 frames. ], batch size: 388, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:48:22,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:24,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:48:27,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:30,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.07 vs. limit=22.5 2023-10-03 13:48:31,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:48:32,327 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:48:33,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:48:34,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:34,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:48:36,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 13:48:36,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:48:37,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:48:39,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:48:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:48:46,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:49,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. 
Duration: 0.72 2023-10-03 13:48:50,794 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.876e+02 2.031e+02 2.404e+02 3.900e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-03 13:48:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:52,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 13:48:54,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:48:56,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:48:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:49:01,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:49:01,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:49:03,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:49:04,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:49:05,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:49:07,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 13:49:10,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:49:11,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:49:12,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:49:14,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1288780.0, ans=0.2 2023-10-03 13:49:16,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:49:21,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:49:23,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 13:49:29,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:49:29,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:49:32,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:49:33,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 13:49:36,692 INFO [train.py:1046] (1/4) Epoch 37, batch 2100, loss[loss=0.1476, simple_loss=0.2243, pruned_loss=0.03542, over 23693.00 frames. 
], tot_loss[loss=0.1585, simple_loss=0.2377, pruned_loss=0.03966, over 4712361.13 frames. ], batch size: 135, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:49:36,835 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 13:49:36,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:49:38,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:49:38,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:49:40,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:49:40,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 13:49:40,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 13:49:41,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:49:43,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1288913.3333333333, ans=0.0 2023-10-03 13:49:46,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:49:46,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:49:46,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1288913.3333333333, ans=0.125 2023-10-03 13:49:48,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:49:48,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:49:48,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 13:49:50,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:49:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 13:49:50,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 13:49:51,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:49:51,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:49:51,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 13:49:53,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 13:49:54,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1288980.0, ans=0.125 2023-10-03 13:49:59,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 13:49:59,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:50:01,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:01,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:50:05,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:50:05,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 13:50:05,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:05,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:50:05,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1289046.6666666667, ans=0.125 2023-10-03 13:50:08,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 13:50:08,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:50:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 13:50:08,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 13:50:10,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 13:50:11,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:50:14,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:50:16,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:50:16,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:50:17,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:18,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:18,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 13:50:18,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:18,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 13:50:20,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:20,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 13:50:21,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 13:50:23,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 13:50:23,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1289113.3333333333, ans=0.09899494936611666 2023-10-03 13:50:27,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:50:30,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:50:30,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 13:50:35,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:37,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:50:38,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:50:38,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 13:50:38,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 13:50:39,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:50:41,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:41,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:50:42,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:50:42,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:45,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 13:50:47,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 13:50:47,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:48,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:48,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:50:48,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 13:50:50,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:50:51,383 INFO [train.py:1046] (1/4) Epoch 37, batch 2150, loss[loss=0.1545, simple_loss=0.24, pruned_loss=0.03445, over 24511.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03939, over 4711829.79 frames. ], batch size: 66, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:50:55,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:50:57,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:58,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:00,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:51:00,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:00,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:51:03,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:03,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:51:03,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 13:51:09,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:09,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 13:51:13,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:14,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:51:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:15,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:15,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:17,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:51:17,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:51:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 13:51:19,872 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.856e+02 2.047e+02 2.375e+02 3.292e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 13:51:19,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:51:20,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 13:51:21,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:51:22,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:23,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:24,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:51:25,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:51:26,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:26,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:51:30,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:30,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. 
Duration: 0.89 2023-10-03 13:51:30,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:51:33,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:33,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:34,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:34,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:51:34,957 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:51:36,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:36,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:36,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 13:51:38,377 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:51:39,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 13:51:39,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 13:51:39,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 13:51:39,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:40,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:51:42,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 13:51:43,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:51:43,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 13:51:43,448 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 13:51:43,448 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 13:51:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 13:51:44,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:44,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:51:44,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 13:51:46,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:48,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:51:50,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:50,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:59,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:51:59,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 13:52:03,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1289513.3333333333, ans=0.2 2023-10-03 13:52:04,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:52:06,060 INFO [train.py:1046] (1/4) Epoch 37, batch 2200, loss[loss=0.1412, simple_loss=0.2208, pruned_loss=0.03075, over 24613.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2371, pruned_loss=0.03983, over 4708065.86 frames. ], batch size: 60, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:52:06,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1289580.0, ans=0.5 2023-10-03 13:52:09,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 13:52:09,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:52:10,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:11,632 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.51 vs. limit=5.0 2023-10-03 13:52:12,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:52:13,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:52:14,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:52:14,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 13:52:19,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 13:52:21,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:52:26,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 13:52:29,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:32,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 13:52:32,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:52:34,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1289713.3333333333, ans=0.125 2023-10-03 13:52:35,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:52:35,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 13:52:37,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=12.0 2023-10-03 13:52:40,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:52:41,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:41,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:52:43,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1289713.3333333333, ans=0.125 2023-10-03 13:52:44,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:52:44,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:52:45,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:52:47,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:50,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 13:52:51,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:51,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 13:52:54,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:54,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:52:54,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:57,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:52:57,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:52:57,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:58,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 13:53:00,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:53:02,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 13:53:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:53:05,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:53:07,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:53:09,775 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 13:53:11,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:53:12,376 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 13:53:13,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:53:13,797 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 13:53:16,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:53:16,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. 
Duration: 0.42725 2023-10-03 13:53:19,477 INFO [train.py:1046] (1/4) Epoch 37, batch 2250, loss[loss=0.134, simple_loss=0.2165, pruned_loss=0.02575, over 24416.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.238, pruned_loss=0.03982, over 4710842.38 frames. ], batch size: 58, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 13:53:19,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:53:20,959 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 13:53:24,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:53:26,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:53:29,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:53:31,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:53:35,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:37,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:53:38,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 13:53:38,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1289980.0, ans=0.2 2023-10-03 13:53:39,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 13:53:39,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:53:39,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:53:41,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 13:53:43,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:53:43,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:45,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:53:46,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1289980.0, ans=0.125 2023-10-03 13:53:47,211 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.849e+02 1.952e+02 2.109e+02 2.912e+02, threshold=3.904e+02, percent-clipped=0.0 2023-10-03 13:53:48,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:53:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 13:53:50,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:53:52,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 13:53:53,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:56,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:53:59,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1290046.6666666667, ans=0.0 2023-10-03 13:54:00,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:54:01,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:54:03,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:03,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:54:06,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:54:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:54:11,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1290113.3333333333, ans=0.2 2023-10-03 13:54:12,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:54:15,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:54:17,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.38 vs. limit=22.5 2023-10-03 13:54:18,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:54:19,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:54:19,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:54:25,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:54:28,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:54:28,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 13:54:28,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:28,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 13:54:31,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 13:54:32,564 INFO [train.py:1046] (1/4) Epoch 37, batch 2300, loss[loss=0.1664, simple_loss=0.243, pruned_loss=0.04491, over 23532.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.238, pruned_loss=0.04004, over 4714393.93 frames. ], batch size: 134, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 13:54:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:54:35,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:42,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:44,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:54:47,596 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 13:54:49,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:53,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:54:53,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:54:55,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:54:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:55,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 13:54:55,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:54:57,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:54:58,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1290313.3333333333, ans=0.0 2023-10-03 13:54:59,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:55:01,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:55:04,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:55:07,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:55:10,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1290380.0, ans=0.125 2023-10-03 13:55:10,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1290380.0, ans=0.0 2023-10-03 13:55:13,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 13:55:13,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:55:17,149 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.95 vs. limit=22.5 2023-10-03 13:55:18,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:55:20,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1290446.6666666667, ans=0.1 2023-10-03 13:55:21,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:55:24,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:55:25,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:55:25,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:55:25,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 13:55:29,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:55:29,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:55:30,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:55:30,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:55:30,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:55:31,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 13:55:31,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:55:32,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 13:55:32,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:55:32,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:55:34,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 13:55:38,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:55:41,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:55:47,298 INFO [train.py:1046] (1/4) Epoch 37, batch 2350, loss[loss=0.2261, simple_loss=0.2909, pruned_loss=0.08062, over 19663.00 frames. 
], tot_loss[loss=0.1602, simple_loss=0.2395, pruned_loss=0.04041, over 4714319.08 frames. ], batch size: 388, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:55:47,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:55:48,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:55:48,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:55:50,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:55:50,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:55:50,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:55:52,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 13:55:56,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:55:57,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 13:56:03,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 13:56:05,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 13:56:08,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:08,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:08,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:56:09,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:56:11,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 13:56:14,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:56:17,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 13:56:18,750 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.950e+02 2.084e+02 2.408e+02 3.908e+02, threshold=4.167e+02, percent-clipped=1.0 2023-10-03 13:56:18,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:56:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:56:23,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:56:24,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 13:56:26,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 13:56:27,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:56:28,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:56:28,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:56:28,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:56:33,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:56:35,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 13:56:35,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:56:38,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:38,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:56:39,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 13:56:41,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 13:56:44,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 13:56:44,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:56:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 13:56:51,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 13:56:52,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:56:52,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:56:53,001 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 13:56:54,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 13:56:56,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 13:56:58,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:57:01,900 INFO [train.py:1046] (1/4) Epoch 37, batch 2400, loss[loss=0.1676, simple_loss=0.2516, pruned_loss=0.04181, over 23992.00 frames. ], tot_loss[loss=0.16, simple_loss=0.239, pruned_loss=0.04046, over 4716336.22 frames. 
], batch size: 80, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:57:03,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:57:04,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:57:08,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:57:08,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 13:57:09,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 13:57:15,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.33 vs. limit=22.5 2023-10-03 13:57:16,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 13:57:16,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:57:18,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 13:57:18,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:57:19,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 13:57:20,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 13:57:25,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:29,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 13:57:31,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1291046.6666666667, ans=0.1 2023-10-03 13:57:34,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:57:34,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1291046.6666666667, ans=0.0 2023-10-03 13:57:39,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 13:57:40,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:57:42,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:42,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1291046.6666666667, ans=0.2 2023-10-03 13:57:46,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:57:46,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. 
Duration: 0.73 2023-10-03 13:57:47,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:57:53,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:57:56,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:57:59,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:01,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:58:01,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:58:01,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:58:01,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:58:01,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.97 vs. limit=15.0 2023-10-03 13:58:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:58:02,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 13:58:05,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:58:06,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:58:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 13:58:07,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 13:58:09,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:58:10,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:58:10,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 13:58:11,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 13:58:11,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 13:58:11,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 13:58:13,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 13:58:14,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 13:58:14,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:14,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:58:16,063 INFO [train.py:1046] (1/4) Epoch 37, batch 2450, loss[loss=0.1597, simple_loss=0.2437, pruned_loss=0.0378, over 24492.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2371, pruned_loss=0.0401, over 4700666.16 frames. ], batch size: 66, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:58:16,183 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 13:58:17,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:17,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:58:22,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:58:22,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:58:26,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:26,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:58:26,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. 
Duration: 0.75 2023-10-03 13:58:27,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1291246.6666666667, ans=0.125 2023-10-03 13:58:32,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:58:32,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:35,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1291313.3333333333, ans=0.125 2023-10-03 13:58:37,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:58:37,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:58:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:58:38,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 13:58:40,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1291313.3333333333, ans=0.125 2023-10-03 13:58:41,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:43,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 13:58:44,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:58:44,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1291380.0, ans=0.125 2023-10-03 13:58:47,042 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.840e+02 2.041e+02 2.225e+02 3.110e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-03 13:58:47,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:58:48,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:58:49,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:58:50,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:52,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 13:58:52,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:58:59,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:01,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:59:01,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:02,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:59:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:05,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:59:06,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 13:59:10,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:59:10,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:59:12,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:59:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:17,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:59:17,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 13:59:18,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 13:59:18,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:59:18,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 13:59:18,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:59:20,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:59:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:59:27,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:27,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:59:29,948 INFO [train.py:1046] (1/4) Epoch 37, batch 2500, loss[loss=0.159, simple_loss=0.2411, pruned_loss=0.0385, over 23496.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2359, pruned_loss=0.04, over 4689691.06 frames. ], batch size: 93, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:59:30,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 13:59:31,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:59:36,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 13:59:38,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1291580.0, ans=0.0 2023-10-03 13:59:45,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:59:45,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:46,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:59:46,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 13:59:53,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:59:53,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:59:55,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:59:55,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1291646.6666666667, ans=0.125 2023-10-03 13:59:56,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 13:59:56,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 13:59:58,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 13:59:58,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1291713.3333333333, ans=0.1 2023-10-03 13:59:59,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:59:59,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 14:00:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 14:00:01,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:00:07,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:00:08,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.22 vs. limit=15.0 2023-10-03 14:00:10,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:00:12,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 14:00:12,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:00:13,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:17,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:20,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:24,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:00:30,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:00:30,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1291846.6666666667, ans=0.2 2023-10-03 14:00:30,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=12.0 2023-10-03 14:00:33,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 14:00:33,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:00:33,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:00:35,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:00:35,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 14:00:35,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1291846.6666666667, ans=0.0 2023-10-03 14:00:37,319 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 14:00:37,319 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 14:00:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 14:00:40,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:43,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 14:00:43,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 14:00:43,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1291913.3333333333, ans=0.2 2023-10-03 14:00:44,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.54 vs. limit=15.0 2023-10-03 14:00:44,578 INFO [train.py:1046] (1/4) Epoch 37, batch 2550, loss[loss=0.1278, simple_loss=0.2089, pruned_loss=0.0233, over 24363.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2365, pruned_loss=0.03984, over 4702702.58 frames. ], batch size: 56, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:00:44,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 14:00:44,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 14:00:47,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 14:00:50,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:00:52,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:00:52,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:00:54,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:00:56,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 14:00:56,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:01:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 14:01:01,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:01:03,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:05,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:01:05,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 14:01:05,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1291980.0, ans=0.0 2023-10-03 14:01:06,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:01:06,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:01:06,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:01:10,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:01:10,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 14:01:10,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:01:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:10,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-03 14:01:13,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1292046.6666666667, ans=0.2 2023-10-03 14:01:16,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.844e+02 2.001e+02 2.201e+02 4.188e+02, threshold=4.002e+02, percent-clipped=1.0 2023-10-03 14:01:21,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:01:25,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:01:25,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:26,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:01:26,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:01:34,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:01:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:01:37,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:01:37,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 14:01:37,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 14:01:38,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:01:43,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:01:43,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:49,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:01:49,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 14:01:49,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:01:50,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:51,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:01:52,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:01:54,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:01:56,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1292180.0, ans=0.125 2023-10-03 14:01:59,006 INFO [train.py:1046] (1/4) Epoch 37, batch 2600, loss[loss=0.1587, simple_loss=0.2432, pruned_loss=0.03712, over 24346.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2369, pruned_loss=0.0401, over 4703837.33 frames. ], batch size: 61, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:02:00,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:02:03,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:06,351 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 14:02:07,133 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 14:02:08,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:02:08,455 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 14:02:08,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 14:02:08,541 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 14:02:12,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:02:12,493 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. 
Duration: 0.768 2023-10-03 14:02:14,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 14:02:15,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 14:02:17,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:02:18,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 14:02:18,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 14:02:21,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:02:21,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 14:02:24,038 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 14:02:24,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 14:02:31,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:02:31,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:31,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:02:31,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 14:02:31,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1292380.0, ans=0.0 2023-10-03 14:02:34,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:02:38,478 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 14:02:45,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:46,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:02:46,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 14:02:47,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:02:47,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:02:47,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 14:02:48,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1292446.6666666667, ans=0.125 2023-10-03 14:02:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 14:02:52,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:02:53,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:02:57,542 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 14:02:57,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:02:57,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:02:59,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1292513.3333333333, ans=0.2 2023-10-03 14:03:03,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:03:03,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:03:03,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 14:03:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:03:07,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:03:08,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 14:03:11,790 INFO [train.py:1046] (1/4) Epoch 37, batch 2650, loss[loss=0.1653, simple_loss=0.2474, pruned_loss=0.04157, over 24317.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2373, pruned_loss=0.03957, over 4716978.62 frames. ], batch size: 61, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:03:13,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 14:03:15,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:17,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:03:20,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 14:03:20,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:21,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:03:22,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=1292580.0, ans=0.95 2023-10-03 14:03:23,090 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 14:03:23,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:03:25,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:03:27,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:03:27,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:03:30,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:03:30,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 14:03:30,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:03:30,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1292646.6666666667, ans=0.1 2023-10-03 14:03:32,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:03:34,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 14:03:36,193 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 14:03:38,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:03:40,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 14:03:41,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:03:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 14:03:43,645 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.832e+02 2.042e+02 2.355e+02 4.298e+02, threshold=4.084e+02, percent-clipped=1.0 2023-10-03 14:03:44,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1292713.3333333333, ans=0.09899494936611666 2023-10-03 14:03:45,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:45,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:03:45,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:47,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:03:49,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1292713.3333333333, ans=0.125 2023-10-03 14:03:50,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 14:03:50,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 14:03:53,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:03:56,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-10-03 14:03:56,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:57,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:03:59,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:03:59,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:03:59,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:04:01,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:04:04,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:04:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:04:04,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:04:05,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:04:06,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1292780.0, ans=0.1 2023-10-03 14:04:07,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:04:07,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:04:08,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:12,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:04:12,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:04:15,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:17,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:04:17,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:17,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 14:04:20,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:04:21,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:21,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:24,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:04:25,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:04:25,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:27,236 INFO [train.py:1046] (1/4) Epoch 37, batch 2700, loss[loss=0.1472, simple_loss=0.2292, pruned_loss=0.03261, over 20170.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2386, pruned_loss=0.04024, over 4708553.88 frames. ], batch size: 44, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:04:29,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:04:29,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 14:04:31,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:04:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 14:04:34,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:04:34,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:34,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:36,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 14:04:36,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:37,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:04:37,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:04:37,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 14:04:38,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:04:40,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:04:43,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:04:43,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:46,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:04:47,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 14:04:47,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:04:50,128 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.03 vs. 
limit=15.0 2023-10-03 14:04:52,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:04:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:04:59,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:04:59,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:04:59,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:04:59,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:05:02,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:05,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:05:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:05:05,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:05:09,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:09,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 14:05:10,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1293113.3333333333, ans=0.0 2023-10-03 14:05:15,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:05:15,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:05:20,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:05:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:25,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:26,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:27,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:05:27,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:29,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:29,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 14:05:31,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:05:32,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:05:32,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:05:33,213 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.33 vs. limit=12.0 2023-10-03 14:05:35,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 14:05:36,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:05:40,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 14:05:41,441 INFO [train.py:1046] (1/4) Epoch 37, batch 2750, loss[loss=0.15, simple_loss=0.2329, pruned_loss=0.03351, over 24426.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2376, pruned_loss=0.04023, over 4711305.26 frames. ], batch size: 66, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:05:42,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 14:05:42,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 14:05:44,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:05:44,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:48,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:48,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:05:48,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:48,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1293246.6666666667, ans=0.125 2023-10-03 14:05:52,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:05:53,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:05:54,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:05:54,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:54,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 14:05:54,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 14:05:54,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:56,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1293313.3333333333, ans=0.1 2023-10-03 14:05:56,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1293313.3333333333, ans=0.1 2023-10-03 14:06:00,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 14:06:02,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:06:03,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:03,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:06:03,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:06:05,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:06,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:06:06,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:06:07,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:11,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:06:11,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:06:12,525 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.957e+02 2.205e+02 2.466e+02 3.615e+02, threshold=4.410e+02, percent-clipped=0.0 2023-10-03 14:06:12,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:06:14,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:15,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:06:16,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1293380.0, ans=0.0 2023-10-03 14:06:21,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:23,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:06:23,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:06:27,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:27,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:06:27,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:06:33,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:06:35,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:06:35,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 14:06:39,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:41,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 14:06:44,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 14:06:45,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:06:45,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 14:06:46,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:06:48,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:06:48,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 14:06:48,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:06:54,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 14:06:54,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:06:54,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:06:54,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 14:06:54,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:56,295 INFO [train.py:1046] (1/4) Epoch 37, batch 2800, loss[loss=0.1535, simple_loss=0.2403, pruned_loss=0.03334, over 24653.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2366, pruned_loss=0.03993, over 4711128.64 frames. ], batch size: 65, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:06:56,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:57,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:06:59,098 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 14:06:59,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 14:07:01,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:07:03,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:07:03,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:07:06,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:07:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 14:07:10,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:07:12,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 14:07:12,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:13,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:07:13,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:07:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:07:17,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:17,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:07:19,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:07:24,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1293713.3333333333, ans=0.125 2023-10-03 14:07:27,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:07:28,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:07:30,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:31,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:07:31,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:07:33,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1293713.3333333333, ans=0.125 2023-10-03 14:07:37,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:07:37,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 14:07:39,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:07:40,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:07:40,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:07:43,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:07:43,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:07:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:07:49,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:49,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 14:07:51,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:07:52,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:07:53,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:53,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 14:07:53,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:07:53,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:07:55,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:07:56,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 14:07:58,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:58,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:07:58,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:07:59,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-10-03 14:08:01,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1293846.6666666667, ans=0.125 2023-10-03 14:08:06,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:08:06,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:08:06,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:08:09,382 INFO [train.py:1046] (1/4) Epoch 37, batch 2850, loss[loss=0.1528, simple_loss=0.2446, pruned_loss=0.03049, over 24365.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2363, pruned_loss=0.03936, over 4717367.25 frames. ], batch size: 74, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:08:10,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.64 vs. limit=15.0 2023-10-03 14:08:10,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:08:12,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:08:14,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:14,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:08:17,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:08:17,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1293913.3333333333, ans=0.125 2023-10-03 14:08:18,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:08:19,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:08:21,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 14:08:25,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1293980.0, ans=10.0 2023-10-03 14:08:26,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 14:08:26,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:08:26,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1293980.0, ans=0.0 2023-10-03 14:08:29,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 14:08:29,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:32,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 14:08:32,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-03 14:08:34,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:34,878 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.97 vs. limit=15.0 2023-10-03 14:08:41,022 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.871e+02 2.042e+02 2.334e+02 3.256e+02, threshold=4.084e+02, percent-clipped=0.0 2023-10-03 14:08:45,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:47,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:08:47,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:08:48,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:08:48,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:08:48,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:08:50,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:08:50,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. 
Duration: 0.69 2023-10-03 14:08:52,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1294046.6666666667, ans=0.2 2023-10-03 14:08:53,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:08:53,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:08:53,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:55,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:56,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:57,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:58,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:00,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:09:02,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:09:02,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:04,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:09:07,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:09:12,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:09:13,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 14:09:13,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 14:09:16,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:09:16,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:16,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 14:09:16,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:09:17,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:17,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:09:19,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:09:19,155 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. 
Duration: 0.896 2023-10-03 14:09:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 14:09:19,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:09:19,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:23,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:09:23,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:09:25,119 INFO [train.py:1046] (1/4) Epoch 37, batch 2900, loss[loss=0.1404, simple_loss=0.2198, pruned_loss=0.03054, over 21914.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2371, pruned_loss=0.03953, over 4705234.32 frames. ], batch size: 48, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:09:25,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:09:26,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 14:09:31,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:31,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 14:09:31,896 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.44 vs. 
limit=15.0 2023-10-03 14:09:32,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 14:09:33,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:09:33,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:09:35,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:09:35,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:09:38,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:09:40,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:42,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:09:42,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 14:09:42,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:09:44,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:46,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. 
Duration: 0.99 2023-10-03 14:09:47,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 14:09:48,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:48,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 14:09:48,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:09:52,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:09:52,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:09:55,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:09:56,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:56,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1294380.0, ans=0.125 2023-10-03 14:09:59,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:10:00,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:10:01,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1294380.0, ans=0.125 2023-10-03 14:10:04,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 14:10:04,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 14:10:04,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:10:06,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:10:08,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 14:10:09,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:10:13,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1294446.6666666667, ans=0.1 2023-10-03 14:10:14,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:10:18,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1294446.6666666667, ans=0.1 2023-10-03 14:10:20,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1294446.6666666667, ans=0.125 2023-10-03 14:10:21,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 14:10:21,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:10:24,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 14:10:27,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:27,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 14:10:27,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:10:27,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:10:36,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:10:38,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 14:10:39,303 INFO [train.py:1046] (1/4) Epoch 37, batch 2950, loss[loss=0.1525, simple_loss=0.2352, pruned_loss=0.03491, over 24456.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.03982, over 4708374.62 frames. ], batch size: 63, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:10:39,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:10:39,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:10:40,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:10:40,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1294580.0, ans=0.2 2023-10-03 14:10:42,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:10:43,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 14:10:45,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 14:10:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:10:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:10:49,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:10:50,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:10:54,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:10:54,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:10:58,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 14:11:00,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:11:01,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:11:02,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:11:02,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:11:04,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 14:11:07,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1294713.3333333333, ans=0.1 2023-10-03 14:11:09,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 14:11:10,189 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.983e+02 2.182e+02 2.466e+02 3.781e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 14:11:10,276 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 14:11:11,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:11:12,272 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.64 vs. limit=15.0 2023-10-03 14:11:12,985 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-10-03 14:11:15,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 14:11:16,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:11:17,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:11:17,570 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 14:11:17,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:11:19,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 14:11:20,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:11:20,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:11:23,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:11:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:11:23,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:24,630 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-10-03 14:11:24,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:11:26,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 14:11:32,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:34,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:11:34,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 14:11:35,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:11:36,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 14:11:38,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:11:40,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:11:40,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:11:42,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:42,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-03 14:11:43,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:11:45,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:45,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:11:45,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:11:47,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:11:47,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:11:49,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:49,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 14:11:50,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:52,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:11:53,422 INFO [train.py:1046] (1/4) Epoch 37, batch 3000, loss[loss=0.1663, simple_loss=0.2487, pruned_loss=0.04194, over 23417.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2386, pruned_loss=0.0397, over 4719503.78 frames. 
], batch size: 105, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:11:53,422 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 14:12:01,414 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.6211, 5.4739, 5.1521, 4.8075], device='cuda:1') 2023-10-03 14:12:05,423 INFO [train.py:1078] (1/4) Epoch 37, validation: loss=0.3637, simple_loss=0.2861, pruned_loss=0.2207, over 1125622.00 frames. 2023-10-03 14:12:05,424 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 14:12:05,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:12:05,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=1294913.3333333333, ans=0.1 2023-10-03 14:12:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 14:12:09,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 14:12:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:12:11,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:12:12,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 14:12:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:12:19,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 14:12:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:12:34,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1295046.6666666667, ans=0.125 2023-10-03 14:12:36,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 14:12:38,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:12:39,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:12:40,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:12:40,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:12:42,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:12:42,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 14:12:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 14:12:47,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:12:47,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 14:12:48,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1295046.6666666667, ans=0.0 2023-10-03 14:12:50,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:12:50,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:12:51,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:12:51,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:12:55,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1295113.3333333333, ans=0.125 2023-10-03 14:12:56,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:12:57,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:12:57,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:12:58,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:13:03,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 14:13:03,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 14:13:04,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:04,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:13:09,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:10,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:11,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:13:11,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 14:13:11,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:13:11,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 14:13:11,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:13:13,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. 
Duration: 0.67 2023-10-03 14:13:13,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1295180.0, ans=0.0 2023-10-03 14:13:15,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1295180.0, ans=0.1 2023-10-03 14:13:17,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:13:19,165 INFO [train.py:1046] (1/4) Epoch 37, batch 3050, loss[loss=0.2357, simple_loss=0.2967, pruned_loss=0.08733, over 19803.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.239, pruned_loss=0.03984, over 4723525.02 frames. ], batch size: 388, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:13:19,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:13:19,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 14:13:21,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 14:13:21,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:13:22,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:13:23,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:23,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 14:13:23,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:23,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:13:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 14:13:28,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:13:28,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1295246.6666666667, ans=0.2 2023-10-03 14:13:30,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:30,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:13:33,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:36,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 14:13:43,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 14:13:43,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 14:13:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:13:47,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:13:49,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:49,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:50,504 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.847e+02 2.000e+02 2.223e+02 2.874e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-03 14:13:51,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:13:51,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1295380.0, ans=0.0 2023-10-03 14:13:53,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:13:53,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:13:55,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:13:55,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:55,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:13:56,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:59,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:00,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:14:01,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=1295380.0, ans=0.02 2023-10-03 14:14:02,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 14:14:02,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:14:02,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:14:04,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:14:06,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:14:06,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:14:08,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:12,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:14:14,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:14,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1295446.6666666667, ans=0.125 2023-10-03 14:14:17,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:18,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:14:18,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:14:20,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:14:20,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:14:22,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:14:22,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 14:14:24,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:14:24,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:14:25,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 14:14:26,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1295513.3333333333, ans=0.2 2023-10-03 14:14:27,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:33,153 INFO [train.py:1046] (1/4) Epoch 37, batch 3100, loss[loss=0.1548, simple_loss=0.2444, pruned_loss=0.03256, over 24676.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2391, pruned_loss=0.03991, over 4721060.25 frames. ], batch size: 65, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:14:33,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:35,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:14:36,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:14:37,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 14:14:41,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 14:14:41,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 14:14:42,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 14:14:47,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:14:47,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:48,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 14:14:50,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1295646.6666666667, ans=0.125 2023-10-03 14:14:53,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:57,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 14:15:01,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:15:02,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:02,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:15:02,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:15:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. 
Duration: 0.5 2023-10-03 14:15:06,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1295713.3333333333, ans=0.125 2023-10-03 14:15:07,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:15:07,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 14:15:07,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:15:07,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:15:09,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 14:15:10,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:15:13,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:15:13,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 14:15:14,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 14:15:15,339 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.03 vs. limit=15.0 2023-10-03 14:15:16,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:15:18,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:15:19,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:19,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:19,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:15:21,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:15:21,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:15:21,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1295780.0, ans=0.0 2023-10-03 14:15:22,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:15:24,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:15:24,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:24,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:15:29,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:15:29,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 14:15:32,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:15:33,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 14:15:33,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:33,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:35,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 14:15:47,056 INFO [train.py:1046] (1/4) Epoch 37, batch 3150, loss[loss=0.162, simple_loss=0.2321, pruned_loss=0.04595, over 23746.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2372, pruned_loss=0.03967, over 4718490.03 frames. ], batch size: 179, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:15:47,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 14:15:48,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:15:49,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:52,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:15:52,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:15:53,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 14:15:55,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:15:55,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:15:55,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 14:15:57,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 14:16:00,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 14:16:02,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:16:03,395 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 14:16:03,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 14:16:05,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1295980.0, ans=0.125 2023-10-03 14:16:06,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 14:16:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 14:16:06,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1295980.0, ans=0.2 2023-10-03 14:16:08,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 14:16:08,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:16:08,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:16:08,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:16:10,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 14:16:12,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:16:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:16:14,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:16:14,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1295980.0, ans=0.1 2023-10-03 14:16:15,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:16:18,522 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.944e+02 2.154e+02 2.658e+02 3.587e+02, threshold=4.309e+02, percent-clipped=0.0 2023-10-03 14:16:19,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 14:16:19,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:16:23,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:16:23,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:16:25,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 14:16:26,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 14:16:27,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:16:27,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:16:27,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-03 14:16:29,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:16:29,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:16:30,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:16:30,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:16:31,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 14:16:32,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:16:33,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:34,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:16:34,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:16:36,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 14:16:36,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:16:37,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. 
Duration: 0.9 2023-10-03 14:16:37,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:37,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 14:16:39,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 14:16:39,718 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-10-03 14:16:41,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:16:41,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:16:42,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 14:16:43,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 14:16:43,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:16:46,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:16:46,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1296180.0, ans=0.125 2023-10-03 14:16:48,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:16:48,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:16:50,876 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=15.0 2023-10-03 14:16:54,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:16:55,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:58,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 14:17:01,203 INFO [train.py:1046] (1/4) Epoch 37, batch 3200, loss[loss=0.1725, simple_loss=0.2573, pruned_loss=0.04387, over 24632.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2363, pruned_loss=0.03964, over 4714309.71 frames. ], batch size: 68, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:17:02,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:17:02,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 14:17:06,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:17:07,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:17:07,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. 
Duration: 0.85 2023-10-03 14:17:10,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:17:12,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1296246.6666666667, ans=0.0 2023-10-03 14:17:17,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:17:17,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1296313.3333333333, ans=0.0 2023-10-03 14:17:18,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:17:28,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:17:35,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 14:17:36,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:17:38,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 14:17:39,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:17:44,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:17:44,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 14:17:44,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:17:47,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 14:17:50,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 14:17:52,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 14:17:55,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 14:17:56,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:18:02,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:02,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:18:02,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:03,472 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 14:18:03,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:18:07,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:18:07,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1296513.3333333333, ans=0.0 2023-10-03 14:18:08,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 14:18:09,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 14:18:10,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 14:18:12,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 14:18:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:18:14,913 INFO [train.py:1046] (1/4) Epoch 37, batch 3250, loss[loss=0.1721, simple_loss=0.258, pruned_loss=0.0431, over 24707.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2368, pruned_loss=0.03979, over 4714889.34 frames. ], batch size: 73, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:18:16,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:18:16,946 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 14:18:16,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:18:16,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:18,411 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. 
Duration: 0.64 2023-10-03 14:18:19,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=9.49 vs. limit=15.0 2023-10-03 14:18:20,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1296580.0, ans=10.0 2023-10-03 14:18:21,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1296580.0, ans=0.125 2023-10-03 14:18:22,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:18:23,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1296580.0, ans=0.0 2023-10-03 14:18:24,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:18:24,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1296580.0, ans=0.2 2023-10-03 14:18:34,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:18:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 14:18:34,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:35,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:35,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:18:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:18:37,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:18:39,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.58 vs. limit=15.0 2023-10-03 14:18:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:39,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:18:39,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:41,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:41,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:41,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:18:43,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:18:44,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 14:18:45,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:45,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:47,720 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.697e+02 1.879e+02 2.099e+02 2.263e+02 3.172e+02, threshold=4.197e+02, percent-clipped=0.0 2023-10-03 14:18:49,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:49,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:18:49,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:18:55,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 14:18:55,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.33 vs. limit=15.0 2023-10-03 14:18:56,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:18:56,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:18:58,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:18:59,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:19:03,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:19:11,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:19:12,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:12,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 14:19:12,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:19:12,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:19:13,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:15,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 14:19:16,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 14:19:16,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:19:18,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:19:19,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:19:19,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 14:19:21,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:19:23,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:19:24,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:19:25,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 14:19:25,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:26,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:19:26,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 14:19:29,185 INFO [train.py:1046] (1/4) Epoch 37, batch 3300, loss[loss=0.1494, simple_loss=0.2362, pruned_loss=0.03135, over 24496.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2378, pruned_loss=0.0398, over 4719946.30 frames. ], batch size: 66, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:19:29,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 14:19:29,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 14:19:31,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.82 vs. limit=15.0 2023-10-03 14:19:32,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 14:19:32,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1296913.3333333333, ans=0.0 2023-10-03 14:19:33,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 14:19:33,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:19:36,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:19:37,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:19:37,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:39,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-03 14:19:39,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1296913.3333333333, ans=0.125 2023-10-03 14:19:39,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1296913.3333333333, ans=0.0 2023-10-03 14:19:40,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:19:43,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:44,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:19:47,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 14:19:48,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:19:48,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:50,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:52,659 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 14:19:52,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:19:52,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 14:19:54,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:19:54,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:19:54,712 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 14:19:56,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1296980.0, ans=0.125 2023-10-03 14:19:57,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1297046.6666666667, ans=0.125 2023-10-03 14:19:58,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:20:00,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:20:01,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:01,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 14:20:03,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 14:20:03,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 14:20:07,118 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 14:20:08,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 14:20:08,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:20:08,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1297046.6666666667, ans=0.125 2023-10-03 14:20:12,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 14:20:14,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:20:17,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:20:17,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:20:21,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:20:21,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:20:21,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:20:21,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 14:20:23,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:20:23,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:24,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:20:26,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 14:20:26,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 14:20:28,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:20:30,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:20:30,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:31,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:20:31,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:31,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:20:33,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:20:34,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:20:35,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:35,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:20:40,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 14:20:40,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:41,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:41,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:20:42,810 INFO [train.py:1046] (1/4) Epoch 37, batch 3350, loss[loss=0.1599, simple_loss=0.2294, pruned_loss=0.04523, over 23770.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.03984, over 4726862.08 frames. ], batch size: 195, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:20:42,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:20:42,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:20:45,029 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.99 vs. 
limit=15.0 2023-10-03 14:20:45,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:45,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:50,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:20:51,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:53,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:20:56,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:58,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:21:00,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:21:01,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:21:03,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 14:21:03,198 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 14:21:04,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:21:07,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 14:21:07,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 14:21:08,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:21:08,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:21:08,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:08,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 14:21:10,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:10,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:21:11,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:14,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:14,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:14,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:21:15,430 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.884e+02 2.039e+02 2.303e+02 3.240e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 14:21:18,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:21,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:21,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:26,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:21:26,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:28,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:28,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:33,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1297446.6666666667, ans=0.0 2023-10-03 14:21:35,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-10-03 14:21:35,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:21:35,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 14:21:37,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:21:38,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 14:21:38,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:39,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:44,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:44,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 14:21:45,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:21:45,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:21:47,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:21:52,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:21:55,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 14:21:55,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:21:56,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:21:57,598 INFO [train.py:1046] (1/4) Epoch 37, batch 3400, loss[loss=0.158, simple_loss=0.2289, pruned_loss=0.04358, over 23709.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2396, pruned_loss=0.04044, over 4721196.77 frames. ], batch size: 179, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:21:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:59,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 14:22:01,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:22:01,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 14:22:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:22:02,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:22:04,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 14:22:05,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:22:05,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 14:22:09,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 14:22:09,763 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 14:22:09,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:14,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:22:14,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1297646.6666666667, ans=0.2 2023-10-03 14:22:15,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:22:15,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:17,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:22:18,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1297646.6666666667, ans=0.2 2023-10-03 14:22:20,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:22:24,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 14:22:27,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1297713.3333333333, ans=0.0 2023-10-03 14:22:28,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:22:31,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:31,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:22:32,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 14:22:37,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1297713.3333333333, ans=0.2 2023-10-03 14:22:38,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:22:41,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 14:22:46,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:48,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:48,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-03 14:22:48,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:22:48,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:22:49,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:22:50,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:22:51,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1297780.0, ans=0.125 2023-10-03 14:22:52,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:55,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:22:55,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:23:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:23:02,808 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:23:03,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 14:23:09,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 14:23:12,538 INFO [train.py:1046] (1/4) Epoch 37, batch 3450, loss[loss=0.1615, simple_loss=0.2447, pruned_loss=0.03912, over 24657.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.04, over 4720556.69 frames. ], batch size: 68, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:23:14,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 14:23:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 14:23:16,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:23:18,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:23:18,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 14:23:19,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:23:22,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:23:26,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1297980.0, ans=0.125 2023-10-03 14:23:27,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:23:29,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:23:31,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:23:31,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:23:32,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:23:34,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=15.0 2023-10-03 14:23:38,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 14:23:45,258 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.909e+02 2.144e+02 2.320e+02 3.378e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 14:23:45,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 14:23:45,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:23:45,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:23:45,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:23:52,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 14:23:52,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 14:23:55,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1298113.3333333333, ans=0.0 2023-10-03 14:23:56,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:23:56,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:23:58,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:23:59,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:24:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 14:24:00,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:24:01,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:24:03,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:24:06,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1298113.3333333333, ans=0.0 2023-10-03 14:24:07,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 14:24:09,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 14:24:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:24:14,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:16,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:21,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:21,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:24:21,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:24:22,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1298180.0, ans=0.1 2023-10-03 14:24:23,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:24:26,301 INFO [train.py:1046] (1/4) Epoch 37, batch 3500, loss[loss=0.1598, simple_loss=0.2538, pruned_loss=0.03292, over 24667.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.237, pruned_loss=0.0399, over 4697638.37 frames. ], batch size: 73, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:24:26,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:31,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 14:24:31,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 14:24:33,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:24:35,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:24:37,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1298246.6666666667, ans=0.125 2023-10-03 14:24:38,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:38,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 14:24:43,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:24:44,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:24:44,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:24:44,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:24:45,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:24:47,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:24:47,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:24:47,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 14:24:51,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:51,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:24:52,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.38 vs. limit=15.0 2023-10-03 14:24:53,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:24:56,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:58,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 14:24:59,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:25:01,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:25:02,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:25:03,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:25:06,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:25:06,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:25:07,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1298380.0, ans=0.0 2023-10-03 14:25:08,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 14:25:10,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 14:25:10,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 14:25:10,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:25:11,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:11,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:25:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:25:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:25:14,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 14:25:18,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:25:20,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 14:25:20,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 14:25:20,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:25:24,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:25:24,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:25:26,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:28,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 14:25:30,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:25:31,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:25:31,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 14:25:35,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. 
Duration: 0.45 2023-10-03 14:25:37,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:37,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:25:37,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:25:37,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:25:41,970 INFO [train.py:1046] (1/4) Epoch 37, batch 3550, loss[loss=0.1611, simple_loss=0.2516, pruned_loss=0.03534, over 24659.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.235, pruned_loss=0.03933, over 4687533.35 frames. ], batch size: 68, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:25:42,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:25:47,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:25:48,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1298580.0, ans=0.125 2023-10-03 14:25:49,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 14:25:53,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:25:53,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 14:25:56,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:25:58,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:25:58,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:26:01,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:26:01,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:26:02,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:26:02,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:26:04,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:26:09,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:26:09,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:26:11,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:26:11,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 14:26:11,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:26:11,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 14:26:12,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1298713.3333333333, ans=0.1 2023-10-03 14:26:13,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:14,584 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.883e+02 2.070e+02 2.275e+02 3.078e+02, threshold=4.140e+02, percent-clipped=0.0 2023-10-03 14:26:14,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:14,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:26:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:26:20,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:26:21,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:26:22,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1298713.3333333333, ans=0.2 2023-10-03 14:26:23,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 14:26:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:26:26,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 14:26:26,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:26:28,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:26:29,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:26:31,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 14:26:33,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:26:39,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:26:40,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 14:26:40,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:26:44,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:44,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 14:26:52,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 14:26:52,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:26:52,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:26:55,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:55,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1298913.3333333333, ans=0.125 2023-10-03 14:26:56,230 INFO [train.py:1046] (1/4) Epoch 37, batch 3600, loss[loss=0.1829, simple_loss=0.243, pruned_loss=0.06146, over 19341.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2357, pruned_loss=0.03929, over 4686272.20 frames. ], batch size: 388, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:26:56,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:56,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:27:01,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 14:27:04,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:05,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:27:07,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:27:07,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:07,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 14:27:13,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:27:14,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:17,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:27:18,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:27:20,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:27:21,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:27:21,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. 
Duration: 0.69 2023-10-03 14:27:22,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:27:24,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:24,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:27:25,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:27,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:27:27,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:27:28,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1299046.6666666667, ans=0.0 2023-10-03 14:27:29,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 14:27:37,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:27:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:27:39,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 14:27:43,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 14:27:46,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:48,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:54,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:27:54,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:27:54,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 14:27:56,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 14:27:56,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1299180.0, ans=0.125 2023-10-03 14:27:57,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 14:27:59,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:28:00,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:28:00,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 14:28:00,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:28:02,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:28:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:28:02,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 14:28:04,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 14:28:05,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:28:05,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 14:28:06,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.91 vs. limit=15.0 2023-10-03 14:28:10,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 14:28:11,744 INFO [train.py:1046] (1/4) Epoch 37, batch 3650, loss[loss=0.154, simple_loss=0.2381, pruned_loss=0.03496, over 24459.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2364, pruned_loss=0.03945, over 4698973.03 frames. ], batch size: 66, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:28:11,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:28:14,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. 
Duration: 0.61 2023-10-03 14:28:15,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 14:28:18,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:28:18,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:28:18,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:28:21,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:28:22,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:28:22,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 14:28:24,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:28:26,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:26,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 14:28:27,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:28:29,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:28:29,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:28:30,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:28:33,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 14:28:35,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 14:28:37,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:28:40,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 14:28:40,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:28:40,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:28:44,298 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.976e+02 2.171e+02 2.481e+02 4.276e+02, threshold=4.341e+02, percent-clipped=1.0 2023-10-03 14:28:47,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:28:48,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:28:48,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 14:28:50,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:28:51,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:28:54,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:28:55,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:55,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1299446.6666666667, ans=0.125 2023-10-03 14:28:56,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:28:56,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:28:58,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:29:00,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:29:01,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:10,343 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. 
Duration: 0.896 2023-10-03 14:29:12,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1299513.3333333333, ans=0.125 2023-10-03 14:29:15,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:29:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:16,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:29:16,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:17,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:29:19,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:20,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 14:29:20,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:23,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:29:24,817 INFO [train.py:1046] (1/4) Epoch 37, batch 3700, loss[loss=0.1529, simple_loss=0.2355, pruned_loss=0.03515, over 23626.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2374, pruned_loss=0.03921, over 4720120.07 frames. 
], batch size: 120, lr: 2.74e-03, grad_scale: 32.0 2023-10-03 14:29:26,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:29:27,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:29:30,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:30,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 14:29:30,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:31,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:29:33,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:29:34,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:29:39,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:29:40,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:29:40,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 14:29:42,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:42,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:29:45,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:29:47,332 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 14:29:53,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:29:54,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:29:55,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:29:55,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 14:29:55,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:29:58,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:59,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 14:29:59,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 14:30:01,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:30:01,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1299713.3333333333, ans=0.125 2023-10-03 14:30:02,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:04,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:30:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:30:10,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:30:10,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 14:30:12,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:30:12,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 14:30:18,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:30:18,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 14:30:18,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1299780.0, ans=0.125 2023-10-03 14:30:19,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:30:19,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1299780.0, ans=0.1 2023-10-03 14:30:21,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 14:30:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:30:22,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:30:23,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:30:23,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:30:26,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:30:28,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 14:30:29,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 14:30:30,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 14:30:30,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:32,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:30:33,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:30:35,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:36,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:30:37,881 INFO [train.py:1046] (1/4) Epoch 37, batch 3750, loss[loss=0.1665, simple_loss=0.256, pruned_loss=0.03849, over 24651.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2388, pruned_loss=0.0393, over 4716807.93 frames. ], batch size: 73, lr: 2.74e-03, grad_scale: 32.0 2023-10-03 14:30:37,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:30:39,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 14:30:41,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 14:30:44,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:30:44,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. 
Duration: 0.78 2023-10-03 14:30:46,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:30:46,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:48,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:49,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:30:52,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:30:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:30:57,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:31:00,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:31:02,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:31:03,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 14:31:03,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:31:04,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:31:04,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:31:07,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 14:31:10,851 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.862e+02 2.050e+02 2.343e+02 3.351e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-03 14:31:11,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 14:31:12,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:31:12,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:31:14,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:31:20,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:31:20,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:31:25,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 14:31:28,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:31:31,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 14:31:31,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:31:35,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:31:38,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:31:40,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:31:43,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:31:44,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:31:47,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:31:51,818 INFO [train.py:1046] (1/4) Epoch 37, batch 3800, loss[loss=0.1579, simple_loss=0.2451, pruned_loss=0.03539, over 24348.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.239, pruned_loss=0.0394, over 4715791.61 frames. ], batch size: 74, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:31:52,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1300246.6666666667, ans=0.0 2023-10-03 14:31:54,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 14:31:57,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1300246.6666666667, ans=0.125 2023-10-03 14:31:58,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:00,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:32:00,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 14:32:02,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:32:04,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:04,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:32:07,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 14:32:07,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:08,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 14:32:11,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1300313.3333333333, ans=0.1 2023-10-03 14:32:12,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:32:12,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:32:12,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:13,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 14:32:16,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 14:32:17,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:32:20,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:23,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:32:23,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:32:25,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:32:25,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:32:26,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:27,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:33,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:32:33,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 14:32:34,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:32:39,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:32:44,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:32:47,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 14:32:48,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 14:32:48,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:50,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:32:50,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:32:50,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1300513.3333333333, ans=0.2 2023-10-03 14:32:52,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.65 vs. limit=10.0 2023-10-03 14:32:53,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 14:32:55,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 14:32:55,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 14:32:55,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:33:03,883 INFO [train.py:1046] (1/4) Epoch 37, batch 3850, loss[loss=0.1402, simple_loss=0.2251, pruned_loss=0.02763, over 24610.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2381, pruned_loss=0.03919, over 4713578.73 frames. ], batch size: 60, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:33:03,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:33:04,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 14:33:04,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1300580.0, ans=0.2 2023-10-03 14:33:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:33:10,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 14:33:10,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:33:11,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:33:14,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1300580.0, ans=0.1 2023-10-03 14:33:16,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:33:16,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:33:19,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:33:19,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-03 14:33:19,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1300646.6666666667, ans=0.125 2023-10-03 14:33:19,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1300646.6666666667, ans=0.1 2023-10-03 14:33:23,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:26,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:33:29,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:33:29,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:33:32,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:32,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:33:33,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:33:33,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:33:33,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:33:35,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:33:36,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:36,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:33:38,010 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.860e+02 2.056e+02 2.272e+02 4.240e+02, threshold=4.112e+02, percent-clipped=1.0 2023-10-03 14:33:38,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 14:33:38,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 14:33:39,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:33:39,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:40,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1300713.3333333333, ans=0.125 2023-10-03 14:33:41,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:42,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:42,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-10-03 14:33:44,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 14:33:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:47,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 14:33:47,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1300780.0, ans=0.5 2023-10-03 14:33:49,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:33:53,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:53,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:58,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 14:33:58,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1300780.0, ans=0.0 2023-10-03 14:34:01,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 14:34:02,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:34:02,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:06,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:34:06,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:34:08,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:08,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:09,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:34:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 14:34:09,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:34:11,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 14:34:11,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:11,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:15,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 14:34:15,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:16,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:34:18,208 INFO [train.py:1046] (1/4) Epoch 37, batch 3900, loss[loss=0.1516, simple_loss=0.2335, pruned_loss=0.03482, over 24480.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03914, over 4698962.93 frames. ], batch size: 63, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:34:18,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:18,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:34:19,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:34:19,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 14:34:20,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:23,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:34:26,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:34:26,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 14:34:27,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:34:30,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:34:30,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:31,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:34:33,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 14:34:33,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:34:34,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 14:34:34,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:36,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 14:34:39,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 14:34:43,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:34:43,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:34:43,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:34:43,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:34:50,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:34:50,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1301046.6666666667, ans=0.0 2023-10-03 14:34:51,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:34:54,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:34:55,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:34:55,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:35:02,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:35:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:35:09,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 14:35:10,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:35:21,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:35:23,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:35:24,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 14:35:24,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 14:35:26,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:35:26,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 14:35:27,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:35:28,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 14:35:30,268 INFO [train.py:1046] (1/4) Epoch 37, batch 3950, loss[loss=0.1643, simple_loss=0.2327, pruned_loss=0.04798, over 22886.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2373, pruned_loss=0.03884, over 4710184.87 frames. ], batch size: 322, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:35:35,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:35:37,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 14:35:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:35:38,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1301246.6666666667, ans=0.0 2023-10-03 14:35:40,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:35:41,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:35:46,870 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 14:35:47,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=12.0 2023-10-03 14:35:48,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:35:48,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 14:35:48,290 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 14:35:50,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:35:52,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:35:52,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:35:52,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:35:54,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 14:35:57,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:35:59,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:35:59,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:35:59,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:36:00,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:36:05,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1301380.0, ans=22.5 2023-10-03 14:36:05,919 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.893e+02 2.072e+02 2.243e+02 3.144e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-03 14:36:11,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:36:11,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:36:11,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1301380.0, ans=0.1 2023-10-03 14:36:18,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 14:36:23,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.64 vs. limit=22.5 2023-10-03 14:36:24,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 14:36:24,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 14:36:24,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:36:25,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:36:28,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-10-03 14:36:34,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:36:34,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:36:34,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 14:36:35,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:36:35,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 14:36:39,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:36:39,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:36:41,518 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:36:43,898 INFO [train.py:1046] (1/4) Epoch 37, batch 4000, loss[loss=0.1603, simple_loss=0.2446, pruned_loss=0.03804, over 23285.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2375, pruned_loss=0.03896, over 4719642.23 frames. ], batch size: 105, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:36:44,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 14:36:53,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:36:59,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:37:03,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:03,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:37:05,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:37:05,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 14:37:06,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:37:07,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1301646.6666666667, ans=0.04949747468305833 2023-10-03 14:37:08,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 14:37:08,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:37:08,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 14:37:09,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:10,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:37:12,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:37:12,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:37:12,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 14:37:12,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:37:13,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:37:13,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1301713.3333333333, ans=0.125 2023-10-03 14:37:15,058 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 14:37:16,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:37:18,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:22,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 14:37:22,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:37:22,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:37:28,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 14:37:28,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:37:32,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 14:37:33,432 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 14:37:34,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:37:34,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 14:37:34,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:37:34,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:36,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:37:37,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:37:37,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:37:37,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:37:37,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1301780.0, ans=0.0 2023-10-03 14:37:39,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 14:37:40,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:37:40,532 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 14:37:43,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1301846.6666666667, ans=0.0 2023-10-03 14:37:45,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:37:49,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 14:37:50,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:37:52,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:52,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:37:53,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:37:54,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.92 vs. limit=12.0 2023-10-03 14:37:58,134 INFO [train.py:1046] (1/4) Epoch 37, batch 4050, loss[loss=0.1605, simple_loss=0.2512, pruned_loss=0.03491, over 24647.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03942, over 4721798.30 frames. ], batch size: 73, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:37:58,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:37:59,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:38:01,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 14:38:02,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:38:02,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:05,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:38:05,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:38:06,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:38:06,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1301913.3333333333, ans=0.0 2023-10-03 14:38:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:38:12,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:38:12,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:38:15,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 14:38:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:38:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:38:20,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1301980.0, ans=0.2 2023-10-03 14:38:21,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:38:24,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 14:38:27,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 14:38:27,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 14:38:30,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:38:33,274 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.902e+02 2.050e+02 2.347e+02 3.332e+02, threshold=4.101e+02, percent-clipped=0.0 2023-10-03 14:38:36,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 14:38:36,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1302046.6666666667, ans=0.0 2023-10-03 14:38:37,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:38:40,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:43,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:38:43,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:38:43,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:46,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1302113.3333333333, ans=0.125 2023-10-03 14:38:47,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:38:49,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 14:38:49,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:38:51,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:38:54,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 14:38:54,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1302113.3333333333, ans=0.2 2023-10-03 14:38:58,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:39:03,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1302180.0, ans=0.125 2023-10-03 14:39:04,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 14:39:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:39:06,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:39:07,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 14:39:07,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 14:39:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:11,537 INFO [train.py:1046] (1/4) Epoch 37, batch 4100, loss[loss=0.157, simple_loss=0.2496, pruned_loss=0.03214, over 24567.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2389, pruned_loss=0.03978, over 4722367.29 frames. ], batch size: 71, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:39:11,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:39:11,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:11,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 14:39:18,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 14:39:20,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 14:39:24,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 14:39:24,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 14:39:24,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:24,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:24,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1302246.6666666667, ans=0.125 2023-10-03 14:39:25,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:25,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:39:27,440 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 14:39:27,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1302313.3333333333, ans=0.125 2023-10-03 14:39:30,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:39:30,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:39:30,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:31,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:39:34,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1302313.3333333333, ans=0.125 2023-10-03 14:39:37,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:39:38,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:39:38,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:39:38,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 14:39:38,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:38,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:39:38,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:39:40,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 14:39:40,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 14:39:43,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:39:44,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 14:39:45,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:39:48,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:39:48,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 14:39:48,601 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:39:49,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:39:49,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:39:49,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:39:51,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 14:39:53,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 14:39:53,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:39:56,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 14:39:57,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:58,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:40:01,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:40:05,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:07,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:40:07,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:40:15,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:16,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.16 vs. limit=15.0 2023-10-03 14:40:16,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:40:20,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:40:20,334 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:40:22,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.12 vs. limit=15.0 2023-10-03 14:40:24,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:40:26,402 INFO [train.py:1046] (1/4) Epoch 37, batch 4150, loss[loss=0.1645, simple_loss=0.2541, pruned_loss=0.03746, over 24359.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2392, pruned_loss=0.03968, over 4729788.73 frames. ], batch size: 77, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:40:29,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:40:30,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:40:32,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:40:32,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:40:34,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 14:40:34,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:40:34,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 14:40:36,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 14:40:36,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 14:40:38,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:39,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1302646.6666666667, ans=0.1 2023-10-03 14:40:42,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:40:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:45,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:40:46,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:40:46,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:40:49,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:40:49,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 14:40:52,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:40:54,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:54,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1302713.3333333333, ans=0.125 2023-10-03 14:40:59,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:40:59,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 14:41:02,043 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.895e+02 2.138e+02 2.418e+02 3.497e+02, threshold=4.277e+02, percent-clipped=0.0 2023-10-03 14:41:02,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 14:41:02,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:41:03,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 14:41:03,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:41:03,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:41:05,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:41:06,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:41:10,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 14:41:12,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1302780.0, ans=0.125 2023-10-03 14:41:13,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:41:15,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:41:15,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 14:41:16,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:41:18,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 14:41:21,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:41:21,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:41:23,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:24,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. 
Duration: 0.74 2023-10-03 14:41:24,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:41:24,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:41:25,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:41:27,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 14:41:27,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:27,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:41:27,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:41:27,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 14:41:29,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:41:29,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:41:30,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:41:32,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:41:32,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-10-03 14:41:33,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 14:41:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:41:39,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:41:40,825 INFO [train.py:1046] (1/4) Epoch 37, batch 4200, loss[loss=0.1388, simple_loss=0.2149, pruned_loss=0.0314, over 24251.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2377, pruned_loss=0.03934, over 4725900.25 frames. ], batch size: 56, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:41:40,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 14:41:42,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:41:45,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:41:45,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:41:46,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:41:46,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 14:41:49,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 14:41:49,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1302913.3333333333, ans=0.0 2023-10-03 14:41:52,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 14:41:54,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:41:56,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:41:59,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:42:03,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:42:04,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:04,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:05,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 14:42:05,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:42:07,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:42:07,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:42:08,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:42:08,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:42:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 14:42:11,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:16,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:42:17,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:42:19,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:42:20,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:42:22,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:42:22,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 14:42:23,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:42:25,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:42:30,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:42:30,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1303113.3333333333, ans=0.1 2023-10-03 14:42:31,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:37,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:42:37,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1303113.3333333333, ans=0.125 2023-10-03 14:42:38,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 14:42:41,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:42:47,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:42:47,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:42:50,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 14:42:54,556 INFO [train.py:1046] (1/4) Epoch 37, batch 4250, loss[loss=0.1554, simple_loss=0.2461, pruned_loss=0.03238, over 24639.00 frames. 
], tot_loss[loss=0.1575, simple_loss=0.237, pruned_loss=0.03898, over 4730513.28 frames. ], batch size: 68, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:42:54,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:42:57,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:57,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:43:02,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:06,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:43:06,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 14:43:08,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:43:10,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:13,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:43:15,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1303313.3333333333, ans=0.125 2023-10-03 14:43:17,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:43:17,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:20,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:43:20,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:43:21,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:21,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:23,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:25,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:43:25,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:43:25,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1303380.0, ans=0.0 2023-10-03 14:43:26,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 14:43:29,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 14:43:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:43:31,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:43:31,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:32,485 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.912e+02 2.052e+02 2.334e+02 3.222e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-03 14:43:32,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:43:32,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:32,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:34,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1303380.0, ans=0.5 2023-10-03 14:43:35,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:43:36,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:43:40,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:43:42,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:43:42,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. 
Duration: 0.65 2023-10-03 14:43:44,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:43:44,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 14:43:44,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1303446.6666666667, ans=0.125 2023-10-03 14:43:46,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:43:47,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:43:47,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1303446.6666666667, ans=0.2 2023-10-03 14:43:48,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:48,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:43:49,047 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:43:52,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 14:43:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:43:54,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 14:43:59,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:59,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:44:01,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:44:02,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:44:03,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:44:03,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:44:05,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:44:05,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 14:44:06,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:44:09,543 INFO [train.py:1046] (1/4) Epoch 37, batch 4300, loss[loss=0.1654, simple_loss=0.2441, pruned_loss=0.04336, over 23359.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2367, pruned_loss=0.03907, over 4730285.62 frames. ], batch size: 119, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:44:11,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:44:11,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:44:16,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:44:23,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:44:23,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 14:44:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:44:26,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:44:28,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:44:28,242 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 14:44:29,060 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.53 vs. limit=15.0 2023-10-03 14:44:29,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:44:31,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:44:32,145 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.26 vs. 
limit=15.0 2023-10-03 14:44:33,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 14:44:33,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:44:35,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 14:44:37,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:44:38,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1303713.3333333333, ans=0.125 2023-10-03 14:44:39,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:44:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:44:41,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:44:43,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:44:44,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:44:46,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:44:46,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. 
Duration: 0.82 2023-10-03 14:44:46,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1303713.3333333333, ans=0.125 2023-10-03 14:44:48,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 14:44:51,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:44:51,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1303713.3333333333, ans=0.125 2023-10-03 14:44:53,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:44:53,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:44:53,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:44:54,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:44:54,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 14:44:54,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 14:44:54,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 14:44:57,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:44:58,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 14:44:58,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 14:45:00,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:45:02,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 14:45:03,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:45:04,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1303780.0, ans=15.0 2023-10-03 14:45:05,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1303780.0, ans=0.125 2023-10-03 14:45:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:06,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:45:07,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 14:45:09,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:45:09,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:45:09,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:45:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:45:09,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:45:12,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:45:13,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:14,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:45:15,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:45:18,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1303846.6666666667, ans=0.07 2023-10-03 14:45:20,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1303846.6666666667, ans=0.2 2023-10-03 14:45:21,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 14:45:22,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:45:24,273 INFO [train.py:1046] (1/4) Epoch 37, batch 4350, loss[loss=0.1733, simple_loss=0.2544, pruned_loss=0.04615, over 23560.00 frames. 
], tot_loss[loss=0.1586, simple_loss=0.2378, pruned_loss=0.03971, over 4717473.47 frames. ], batch size: 106, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:45:26,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:45:29,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:32,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:45:32,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:45:32,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1303913.3333333333, ans=0.125 2023-10-03 14:45:37,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:45:40,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:45:43,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:45:43,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1303980.0, ans=0.0 2023-10-03 14:45:46,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 14:45:49,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:45:51,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:45:57,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 14:45:57,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:45:57,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:02,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1304046.6666666667, ans=0.0 2023-10-03 14:46:03,753 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.872e+02 2.033e+02 2.293e+02 3.578e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-03 14:46:03,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:07,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 14:46:08,842 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.79 vs. limit=15.0 2023-10-03 14:46:10,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:46:11,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:46:14,759 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 14:46:16,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:46:16,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:46:17,590 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 14:46:18,894 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 14:46:18,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:46:18,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:46:20,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:46:21,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:46:21,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:46:22,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:46:25,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 14:46:25,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:25,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:25,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:26,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 14:46:27,909 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 14:46:27,921 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 14:46:27,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 14:46:31,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:46:31,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:46:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:46:31,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 14:46:34,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 14:46:35,837 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 14:46:35,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:38,486 INFO [train.py:1046] (1/4) Epoch 37, batch 4400, loss[loss=0.1642, simple_loss=0.2466, pruned_loss=0.04088, over 24470.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2387, pruned_loss=0.03996, over 4703781.49 frames. ], batch size: 69, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:46:38,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:46:38,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:40,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.08 vs. limit=15.0 2023-10-03 14:46:41,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:44,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 14:46:44,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 14:46:44,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. 
Duration: 0.8 2023-10-03 14:46:44,108 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 14:46:45,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:46:45,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:46:48,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 14:46:48,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1304246.6666666667, ans=0.2 2023-10-03 14:46:50,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:52,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:46:52,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 14:46:56,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:46:56,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 14:46:56,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 14:46:58,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 14:47:00,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. 
Duration: 0.99 2023-10-03 14:47:00,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 14:47:00,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:02,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:47:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:47:04,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:47:06,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 14:47:06,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 14:47:08,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:47:09,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:47:09,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:47:10,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:10,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:47:10,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 14:47:10,997 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 14:47:15,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:15,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1304380.0, ans=0.0 2023-10-03 14:47:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:47:22,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 14:47:25,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:47:29,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:47:32,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:47:32,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 14:47:32,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:47:32,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 14:47:32,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:47:32,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:47:32,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1304446.6666666667, ans=0.0 2023-10-03 14:47:36,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 14:47:39,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 14:47:40,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 14:47:40,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:47:41,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 14:47:42,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:47:45,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:47:46,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. 
Duration: 0.9 2023-10-03 14:47:49,576 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:47:50,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:47:51,869 INFO [train.py:1046] (1/4) Epoch 37, batch 4450, loss[loss=0.1584, simple_loss=0.2303, pruned_loss=0.0432, over 23356.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2391, pruned_loss=0.0402, over 4700517.79 frames. ], batch size: 285, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:47:53,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:55,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:47:56,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1304580.0, ans=0.125 2023-10-03 14:48:01,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:01,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:48:04,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:04,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1304580.0, ans=0.125 2023-10-03 14:48:06,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 14:48:07,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:48:07,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:48:10,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 14:48:10,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:48:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:11,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:48:11,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:48:13,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:48:15,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.87 vs. limit=22.5 2023-10-03 14:48:17,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:18,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:48:18,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:48:20,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:48:21,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:48:24,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1304713.3333333333, ans=0.07 2023-10-03 14:48:26,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:48:27,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 14:48:27,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 14:48:27,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:48:32,541 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.931e+02 2.177e+02 2.666e+02 4.430e+02, threshold=4.354e+02, percent-clipped=2.0 2023-10-03 14:48:32,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:34,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 14:48:37,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 14:48:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:41,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 14:48:41,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:41,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:48:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:48:42,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:44,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:47,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:48:47,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 14:48:48,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:48:48,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1304780.0, ans=0.1 2023-10-03 14:48:51,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:48:52,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:48:54,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:54,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:48:55,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:48:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 14:49:01,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:49:05,941 INFO [train.py:1046] (1/4) Epoch 37, batch 4500, loss[loss=0.1689, simple_loss=0.2511, pruned_loss=0.04333, over 23450.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2392, pruned_loss=0.04017, over 4697147.39 frames. ], batch size: 93, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:49:06,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:49:09,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 14:49:09,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 14:49:10,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 14:49:15,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:49:15,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:49:16,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1304913.3333333333, ans=0.125 2023-10-03 14:49:17,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:49:17,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:49:17,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:49:18,163 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-10-03 14:49:18,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:49:30,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:49:31,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:49:33,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 14:49:33,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:49:35,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:49:44,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:49:47,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:49:51,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:49:52,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:49:52,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 14:49:54,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:49:55,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:49:57,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:49:57,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:50:00,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 14:50:00,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 14:50:00,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:50:00,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:00,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1305113.3333333333, ans=0.125 2023-10-03 14:50:05,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:50:05,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:50:08,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:11,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:50:11,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:50:12,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 14:50:14,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 14:50:14,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. 
Duration: 0.8 2023-10-03 14:50:19,398 INFO [train.py:1046] (1/4) Epoch 37, batch 4550, loss[loss=0.1494, simple_loss=0.2336, pruned_loss=0.03264, over 24525.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2381, pruned_loss=0.04001, over 4699497.16 frames. ], batch size: 63, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:50:19,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 14:50:20,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 14:50:22,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:50:25,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:50:25,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:50:27,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:50:31,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:50:32,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:50:34,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:50:34,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 14:50:34,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:36,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:50:37,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:50:39,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:50:42,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 14:50:43,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 14:50:45,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:50:46,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 14:50:49,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 14:50:50,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:50:54,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 14:50:56,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 14:50:58,878 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.866e+02 2.085e+02 2.369e+02 3.164e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-03 14:50:58,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:59,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:59,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:51:01,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 14:51:03,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1305446.6666666667, ans=15.0 2023-10-03 14:51:04,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:51:06,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:06,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:51:08,315 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-10-03 14:51:08,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 14:51:10,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 14:51:11,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 14:51:11,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:51:12,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 14:51:14,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 14:51:16,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:51:16,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:16,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:51:19,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:19,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:51:20,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:51:21,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-10-03 14:51:23,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:51:23,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 14:51:23,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 14:51:23,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:51:23,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 14:51:24,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:51:26,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:51:27,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:51:27,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:27,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:51:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:51:32,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 14:51:34,110 INFO [train.py:1046] (1/4) Epoch 37, batch 4600, loss[loss=0.1596, simple_loss=0.245, pruned_loss=0.03707, over 23274.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2367, pruned_loss=0.03969, over 4699258.87 frames. ], batch size: 106, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:51:34,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:34,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1305580.0, ans=0.09899494936611666 2023-10-03 14:51:35,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:51:38,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:51:38,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:51:40,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:51:41,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 14:51:42,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:51:44,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:51:46,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 14:51:49,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:50,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1305646.6666666667, ans=0.125 2023-10-03 14:51:55,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 14:51:56,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1305646.6666666667, ans=0.0 2023-10-03 14:51:57,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:00,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:03,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:52:03,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:52:11,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 14:52:11,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:52:11,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.88 vs. 
limit=15.0 2023-10-03 14:52:12,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:52:15,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:16,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:52:18,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:52:21,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 14:52:21,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1305780.0, ans=0.0 2023-10-03 14:52:22,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:52:25,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:27,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:52:28,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:28,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 14:52:29,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:52:31,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 14:52:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:31,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:33,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:34,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:52:34,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:34,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 14:52:36,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 14:52:36,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 14:52:36,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:36,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:52:37,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:52:37,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:39,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1305846.6666666667, ans=0.0 2023-10-03 14:52:41,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1305846.6666666667, ans=0.0 2023-10-03 14:52:48,318 INFO [train.py:1046] (1/4) Epoch 37, batch 4650, loss[loss=0.1423, simple_loss=0.2282, pruned_loss=0.02817, over 24573.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2366, pruned_loss=0.03955, over 4702187.72 frames. ], batch size: 60, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:52:48,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:52:49,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:52:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:51,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:52:51,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:51,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:52:52,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:52:55,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 14:52:59,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:53:00,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 14:53:01,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:53:02,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 14:53:02,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:53:03,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 14:53:04,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 14:53:04,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:04,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:53:04,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1305980.0, ans=0.125 2023-10-03 14:53:05,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 14:53:07,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:07,719 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 14:53:10,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 14:53:12,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.61 vs. limit=15.0 2023-10-03 14:53:14,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:14,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:53:15,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 14:53:17,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:53:20,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:53:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:53:27,826 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.915e+02 2.038e+02 2.271e+02 3.983e+02, threshold=4.075e+02, percent-clipped=0.0 2023-10-03 14:53:27,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:29,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:31,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:53:35,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 14:53:35,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 14:53:36,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 14:53:36,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 14:53:37,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:53:38,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1306113.3333333333, ans=0.125 2023-10-03 14:53:46,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 14:53:46,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:53:46,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 14:53:47,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:53:48,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:53:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:53:50,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:53:52,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.58 vs. limit=15.0 2023-10-03 14:53:52,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:53:52,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:53:54,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:57,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 14:53:57,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:53:57,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:53:58,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 14:53:58,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:53:59,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 14:54:01,693 INFO [train.py:1046] (1/4) Epoch 37, batch 4700, loss[loss=0.1555, simple_loss=0.2371, pruned_loss=0.03696, over 23446.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2369, pruned_loss=0.03955, over 4710392.88 frames. ], batch size: 105, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:54:08,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:09,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:54:09,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:54:10,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:54:12,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 14:54:17,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 14:54:17,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 14:54:19,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:21,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:54:21,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:54:24,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:29,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:54:29,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:54:32,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:54:36,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1306380.0, ans=0.0 2023-10-03 14:54:37,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 14:54:38,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 14:54:40,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:54:43,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 14:54:45,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:54:49,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:54:50,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 14:54:52,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:54:53,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:54:56,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:56,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1306446.6666666667, ans=0.125 2023-10-03 14:54:57,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:54:57,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 14:54:59,234 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. 
Duration: 0.768 2023-10-03 14:54:59,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1306513.3333333333, ans=0.07 2023-10-03 14:55:00,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:55:02,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:02,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:02,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 14:55:04,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:05,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1306513.3333333333, ans=0.1 2023-10-03 14:55:07,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 14:55:10,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:55:11,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:14,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:16,111 INFO [train.py:1046] (1/4) Epoch 37, batch 4750, loss[loss=0.1515, simple_loss=0.2349, pruned_loss=0.03407, over 24650.00 frames. 
], tot_loss[loss=0.1592, simple_loss=0.2383, pruned_loss=0.04009, over 4711149.52 frames. ], batch size: 65, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:55:16,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:55:17,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 14:55:18,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:55:18,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1306580.0, ans=0.0 2023-10-03 14:55:22,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1306580.0, ans=0.1 2023-10-03 14:55:23,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 14:55:24,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:55:24,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:55:25,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:55:29,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-10-03 14:55:29,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1306646.6666666667, ans=0.125 2023-10-03 14:55:38,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:55:39,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 14:55:40,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:55:43,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:55:43,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:55:43,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:44,595 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 14:55:44,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 14:55:47,564 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:55:50,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 14:55:53,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:55:55,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:55:57,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:55:57,610 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 14:55:57,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:55:57,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1306713.3333333333, ans=0.04949747468305833 2023-10-03 14:55:57,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1306713.3333333333, ans=0.05 2023-10-03 14:55:59,039 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.889e+02 2.021e+02 2.290e+02 3.051e+02, threshold=4.042e+02, percent-clipped=0.0 2023-10-03 14:56:00,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:56:01,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:56:05,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 14:56:05,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. 
Duration: 0.87 2023-10-03 14:56:06,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:56:06,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:56:06,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:08,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:56:08,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 14:56:11,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 14:56:11,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1306780.0, ans=0.125 2023-10-03 14:56:12,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:14,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:56:14,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 14:56:15,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:56:16,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 14:56:19,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:56:21,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:56:25,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:56:25,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 14:56:26,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 14:56:26,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 14:56:30,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:56:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:56:31,509 INFO [train.py:1046] (1/4) Epoch 37, batch 4800, loss[loss=0.1665, simple_loss=0.2595, pruned_loss=0.03673, over 24457.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2398, pruned_loss=0.04088, over 4685938.12 frames. ], batch size: 69, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:56:31,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. 
Duration: 0.85 2023-10-03 14:56:36,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:36,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:42,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:56:43,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:43,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:44,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 14:56:46,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:56:46,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:56:47,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:56:50,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:56:51,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:52,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 14:56:53,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:53,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 14:56:53,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:55,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:57,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.29 vs. limit=12.0 2023-10-03 14:56:58,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:00,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:57:02,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:57:02,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:57:02,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:57:05,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 14:57:08,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 14:57:08,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 14:57:10,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:10,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:57:11,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:57:11,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:57:11,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:57:13,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:57:13,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:57:17,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:57:18,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:57:19,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1307113.3333333333, ans=0.0 2023-10-03 14:57:20,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:57:24,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 14:57:24,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.73 vs. limit=22.5 2023-10-03 14:57:26,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:57:28,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:28,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:57:28,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:32,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:57:33,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.72 vs. limit=6.0 2023-10-03 14:57:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 14:57:33,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:35,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:57:35,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:57:36,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:57:40,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1307180.0, ans=0.125 2023-10-03 14:57:41,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:57:41,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:41,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:57:42,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 14:57:45,537 INFO [train.py:1046] (1/4) Epoch 37, batch 4850, loss[loss=0.1641, simple_loss=0.255, pruned_loss=0.03659, over 24532.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2395, pruned_loss=0.04074, over 4699320.34 frames. ], batch size: 71, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:57:45,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. 
Duration: 0.69 2023-10-03 14:57:45,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:45,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:47,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:57:47,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:50,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:57,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 14:57:58,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:58:01,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:58:03,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:58:03,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:58:07,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:58:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 14:58:10,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:58:10,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 14:58:13,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:58:15,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:58:15,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:58:17,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:58:17,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 14:58:19,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:58:19,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:22,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1307380.0, ans=0.2 2023-10-03 14:58:25,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:25,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. 
Duration: 0.8 2023-10-03 14:58:25,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 14:58:26,625 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.885e+02 2.047e+02 2.380e+02 3.001e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 14:58:26,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:58:34,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:58:34,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 14:58:34,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:58:34,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:58:37,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:58:39,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 14:58:39,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:39,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 14:58:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 14:58:40,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:58:41,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 14:58:48,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1307513.3333333333, ans=0.0 2023-10-03 14:58:49,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:54,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1307513.3333333333, ans=0.2 2023-10-03 14:58:55,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:58:55,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:58:59,480 INFO [train.py:1046] (1/4) Epoch 37, batch 4900, loss[loss=0.1474, simple_loss=0.2323, pruned_loss=0.03125, over 24332.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2382, pruned_loss=0.04047, over 4692672.30 frames. ], batch size: 61, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:59:01,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 14:59:01,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:59:06,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 14:59:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:59:08,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:59:12,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 14:59:13,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1307646.6666666667, ans=0.125 2023-10-03 14:59:16,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 14:59:19,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 14:59:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 14:59:21,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:59:22,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:59:22,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:59:22,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:59:22,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 14:59:23,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 14:59:25,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 14:59:26,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:59:27,011 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.92 vs. limit=15.0 2023-10-03 14:59:29,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:59:29,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:59:31,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:59:32,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.86 vs. limit=6.0 2023-10-03 14:59:32,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:32,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:59:32,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. 
Duration: 0.689 2023-10-03 14:59:34,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:59:35,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:59:35,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 14:59:35,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 14:59:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 14:59:41,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:59:44,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:59:44,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:59:44,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:45,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 14:59:45,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:59:45,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. 
Duration: 0.96 2023-10-03 14:59:49,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:59:51,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:59:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:59:54,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 14:59:55,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:59:56,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:59:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 15:00:02,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:00:03,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:00:04,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 15:00:04,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:00:05,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 15:00:05,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1307846.6666666667, ans=0.0 2023-10-03 15:00:06,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:00:11,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:00:11,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:00:12,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 15:00:14,127 INFO [train.py:1046] (1/4) Epoch 37, batch 4950, loss[loss=0.1487, simple_loss=0.2194, pruned_loss=0.03904, over 23577.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2373, pruned_loss=0.04003, over 4701346.05 frames. ], batch size: 256, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:00:14,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:00:16,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:00:16,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:00:19,616 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.03 vs. 
limit=15.0 2023-10-03 15:00:20,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 15:00:20,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 15:00:21,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:00:21,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 15:00:21,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:21,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:00:23,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:00:23,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:26,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:27,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:00:28,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:00:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:00:31,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:33,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:00:37,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:00:41,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:41,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:00:44,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:44,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:44,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:00:45,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 15:00:47,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 15:00:48,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:50,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 15:00:50,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:00:51,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:00:51,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:00:52,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:00:54,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:56,283 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.953e+02 2.233e+02 2.560e+02 3.668e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-03 15:00:56,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:00:59,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:01:01,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:01,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:01,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 15:01:03,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 15:01:03,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:01:08,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:01:10,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:01:10,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:01:10,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:12,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:01:13,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:01:15,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:01:15,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:01:15,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:01:16,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 15:01:21,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:01:25,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 15:01:27,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:01:28,804 INFO [train.py:1046] (1/4) Epoch 37, batch 5000, loss[loss=0.1463, simple_loss=0.2199, pruned_loss=0.03635, over 24304.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2375, pruned_loss=0.03998, over 4706098.46 frames. ], batch size: 56, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:01:34,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:34,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:01:35,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 15:01:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 15:01:38,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:01:41,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 15:01:41,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:01:41,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 15:01:41,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1308313.3333333333, ans=0.125 2023-10-03 15:01:42,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 15:01:42,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:43,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:01:44,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 15:01:44,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:44,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:01:46,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 15:01:47,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 15:01:49,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:01:49,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 15:01:49,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-03 15:01:49,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:50,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:01:50,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 15:01:50,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 15:01:51,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1308313.3333333333, ans=0.0 2023-10-03 15:01:52,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 15:01:53,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:53,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:53,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 15:01:55,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:01:56,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:56,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:01:57,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 15:01:58,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1308380.0, ans=0.05 2023-10-03 15:01:59,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 15:02:01,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:02:02,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:02:05,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1308380.0, ans=0.1 2023-10-03 15:02:06,859 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 15:02:09,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:02:09,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:02:09,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:14,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 15:02:14,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:02:15,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:02:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:02:17,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 15:02:17,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:02:17,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-03 15:02:19,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:02:20,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:02:26,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 15:02:29,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:36,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1308513.3333333333, ans=0.125 2023-10-03 15:02:37,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:02:40,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:40,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:02:40,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:02:40,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:02:41,564 INFO [train.py:1046] (1/4) Epoch 37, batch 5050, loss[loss=0.1729, simple_loss=0.2501, pruned_loss=0.04785, over 23420.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2382, pruned_loss=0.03964, over 4708706.82 frames. ], batch size: 106, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:02:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:02:41,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:48,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:48,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 15:02:49,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:02:50,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:02:52,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:02:52,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 15:02:54,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:02:54,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:02:54,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1308580.0, ans=0.2 2023-10-03 15:02:55,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:02:57,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:02:58,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:03:02,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.73 vs. limit=22.5 2023-10-03 15:03:06,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 15:03:06,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:03:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 15:03:08,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 15:03:08,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:03:10,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:10,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:03:11,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:03:11,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 15:03:12,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 15:03:14,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:16,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:19,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 15:03:20,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:03:23,312 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.49 vs. limit=15.0 2023-10-03 15:03:23,936 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.822e+02 1.997e+02 2.206e+02 4.245e+02, threshold=3.993e+02, percent-clipped=0.0 2023-10-03 15:03:24,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 15:03:24,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1308713.3333333333, ans=0.125 2023-10-03 15:03:25,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:03:25,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:03:26,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:03:26,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:03:28,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:03:29,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:03:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:03:31,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:03:31,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:03:32,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 15:03:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:03:34,795 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.99 vs. limit=10.0 2023-10-03 15:03:35,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:03:38,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:03:38,430 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 15:03:38,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:03:39,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:03:41,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:41,187 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-10-03 15:03:42,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 15:03:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:47,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:03:49,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:49,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 15:03:50,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 15:03:53,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:03:53,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:03:54,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:03:56,090 INFO [train.py:1046] (1/4) Epoch 37, batch 5100, loss[loss=0.1668, simple_loss=0.2568, pruned_loss=0.03838, over 24364.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2396, pruned_loss=0.04023, over 4691219.26 frames. 
], batch size: 74, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:03:56,183 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 15:03:57,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:57,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1308913.3333333333, ans=0.0 2023-10-03 15:04:01,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 15:04:01,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=1308913.3333333333, ans=0.2 2023-10-03 15:04:02,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 15:04:03,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:04:06,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:04:09,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:04:09,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 15:04:09,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 15:04:13,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:04:14,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:04:17,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:04:20,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 15:04:20,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:04:22,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:04:22,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 15:04:25,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:26,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:26,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 15:04:28,156 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 15:04:29,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:29,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. 
Duration: 0.67 2023-10-03 15:04:29,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 15:04:32,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:04:39,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:04:42,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 15:04:42,637 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 15:04:43,919 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 15:04:45,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 15:04:45,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:47,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 15:04:51,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 15:04:55,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:04:56,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 15:04:57,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 15:05:00,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:05:02,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 15:05:06,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:05:06,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:05:06,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:05:08,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:05:08,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:05:09,435 INFO [train.py:1046] (1/4) Epoch 37, batch 5150, loss[loss=0.1526, simple_loss=0.2403, pruned_loss=0.03248, over 24635.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.24, pruned_loss=0.04054, over 4682950.67 frames. ], batch size: 65, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:05:09,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:05:09,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. 
Duration: 0.89 2023-10-03 15:05:09,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 15:05:09,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 15:05:09,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:05:09,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 15:05:11,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:12,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 15:05:12,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1309246.6666666667, ans=0.0 2023-10-03 15:05:13,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:13,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:19,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:05:19,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 15:05:20,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:05:21,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:05:23,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:05:23,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:05:23,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:05:25,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:05:25,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:05:25,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 15:05:26,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:05:26,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1309313.3333333333, ans=0.125 2023-10-03 15:05:28,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:05:31,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:05:31,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. 
Duration: 0.79 2023-10-03 15:05:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:05:36,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:05:37,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1309313.3333333333, ans=0.0 2023-10-03 15:05:40,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 15:05:42,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:05:47,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:05:48,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:51,927 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.958e+02 2.128e+02 2.414e+02 3.634e+02, threshold=4.256e+02, percent-clipped=0.0 2023-10-03 15:05:52,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:05:52,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:05:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 15:05:59,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:05:59,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:06:00,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:06:03,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:03,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:06:04,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 15:06:09,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:06:09,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1309513.3333333333, ans=0.0 2023-10-03 15:06:10,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:06:13,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:06:13,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:06:15,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:06:15,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. 
Duration: 0.67275 2023-10-03 15:06:16,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:06:16,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:06:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:06:24,036 INFO [train.py:1046] (1/4) Epoch 37, batch 5200, loss[loss=0.1548, simple_loss=0.2436, pruned_loss=0.03301, over 24644.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2399, pruned_loss=0.04067, over 4684030.77 frames. ], batch size: 68, lr: 2.73e-03, grad_scale: 16.0 2023-10-03 15:06:24,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:06:27,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:06:30,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 15:06:31,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:06:31,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:33,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1309580.0, ans=0.125 2023-10-03 15:06:34,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 15:06:34,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1309580.0, ans=0.0 2023-10-03 15:06:35,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:06:35,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:38,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 15:06:40,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:06:41,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:43,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 15:06:45,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:06:46,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1309646.6666666667, ans=0.0 2023-10-03 15:06:47,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:06:47,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 15:06:48,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. 
Duration: 0.91 2023-10-03 15:06:51,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 15:06:51,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:51,953 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 15:06:51,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:55,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:06:55,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:06:56,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 15:06:57,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:06:59,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:07:02,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 15:07:02,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 15:07:02,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. 
Duration: 0.82 2023-10-03 15:07:07,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 15:07:08,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:07:14,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:07:14,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:17,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 15:07:17,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:07:17,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:07:17,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:18,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:07:20,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:07:21,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 15:07:23,294 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=15.0 2023-10-03 15:07:25,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:07:27,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:27,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:34,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:34,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 15:07:35,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:07:35,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:07:37,235 INFO [train.py:1046] (1/4) Epoch 37, batch 5250, loss[loss=0.1709, simple_loss=0.2403, pruned_loss=0.05077, over 23679.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2388, pruned_loss=0.0402, over 4692764.23 frames. ], batch size: 149, lr: 2.73e-03, grad_scale: 16.0 2023-10-03 15:07:37,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:37,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. 
Duration: 0.636375 2023-10-03 15:07:38,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:07:40,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:07:43,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:44,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:07:46,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:07:50,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:51,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:07:53,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:07:56,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:07:57,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 15:07:59,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:59,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:07:59,927 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.61 vs. limit=15.0 2023-10-03 15:08:01,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1309980.0, ans=0.05 2023-10-03 15:08:01,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.44 vs. limit=15.0 2023-10-03 15:08:18,119 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.876e+02 2.013e+02 2.256e+02 3.803e+02, threshold=4.026e+02, percent-clipped=0.0 2023-10-03 15:08:35,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1310180.0, ans=0.125 2023-10-03 15:08:43,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1310180.0, ans=0.125 2023-10-03 15:08:45,609 INFO [train.py:1046] (1/4) Epoch 37, batch 5300, loss[loss=0.1379, simple_loss=0.2216, pruned_loss=0.02713, over 24367.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2373, pruned_loss=0.0398, over 4691827.32 frames. ], batch size: 61, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:09:00,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:09:00,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 15:09:00,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. 
Duration: 0.8 2023-10-03 15:09:00,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:00,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:00,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:00,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:00,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:00,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:00,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:00,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:09:00,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:09:00,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 15:09:00,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 15:09:00,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. 
Duration: 0.99 2023-10-03 15:09:01,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:09:01,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 15:09:01,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 15:09:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:01,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:01,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:09:02,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:09:02,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:09:02,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:09:02,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:02,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:02,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:09:02,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:02,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:09:02,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:02,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:09:03,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 15:09:03,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:09:03,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:03,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 15:09:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 15:09:03,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:09:03,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:03,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. 
Duration: 0.43 2023-10-03 15:09:04,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 15:09:04,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:09:04,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:09:04,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:09:04,925 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 15:09:05,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 15:09:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:09:05,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:05,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 15:09:05,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 15:09:05,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 15:09:05,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 15:09:12,151 INFO [train.py:1046] (1/4) Epoch 38, batch 0, loss[loss=0.1626, simple_loss=0.2513, pruned_loss=0.03697, over 24520.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2513, pruned_loss=0.03697, over 24520.00 frames. ], batch size: 71, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:09:12,151 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 15:09:24,071 INFO [train.py:1078] (1/4) Epoch 38, validation: loss=0.3257, simple_loss=0.2715, pruned_loss=0.1899, over 1125622.00 frames. 2023-10-03 15:09:24,072 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 15:09:27,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 15:09:28,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:09:30,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1310326.6666666667, ans=0.125 2023-10-03 15:09:31,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:09:35,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1310326.6666666667, ans=15.0 2023-10-03 15:09:35,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:35,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:09:36,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:09:37,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 15:09:38,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 15:09:40,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:40,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:43,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:44,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:44,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:09:44,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:09:47,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 15:09:48,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.52 vs. limit=15.0 2023-10-03 15:09:49,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:09:55,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 15:09:55,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:57,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 15:09:59,084 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.87 vs. limit=10.0 2023-10-03 15:10:01,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:10:01,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:10:02,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:07,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:10:10,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:16,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 15:10:18,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 15:10:18,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:10:18,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:10:20,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:10:22,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:10:23,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 15:10:25,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1310593.3333333333, ans=0.125 2023-10-03 15:10:28,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:28,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:32,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:10:35,232 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 15:10:37,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:10:38,630 INFO [train.py:1046] (1/4) Epoch 38, batch 50, loss[loss=0.1491, simple_loss=0.2207, pruned_loss=0.03872, over 24421.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.239, pruned_loss=0.03993, over 1061008.83 frames. ], batch size: 58, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:10:38,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:10:40,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:10:40,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 15:10:41,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:10:41,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:10:43,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:10:45,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:10:47,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:10:49,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.97 vs. limit=22.5 2023-10-03 15:10:51,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 15:10:51,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:56,407 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. 
limit=6.0 2023-10-03 15:10:57,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:10:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 15:11:01,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 15:11:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:11:03,995 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.896e+02 2.083e+02 2.341e+02 4.077e+02, threshold=4.166e+02, percent-clipped=1.0 2023-10-03 15:11:04,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:11:04,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:11:04,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:11:04,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:11:05,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:11:05,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:11:07,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1310793.3333333333, ans=0.0 2023-10-03 15:11:14,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:11:17,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:11:17,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:11:18,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 15:11:20,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:11:21,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:11:21,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 15:11:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:11:24,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 15:11:25,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1310860.0, ans=0.0 2023-10-03 15:11:25,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.26 vs. 
limit=15.0 2023-10-03 15:11:31,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:11:31,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:11:31,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.90 vs. limit=10.0 2023-10-03 15:11:32,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:11:33,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:11:33,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:11:36,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 15:11:36,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 15:11:39,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:11:39,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:11:41,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:11:41,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 15:11:41,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 15:11:43,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 15:11:44,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1310926.6666666667, ans=0.025 2023-10-03 15:11:45,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 15:11:46,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:11:46,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:11:48,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 15:11:48,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 15:11:48,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:11:49,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:11:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:11:51,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:11:52,666 INFO [train.py:1046] (1/4) Epoch 38, batch 100, loss[loss=0.1689, simple_loss=0.2399, pruned_loss=0.04892, over 23806.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2399, pruned_loss=0.03938, over 1878545.51 frames. ], batch size: 195, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:11:52,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:11:54,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1310993.3333333333, ans=0.125 2023-10-03 15:11:57,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:12:00,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:12:02,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 15:12:02,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:12:05,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:12:05,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:12:05,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:12:06,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 15:12:06,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:12:07,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 15:12:10,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:12:10,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:10,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:12:10,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:12:13,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 15:12:15,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:16,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:12:16,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1311060.0, ans=0.0 2023-10-03 15:12:17,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:12:20,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-03 15:12:25,103 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 15:12:25,128 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 15:12:26,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:12:26,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:12:29,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:12:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:32,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:38,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:38,586 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 15:12:41,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 15:12:44,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:12:45,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 15:12:48,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:52,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:12:56,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:12:57,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:13:00,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:00,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:03,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:03,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:13:03,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:03,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 15:13:03,722 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 15:13:05,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:13:05,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:13:06,872 INFO [train.py:1046] (1/4) Epoch 38, batch 150, loss[loss=0.1578, simple_loss=0.2363, pruned_loss=0.03971, over 24432.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2401, pruned_loss=0.03967, over 2511655.15 frames. ], batch size: 63, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:13:06,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:06,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:06,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 15:13:06,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:13:08,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:13:08,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:08,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:09,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:11,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 15:13:11,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:13:12,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:14,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1311326.6666666667, ans=0.125 2023-10-03 15:13:15,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:13:15,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:16,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:19,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:24,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:13:24,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 15:13:28,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. 
Duration: 0.52 2023-10-03 15:13:28,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 15:13:31,439 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.858e+02 2.070e+02 2.351e+02 3.809e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-03 15:13:31,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:13:31,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:13:32,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:13:34,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:34,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:36,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:38,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 15:13:39,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1311460.0, ans=0.0 2023-10-03 15:13:40,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:13:45,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:45,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1311460.0, ans=0.125 2023-10-03 15:13:49,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:13:49,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 15:13:52,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1311526.6666666667, ans=0.125 2023-10-03 15:13:52,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.76 vs. limit=15.0 2023-10-03 15:13:53,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:13:53,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:53,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1311526.6666666667, ans=0.0 2023-10-03 15:13:54,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:13:56,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 15:13:59,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:14:00,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:14:00,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:02,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 15:14:05,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:07,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:07,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:14:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:14:10,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:12,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 15:14:13,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:14:14,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 15:14:16,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:14:18,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:14:19,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 15:14:19,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:14:19,050 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 15:14:20,205 INFO [train.py:1046] (1/4) Epoch 38, batch 200, loss[loss=0.152, simple_loss=0.231, pruned_loss=0.03646, over 23327.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2406, pruned_loss=0.04054, over 3003438.07 frames. ], batch size: 119, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:14:22,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:14:25,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:14:25,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:14:29,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 15:14:29,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 15:14:30,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:31,135 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-10-03 15:14:31,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 15:14:32,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:14:34,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:35,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:38,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:14:38,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:14:38,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:44,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1311726.6666666667, ans=0.0 2023-10-03 15:14:56,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 15:14:56,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:14:58,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:14:58,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:15:01,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:15:01,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:15:03,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:04,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1311860.0, ans=0.125 2023-10-03 15:15:05,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:15:05,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:15:07,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:15:07,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. 
Duration: 0.92 2023-10-03 15:15:09,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:15:09,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:11,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1311860.0, ans=0.07 2023-10-03 15:15:11,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.29 vs. limit=15.0 2023-10-03 15:15:12,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:15:16,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:15:23,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:23,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:15:30,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:31,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 15:15:32,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:32,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 15:15:32,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:15:34,195 INFO [train.py:1046] (1/4) Epoch 38, batch 250, loss[loss=0.1575, simple_loss=0.2334, pruned_loss=0.04083, over 18442.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2414, pruned_loss=0.04166, over 3375457.37 frames. ], batch size: 40, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:15:34,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:15:34,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 15:15:36,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:15:36,136 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 15:15:38,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:39,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:15:41,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:41,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:44,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 15:15:45,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:46,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:15:49,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:15:59,009 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.823e+02 2.031e+02 2.288e+02 2.955e+02, threshold=4.062e+02, percent-clipped=0.0 2023-10-03 15:16:01,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:16:02,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:16:02,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:16:06,333 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.78 vs. limit=15.0 2023-10-03 15:16:10,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:16:10,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1312126.6666666667, ans=0.07 2023-10-03 15:16:11,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 15:16:13,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:16:13,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:16:13,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:16:13,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:16:15,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:16:19,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:16:20,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 15:16:20,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:16:22,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:16:22,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:16:22,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:16:23,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 15:16:25,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:16:25,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:16:25,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:27,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:16:27,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:16:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:16:32,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1312260.0, ans=0.5 2023-10-03 15:16:35,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:39,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:16:43,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:16:44,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:16:48,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 15:16:48,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:16:48,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:16:49,540 INFO [train.py:1046] (1/4) Epoch 38, batch 300, loss[loss=0.1483, simple_loss=0.2127, pruned_loss=0.0419, over 23547.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2392, pruned_loss=0.0406, over 3674160.58 frames. ], batch size: 256, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:16:49,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 15:16:49,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:16:51,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:16:51,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 15:16:55,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:56,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:16:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 15:16:59,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 15:17:02,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:17:02,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:17:04,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 15:17:04,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:09,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:17:11,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.74 vs. limit=22.5 2023-10-03 15:17:13,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:17:13,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 15:17:16,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 15:17:18,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:17:18,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1312460.0, ans=0.1 2023-10-03 15:17:19,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:21,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:21,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 15:17:21,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:17:22,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:17:23,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:17:23,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:17:24,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.99 vs. limit=15.0 2023-10-03 15:17:28,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:17:28,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-03 15:17:28,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:17:32,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:32,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 15:17:34,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:17:39,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:17:42,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:17:42,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 15:17:47,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:47,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:17:47,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1312593.3333333333, ans=0.0 2023-10-03 15:17:49,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:51,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 15:17:52,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 15:17:52,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:17:52,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:17:54,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 15:17:55,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:55,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:17:56,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:58,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:17:58,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:02,277 INFO [train.py:1046] (1/4) Epoch 38, batch 350, loss[loss=0.1592, simple_loss=0.2425, pruned_loss=0.0379, over 24510.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2364, pruned_loss=0.03994, over 3897626.10 frames. ], batch size: 66, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:18:04,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:18:04,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 15:18:06,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:12,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:18:14,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:16,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:16,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1312726.6666666667, ans=0.0 2023-10-03 15:18:17,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 15:18:19,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:19,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 15:18:21,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:22,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 15:18:23,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 15:18:25,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 15:18:26,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:18:28,150 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.973e+02 2.217e+02 2.536e+02 3.904e+02, threshold=4.435e+02, percent-clipped=0.0 2023-10-03 15:18:29,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:18:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:18:31,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:31,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:31,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:18:32,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:32,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 15:18:33,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1312793.3333333333, ans=0.0 2023-10-03 15:18:34,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:18:34,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:40,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:18:40,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:18:41,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:18:41,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:46,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 15:18:46,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:51,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:51,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:18:52,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:18:53,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 15:18:56,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:18:56,637 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 15:18:56,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1312860.0, ans=0.0 2023-10-03 15:18:57,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 15:18:57,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:01,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:19:01,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 15:19:04,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:07,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:19:07,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:09,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:19:09,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:19:10,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:19:14,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:19:17,104 INFO [train.py:1046] (1/4) Epoch 38, batch 400, loss[loss=0.1442, simple_loss=0.2316, pruned_loss=0.02839, over 24323.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2359, pruned_loss=0.03974, over 4062814.05 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 32.0 2023-10-03 15:19:17,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:19:17,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 15:19:17,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:18,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:20,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:19:20,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:23,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:19:23,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:26,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 15:19:26,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1312993.3333333333, ans=0.125 2023-10-03 15:19:28,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 15:19:28,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:28,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 15:19:30,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:30,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1313060.0, ans=0.1 2023-10-03 15:19:30,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1313060.0, ans=0.2 2023-10-03 15:19:36,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:19:36,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:19:38,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. 
Duration: 0.82 2023-10-03 15:19:38,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:19:38,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:38,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:19:39,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:39,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1313060.0, ans=0.125 2023-10-03 15:19:41,098 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 15:19:43,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 15:19:43,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1313060.0, ans=0.125 2023-10-03 15:19:47,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:47,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1313126.6666666667, ans=0.0 2023-10-03 15:19:48,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:48,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. 
Duration: 0.81 2023-10-03 15:19:50,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 15:19:54,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:19:57,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:03,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 15:20:07,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:20:09,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 15:20:11,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:20:12,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:20:12,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 15:20:12,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1313193.3333333333, ans=0.125 2023-10-03 15:20:16,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 15:20:16,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1313260.0, ans=0.125 2023-10-03 15:20:18,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:20:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:20:21,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:21,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 15:20:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:20:24,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 15:20:27,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:20:27,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:20:28,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 15:20:30,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:20:31,629 INFO [train.py:1046] (1/4) Epoch 38, batch 450, loss[loss=0.2012, simple_loss=0.2667, pruned_loss=0.06785, over 19229.00 frames. 
], tot_loss[loss=0.1579, simple_loss=0.2365, pruned_loss=0.03968, over 4208576.15 frames. ], batch size: 389, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:20:31,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:20:31,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:20:31,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 15:20:32,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1313326.6666666667, ans=0.125 2023-10-03 15:20:33,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:20:33,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:20:34,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:20:34,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 15:20:34,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:20:36,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:20:39,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 15:20:49,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.44 vs. limit=10.0 2023-10-03 15:20:50,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:50,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:20:51,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 15:20:53,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 15:20:57,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:20:58,823 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.857e+02 2.046e+02 2.242e+02 3.489e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 15:20:58,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:21:00,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1313460.0, ans=0.125 2023-10-03 15:21:01,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:03,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1313460.0, ans=0.2 2023-10-03 15:21:04,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:21:05,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:21:08,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 15:21:08,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 15:21:10,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 15:21:10,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:12,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:12,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:21:13,693 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 15:21:13,701 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 15:21:13,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:21:16,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:21:17,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 15:21:21,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1313526.6666666667, ans=0.125 2023-10-03 15:21:22,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:21:22,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:21:22,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:21:24,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 15:21:25,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:21:27,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1313526.6666666667, ans=0.2 2023-10-03 15:21:28,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:21:29,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:21:31,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 15:21:31,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1313593.3333333333, ans=0.2 2023-10-03 15:21:35,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 15:21:35,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 15:21:35,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 15:21:36,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:21:39,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1313593.3333333333, ans=0.125 2023-10-03 15:21:41,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:21:43,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:21:43,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:21:44,858 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 15:21:46,214 INFO [train.py:1046] (1/4) Epoch 38, batch 500, loss[loss=0.1511, simple_loss=0.2283, pruned_loss=0.03696, over 23689.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2368, pruned_loss=0.03922, over 4324627.73 frames. ], batch size: 149, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:21:49,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:50,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 15:21:52,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:52,285 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 15:21:53,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 15:21:53,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:55,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:21:59,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:22:00,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1313726.6666666667, ans=0.125 2023-10-03 15:22:01,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:22:02,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:22:02,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:22:03,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:15,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:22:15,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:22:17,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:22:17,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:17,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 15:22:18,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:22:20,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:22:21,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:22:21,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:22:21,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:22,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-03 15:22:24,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1313793.3333333333, ans=0.1 2023-10-03 15:22:24,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1313793.3333333333, ans=0.1 2023-10-03 15:22:26,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=12.0 2023-10-03 15:22:27,470 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 15:22:30,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:30,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:31,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1313860.0, ans=0.04949747468305833 2023-10-03 15:22:32,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:33,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:33,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:22:34,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. 
Duration: 0.62 2023-10-03 15:22:35,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.67 vs. limit=22.5 2023-10-03 15:22:37,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:22:38,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:22:41,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:22:46,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:47,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=22.5 2023-10-03 15:22:50,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:52,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 15:22:53,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:22:53,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:22:55,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1313926.6666666667, ans=0.1 2023-10-03 15:22:57,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 15:22:57,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:22:58,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:23:00,353 INFO [train.py:1046] (1/4) Epoch 38, batch 550, loss[loss=0.1843, simple_loss=0.2531, pruned_loss=0.05769, over 22687.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2376, pruned_loss=0.03945, over 4415150.13 frames. ], batch size: 323, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:23:03,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 15:23:04,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 15:23:04,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:04,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 15:23:05,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:23:05,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:23:06,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:07,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:07,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:23:08,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:23:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:23:12,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 15:23:12,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:23:19,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:19,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:20,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:23:22,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:23:26,465 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.899e+02 2.084e+02 2.370e+02 3.616e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-03 15:23:27,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 15:23:27,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 15:23:29,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:23:34,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:23:35,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:23:36,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:23:39,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:39,484 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 15:23:40,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:40,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:23:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 15:23:43,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:23:43,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:23:45,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:45,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 15:23:48,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 15:23:50,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:23:50,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:23:51,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:23:51,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:53,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:23:54,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:23:59,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:23:59,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:00,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 15:24:00,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:24:02,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:03,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:24:03,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:04,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:24:04,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 15:24:11,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 15:24:12,791 INFO [train.py:1046] (1/4) Epoch 38, batch 600, loss[loss=0.1559, simple_loss=0.2375, pruned_loss=0.03717, over 23303.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.238, pruned_loss=0.03957, over 4484194.09 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:24:14,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. 
Duration: 0.46 2023-10-03 15:24:15,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:24:17,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:24:17,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:23,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:24:25,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:24:28,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 15:24:30,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:24:33,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:24:33,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:36,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 15:24:36,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:24:41,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-10-03 15:24:44,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:24:44,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:44,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:24:50,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:24:50,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:24:50,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:51,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1314460.0, ans=0.125 2023-10-03 15:24:57,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:24:59,931 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:25:01,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:25:01,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:25:03,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:25:04,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1314526.6666666667, ans=0.125 2023-10-03 15:25:07,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 15:25:07,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1314526.6666666667, ans=0.2 2023-10-03 15:25:12,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:25:13,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:25:14,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1314593.3333333333, ans=0.09899494936611666 2023-10-03 15:25:18,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 15:25:18,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1314593.3333333333, ans=0.125 2023-10-03 15:25:18,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1314593.3333333333, ans=0.125 2023-10-03 15:25:20,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:25:21,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. 
Duration: 0.92 2023-10-03 15:25:21,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:25:23,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:25:27,775 INFO [train.py:1046] (1/4) Epoch 38, batch 650, loss[loss=0.148, simple_loss=0.2124, pruned_loss=0.04179, over 23593.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2381, pruned_loss=0.03965, over 4528347.68 frames. ], batch size: 256, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:25:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 15:25:27,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:25:31,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:25:31,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1314660.0, ans=0.125 2023-10-03 15:25:33,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:25:34,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:25:35,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 15:25:35,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:25:41,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:25:41,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:25:44,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:25:48,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 15:25:48,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1314726.6666666667, ans=0.95 2023-10-03 15:25:49,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:25:50,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:25:52,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:25:54,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 15:25:55,439 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.877e+02 2.025e+02 2.307e+02 3.864e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-03 15:25:56,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 15:25:58,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:25:59,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:25:59,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:01,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:26:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:26:03,094 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 15:26:03,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:26:03,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:26:07,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:26:08,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:09,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 15:26:10,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 15:26:11,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:26:11,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:26:11,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:26:12,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:26:14,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:26:16,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 15:26:17,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 15:26:17,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:18,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1314860.0, ans=0.2 2023-10-03 15:26:19,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:26:19,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 15:26:19,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:26:20,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:26:26,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:26,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:26:28,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:26:31,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:26:31,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:36,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:26:37,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:26:37,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 15:26:38,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:26:42,489 INFO [train.py:1046] (1/4) Epoch 38, batch 700, loss[loss=0.1804, simple_loss=0.2675, pruned_loss=0.04667, over 24001.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2372, pruned_loss=0.03972, over 4546414.67 frames. ], batch size: 80, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:26:42,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 15:26:44,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 15:26:45,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 15:26:47,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:49,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:26:50,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 15:26:55,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:26:55,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1314993.3333333333, ans=0.2 2023-10-03 15:26:58,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 15:27:00,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:27:01,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:27:01,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:27:01,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1315060.0, ans=0.2 2023-10-03 15:27:04,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:27:05,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 15:27:05,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:27:07,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 15:27:11,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 15:27:13,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:27:14,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:27:17,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 15:27:20,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:27:20,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1315126.6666666667, ans=0.0 2023-10-03 15:27:21,058 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.21 vs. limit=15.0 2023-10-03 15:27:21,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 15:27:26,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:27:26,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:27:27,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1315193.3333333333, ans=0.125 2023-10-03 15:27:28,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 15:27:32,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:27:34,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:27:35,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:27:41,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:27:41,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 15:27:43,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 15:27:45,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 15:27:47,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:27:49,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:27:50,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:27:53,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:27:53,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 15:27:57,698 INFO [train.py:1046] (1/4) Epoch 38, batch 750, loss[loss=0.1645, simple_loss=0.2429, pruned_loss=0.043, over 23937.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2372, pruned_loss=0.03954, over 4592962.59 frames. ], batch size: 86, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:27:57,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. 
Duration: 0.55 2023-10-03 15:27:57,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 15:27:59,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 15:27:59,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 15:27:59,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1315326.6666666667, ans=0.125 2023-10-03 15:28:00,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 15:28:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:28:02,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 15:28:03,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:28:05,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:28:06,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:08,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:09,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 15:28:09,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:28:11,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:28:11,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:28:12,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:28:16,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:16,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:17,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 15:28:18,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:28:19,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:28:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:28:22,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-10-03 15:28:24,073 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.901e+02 2.100e+02 2.467e+02 3.467e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-03 15:28:24,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 15:28:24,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:28:24,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1315393.3333333333, ans=0.125 2023-10-03 15:28:25,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 15:28:27,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 15:28:27,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 15:28:27,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:28:28,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:28:30,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1315460.0, ans=0.95 2023-10-03 15:28:31,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 15:28:36,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1315460.0, ans=0.1 2023-10-03 15:28:37,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:28:37,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:28:39,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:28:39,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1315460.0, ans=0.2 2023-10-03 15:28:40,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:43,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:28:43,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 15:28:43,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:28:46,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 15:28:46,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:28:49,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 15:28:50,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 15:28:50,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:28:55,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:28:56,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:28:58,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:59,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:29:03,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 15:29:05,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:29:05,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:08,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:08,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:29:08,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1315593.3333333333, ans=0.2 2023-10-03 15:29:09,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:29:11,388 INFO [train.py:1046] (1/4) Epoch 38, batch 800, loss[loss=0.1574, simple_loss=0.2396, pruned_loss=0.03759, over 23282.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2378, pruned_loss=0.03959, over 4624991.77 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:29:18,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.90 vs. limit=10.0 2023-10-03 15:29:19,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:19,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:20,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:29:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:22,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:22,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:29:23,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1315660.0, ans=0.125 2023-10-03 15:29:24,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:27,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1315726.6666666667, ans=0.125 2023-10-03 15:29:28,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:30,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:29:31,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1315726.6666666667, ans=0.125 2023-10-03 15:29:32,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.79 vs. limit=15.0 2023-10-03 15:29:32,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 15:29:33,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:34,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:34,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 15:29:34,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:29:36,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 15:29:36,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:37,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 15:29:39,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:39,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1315793.3333333333, ans=0.0 2023-10-03 15:29:41,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:42,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:43,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:29:45,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:45,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:49,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:29:49,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:29:49,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 15:29:51,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1315793.3333333333, ans=0.125 2023-10-03 15:29:52,597 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 15:29:52,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 15:29:52,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:29:52,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:53,250 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=12.0 2023-10-03 15:29:54,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:54,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:29:59,928 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 15:30:01,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. 
Duration: 0.4 2023-10-03 15:30:02,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:30:04,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:30:08,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:30:12,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:30:13,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 15:30:13,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:30:16,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 15:30:21,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:30:23,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:30:24,889 INFO [train.py:1046] (1/4) Epoch 38, batch 850, loss[loss=0.1605, simple_loss=0.2479, pruned_loss=0.03652, over 24563.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2389, pruned_loss=0.03994, over 4648046.20 frames. ], batch size: 71, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:30:24,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-10-03 15:30:25,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:30:26,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:30:27,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 15:30:27,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:29,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.27 vs. limit=15.0 2023-10-03 15:30:29,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:30:29,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:30:31,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:30:31,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1315993.3333333333, ans=0.2 2023-10-03 15:30:33,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:30:34,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 15:30:34,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. 
Duration: 0.96 2023-10-03 15:30:34,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 15:30:35,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:30:35,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:30:38,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:30:40,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:30:40,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:30:43,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1316060.0, ans=0.125 2023-10-03 15:30:44,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:46,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:30:46,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 15:30:46,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.12 vs. 
limit=22.5 2023-10-03 15:30:47,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1316060.0, ans=0.125 2023-10-03 15:30:50,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 15:30:51,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:52,106 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:30:53,069 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.915e+02 2.103e+02 2.491e+02 3.805e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-03 15:30:53,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 15:30:58,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 15:30:59,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 15:31:00,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1316126.6666666667, ans=0.0 2023-10-03 15:31:02,486 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 15:31:02,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:31:02,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:31:02,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:31:05,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:07,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:07,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 15:31:08,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:31:11,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:31:12,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:31:12,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:31:15,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:31:15,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:31:17,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 15:31:20,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 15:31:20,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:31:21,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:31:21,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:31:23,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:31:25,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:29,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:31:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:31:29,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:31:31,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:31:34,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1316260.0, ans=0.0 2023-10-03 15:31:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:31:40,035 INFO [train.py:1046] (1/4) Epoch 38, batch 900, loss[loss=0.168, simple_loss=0.2592, pruned_loss=0.03843, over 24266.00 frames. 
], tot_loss[loss=0.1597, simple_loss=0.2394, pruned_loss=0.04003, over 4652897.01 frames. ], batch size: 74, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:31:40,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:31:40,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 15:31:41,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:31:41,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:31:43,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 15:31:48,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:31:51,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:31:51,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 15:31:51,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1316326.6666666667, ans=0.0 2023-10-03 15:31:53,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1316393.3333333333, ans=0.125 2023-10-03 15:31:55,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 15:31:55,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 15:31:55,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 15:31:57,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:31:57,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:31:59,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:31:59,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:32:07,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:07,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:32:08,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:32:10,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:32:14,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 15:32:15,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:32:19,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1316460.0, ans=0.1 2023-10-03 15:32:20,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:32:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:32:20,343 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 15:32:21,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 15:32:28,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:32:28,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:32:29,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:32:30,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1316526.6666666667, ans=0.1 2023-10-03 15:32:33,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.65 vs. limit=10.0 2023-10-03 15:32:35,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:37,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 15:32:37,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 15:32:37,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:32:40,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1316593.3333333333, ans=0.125 2023-10-03 15:32:41,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 15:32:42,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:32:42,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:42,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1316593.3333333333, ans=0.125 2023-10-03 15:32:45,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:32:45,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:32:50,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 15:32:50,156 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 15:32:51,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. 
Duration: 0.436375 2023-10-03 15:32:51,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 15:32:52,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-10-03 15:32:54,238 INFO [train.py:1046] (1/4) Epoch 38, batch 950, loss[loss=0.1674, simple_loss=0.2597, pruned_loss=0.03754, over 24572.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2398, pruned_loss=0.04004, over 4671885.88 frames. ], batch size: 71, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:32:54,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:57,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 15:33:03,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:04,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:04,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:06,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:33:09,546 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 15:33:11,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:33:12,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:33:12,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:12,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:33:13,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 15:33:13,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:33:15,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:16,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 15:33:16,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:33:21,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:21,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:33:21,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:33:22,509 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.928e+02 2.082e+02 2.343e+02 3.541e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 15:33:23,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 15:33:25,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:33:28,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:33:30,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:33:34,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:33:34,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:39,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 15:33:42,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 15:33:42,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:33:42,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:33:42,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:33:42,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:33:46,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 15:33:46,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:33:49,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:33:50,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:50,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 15:33:50,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:50,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:33:50,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1316860.0, ans=0.0 2023-10-03 15:33:51,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 15:33:52,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1316926.6666666667, ans=0.0 2023-10-03 15:33:57,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 15:33:59,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:34:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:34:05,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 15:34:05,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 15:34:08,418 INFO [train.py:1046] (1/4) Epoch 38, batch 1000, loss[loss=0.1449, simple_loss=0.2254, pruned_loss=0.03217, over 21065.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2384, pruned_loss=0.04013, over 4674359.80 frames. ], batch size: 46, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:34:08,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:34:13,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 15:34:13,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:18,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:34:20,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 15:34:20,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. 
Duration: 0.93 2023-10-03 15:34:24,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:24,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:34:26,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:27,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.08 vs. limit=15.0 2023-10-03 15:34:27,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 15:34:30,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 15:34:32,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 15:34:33,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:34:35,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 15:34:36,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 15:34:36,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 15:34:38,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:34:38,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:45,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:46,030 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:34:47,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:34:48,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:48,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:48,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 15:34:48,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:34:48,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1317126.6666666667, ans=0.125 2023-10-03 15:34:49,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:34:49,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:51,121 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. 
Duration: 0.982 2023-10-03 15:34:52,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1317193.3333333333, ans=0.125 2023-10-03 15:34:54,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 15:34:54,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1317193.3333333333, ans=0.0 2023-10-03 15:34:55,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 15:34:56,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 15:34:59,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:35:05,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:05,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:35:06,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:06,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:35:06,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-10-03 15:35:06,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1317260.0, ans=0.125 2023-10-03 15:35:08,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:35:09,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 15:35:10,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1317260.0, ans=0.0 2023-10-03 15:35:11,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 15:35:12,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:35:12,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:35:12,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1317260.0, ans=0.09899494936611666 2023-10-03 15:35:13,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:35:15,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:35:16,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:35:20,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 15:35:22,126 INFO [train.py:1046] (1/4) Epoch 38, batch 1050, loss[loss=0.1609, simple_loss=0.2511, pruned_loss=0.03532, over 24575.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.237, pruned_loss=0.03991, over 4672177.54 frames. ], batch size: 71, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:35:22,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:35:25,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:35:25,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:26,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=12.0 2023-10-03 15:35:27,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:35:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:35:30,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:35:34,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:35:34,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:35:34,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 15:35:37,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:35:37,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 15:35:39,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:35:39,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 15:35:41,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:35:41,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 15:35:41,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:35:47,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:48,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:35:48,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:35:49,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1317393.3333333333, ans=0.2 2023-10-03 15:35:50,133 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.904e+02 2.119e+02 2.413e+02 3.551e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-03 15:35:51,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 15:35:51,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 15:35:51,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:35:56,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 15:35:58,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 15:35:59,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:02,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 15:36:05,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 15:36:05,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:36:06,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 15:36:10,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:36:10,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1317526.6666666667, ans=0.125 2023-10-03 15:36:13,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 15:36:15,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 15:36:15,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 15:36:15,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:36:15,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:36:17,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 15:36:19,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:36:22,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:36:22,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:36:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 15:36:23,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:30,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:30,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 15:36:32,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:36:32,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 15:36:33,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 15:36:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:36:36,043 INFO [train.py:1046] (1/4) Epoch 38, batch 1100, loss[loss=0.175, simple_loss=0.2564, pruned_loss=0.04679, over 24018.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2366, pruned_loss=0.03967, over 4669561.03 frames. ], batch size: 80, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:36:38,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:36:39,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1317660.0, ans=0.125 2023-10-03 15:36:42,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:36:48,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:36:48,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:36:48,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:36:49,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 15:36:51,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:36:52,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:36:55,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:36:58,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:36:58,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 15:36:59,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:37:01,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:01,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 15:37:03,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:37:04,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:37:10,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:37:13,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 15:37:14,972 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 15:37:15,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:17,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:19,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:37:19,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:37:19,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 15:37:20,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:37:20,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 15:37:20,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:37:20,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:20,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 15:37:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:37:26,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1317860.0, ans=0.125 2023-10-03 15:37:27,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 15:37:29,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:37:30,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1317860.0, ans=0.125 2023-10-03 15:37:32,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:37:32,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1317860.0, ans=0.1 2023-10-03 15:37:35,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 15:37:35,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 15:37:37,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:37,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.75 vs. limit=15.0 2023-10-03 15:37:39,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:39,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:37:42,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 15:37:42,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:37:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:37:45,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 15:37:45,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:37:47,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 15:37:48,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:37:48,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 15:37:48,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:37:50,248 INFO [train.py:1046] (1/4) Epoch 38, batch 1150, loss[loss=0.1785, simple_loss=0.2588, pruned_loss=0.04912, over 24377.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.237, pruned_loss=0.04001, over 4648044.23 frames. ], batch size: 77, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:37:53,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:37:55,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:37:57,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:57,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:37:57,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 15:37:57,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:37:57,936 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.74 vs. limit=15.0 2023-10-03 15:38:01,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 15:38:02,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 15:38:02,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:38:08,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 15:38:10,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:38:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:38:12,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1318060.0, ans=0.125 2023-10-03 15:38:14,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:14,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 15:38:14,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:38:14,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:38:18,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.952e+02 2.181e+02 2.480e+02 4.023e+02, threshold=4.362e+02, percent-clipped=0.0 2023-10-03 15:38:18,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 15:38:20,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 15:38:22,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:38:24,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1318126.6666666667, ans=0.125 2023-10-03 15:38:33,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:40,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:40,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 15:38:41,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:43,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:47,656 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 15:38:47,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1318260.0, ans=0.1 2023-10-03 15:38:49,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:55,124 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 15:38:59,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:39:00,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:39:00,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:39:01,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:39:03,909 INFO [train.py:1046] (1/4) Epoch 38, batch 1200, loss[loss=0.1612, simple_loss=0.2436, pruned_loss=0.03937, over 24511.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2376, pruned_loss=0.03966, over 4680658.71 frames. ], batch size: 63, lr: 2.69e-03, grad_scale: 32.0 2023-10-03 15:39:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:09,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:39:10,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1318326.6666666667, ans=0.125 2023-10-03 15:39:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:39:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:12,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:12,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 15:39:16,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:39:16,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:39:18,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:19,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:39:20,348 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 15:39:23,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 15:39:25,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:39:27,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:39:29,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:32,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:39:32,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 15:39:32,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:39:40,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:39:40,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:39:41,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 15:39:43,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:39:44,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 15:39:48,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 15:39:48,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:39:51,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:39:52,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1318526.6666666667, ans=0.125 2023-10-03 15:39:53,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:39:54,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 15:39:54,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:39:55,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:39:57,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 15:39:57,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:39:57,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:39:57,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:39:59,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:59,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:40:04,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:40:07,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:40:08,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. 
Duration: 0.97 2023-10-03 15:40:13,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1318593.3333333333, ans=0.0 2023-10-03 15:40:16,046 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 15:40:16,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=22.5 2023-10-03 15:40:17,490 INFO [train.py:1046] (1/4) Epoch 38, batch 1250, loss[loss=0.1829, simple_loss=0.254, pruned_loss=0.05588, over 19431.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2382, pruned_loss=0.03961, over 4702437.49 frames. ], batch size: 42, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:40:17,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:40:19,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:40:20,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:40:22,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:40:22,906 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:40:23,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1318660.0, ans=0.0 2023-10-03 15:40:25,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-10-03 15:40:25,698 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:40:29,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:40:29,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:40:30,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 15:40:31,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1318726.6666666667, ans=0.1 2023-10-03 15:40:32,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:40:32,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1318726.6666666667, ans=0.125 2023-10-03 15:40:33,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:40:37,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:40:39,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:40:40,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:40:40,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:40:42,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:40:47,222 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.889e+02 2.073e+02 2.333e+02 3.437e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 15:40:47,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:40:47,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:40:47,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:40:49,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:40:49,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1318793.3333333333, ans=0.2 2023-10-03 15:40:50,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:40:52,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:40:52,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. 
Duration: 0.7 2023-10-03 15:40:53,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1318793.3333333333, ans=0.125 2023-10-03 15:40:58,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 15:40:58,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:41:00,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:41:01,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1318860.0, ans=0.1 2023-10-03 15:41:02,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 15:41:02,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:41:02,218 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 15:41:02,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:02,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:03,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1318860.0, ans=0.125 2023-10-03 15:41:06,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:41:10,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:41:10,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:41:13,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 15:41:13,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 15:41:13,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 15:41:15,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:41:16,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 15:41:16,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:19,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 15:41:19,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:41:20,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 15:41:22,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 15:41:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:41:22,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 15:41:23,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:41:25,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 15:41:27,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:41:27,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:41:29,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:41:32,046 INFO [train.py:1046] (1/4) Epoch 38, batch 1300, loss[loss=0.1547, simple_loss=0.2392, pruned_loss=0.03505, over 23302.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2387, pruned_loss=0.03941, over 4707002.43 frames. ], batch size: 105, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:41:33,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:41:36,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:41:36,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-10-03 15:41:41,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:41:42,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:41:43,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:41:44,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.60 vs. limit=15.0 2023-10-03 15:41:45,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:46,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:41:46,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 15:41:51,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:41:53,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:41:54,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. 
Duration: 0.88 2023-10-03 15:41:54,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1319060.0, ans=0.1 2023-10-03 15:41:57,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:42:01,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:01,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:42:02,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.41 vs. limit=15.0 2023-10-03 15:42:02,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:42:04,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:04,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:42:05,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:42:07,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 15:42:11,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 15:42:13,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:42:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 15:42:14,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:42:17,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:42:18,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:42:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 15:42:20,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:42:20,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 15:42:22,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:42:23,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1319193.3333333333, ans=0.0 2023-10-03 15:42:26,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:42:26,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 15:42:29,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 15:42:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 15:42:32,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 15:42:36,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:42:39,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 15:42:39,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:45,092 INFO [train.py:1046] (1/4) Epoch 38, batch 1350, loss[loss=0.1385, simple_loss=0.2211, pruned_loss=0.02795, over 24491.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2378, pruned_loss=0.039, over 4717333.08 frames. ], batch size: 63, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:42:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 15:42:48,916 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.82 vs. limit=10.0 2023-10-03 15:42:49,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:42:50,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1319326.6666666667, ans=0.125 2023-10-03 15:42:51,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:42:52,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.78 vs. limit=10.0 2023-10-03 15:42:54,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:54,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:42:55,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:42:55,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:43:00,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:43:01,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 15:43:03,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:43:03,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 15:43:06,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 15:43:07,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:43:08,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1319393.3333333333, ans=0.2 2023-10-03 15:43:09,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:43:09,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 15:43:09,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1319393.3333333333, ans=0.2 2023-10-03 15:43:11,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 15:43:13,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 15:43:14,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:14,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 15:43:16,101 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.825e+02 1.970e+02 2.155e+02 2.948e+02, threshold=3.940e+02, percent-clipped=0.0 2023-10-03 15:43:27,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 15:43:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:34,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:43:36,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 15:43:36,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1319526.6666666667, ans=0.07 2023-10-03 15:43:39,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:43:40,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 15:43:40,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:43:41,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:43:43,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:43:44,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 15:43:44,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1319593.3333333333, ans=0.125 2023-10-03 15:43:46,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 15:43:52,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 15:43:54,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 15:43:59,519 INFO [train.py:1046] (1/4) Epoch 38, batch 1400, loss[loss=0.1499, simple_loss=0.2202, pruned_loss=0.03978, over 23667.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2361, pruned_loss=0.03896, over 4702427.03 frames. ], batch size: 232, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:43:59,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 15:44:00,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:44:03,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:44:05,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:44:10,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 15:44:10,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 15:44:14,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1319726.6666666667, ans=0.05 2023-10-03 15:44:21,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 15:44:25,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:44:27,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:44:27,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:44:29,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:44:30,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 15:44:40,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:42,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1319860.0, ans=0.125 2023-10-03 15:44:44,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 15:44:44,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:44:45,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:44:46,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 15:44:47,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:44:48,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:44:48,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:44:48,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1319860.0, ans=0.0 2023-10-03 15:44:49,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:44:51,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 15:44:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:44:56,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:59,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:45:06,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 15:45:06,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:45:07,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:45:10,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 15:45:10,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:12,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:45:13,313 INFO [train.py:1046] (1/4) Epoch 38, batch 1450, loss[loss=0.1444, simple_loss=0.2319, pruned_loss=0.0284, over 24497.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2358, pruned_loss=0.03899, over 4704926.71 frames. ], batch size: 66, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:45:18,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:45:19,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:45:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:19,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 15:45:24,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:26,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 15:45:26,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1319993.3333333333, ans=0.1 2023-10-03 15:45:29,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:45:29,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 15:45:29,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:45:30,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 15:45:30,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:31,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:31,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 15:45:33,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:45:33,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:45:34,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 15:45:34,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 15:45:34,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:45:36,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:36,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1320060.0, ans=0.1 2023-10-03 15:45:37,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:40,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:45:40,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:45:42,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:43,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:45,229 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.843e+02 2.021e+02 2.311e+02 3.468e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 15:45:45,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:45,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:45:46,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:45:46,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:45:49,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 15:45:52,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:45:57,313 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 15:45:58,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:45:58,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:46:00,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:02,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.40 vs. limit=22.5 2023-10-03 15:46:02,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 15:46:06,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:08,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 15:46:09,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. 
Duration: 0.9 2023-10-03 15:46:11,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:13,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1320260.0, ans=0.2 2023-10-03 15:46:14,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:46:14,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:46:17,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 15:46:19,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 15:46:19,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 15:46:20,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:20,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:46:25,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1320260.0, ans=0.125 2023-10-03 15:46:28,094 INFO [train.py:1046] (1/4) Epoch 38, batch 1500, loss[loss=0.1562, simple_loss=0.2313, pruned_loss=0.04059, over 23421.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2361, pruned_loss=0.03895, over 4701150.60 frames. 
], batch size: 285, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:46:32,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 15:46:32,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:46:32,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:46:35,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:35,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:46:35,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1320326.6666666667, ans=0.0 2023-10-03 15:46:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:46:36,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1320326.6666666667, ans=0.0 2023-10-03 15:46:37,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 15:46:39,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:46:39,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-10-03 15:46:39,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:46:40,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:46:42,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:46:43,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:46:49,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:46:49,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 15:46:49,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:46:50,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:46:50,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:50,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1320393.3333333333, ans=0.2 2023-10-03 15:46:54,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 15:46:59,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. 
Duration: 0.76 2023-10-03 15:47:01,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:47:02,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.72 vs. limit=22.5 2023-10-03 15:47:02,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 15:47:04,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:47:05,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:47:05,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:47:06,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:07,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 15:47:08,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:47:08,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:47:09,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 15:47:09,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:47:12,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1320526.6666666667, ans=0.125 2023-10-03 15:47:13,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:47:13,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 15:47:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:47:20,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:47:20,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1320526.6666666667, ans=0.1 2023-10-03 15:47:24,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1320593.3333333333, ans=0.125 2023-10-03 15:47:26,004 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 15:47:27,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:27,848 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 15:47:29,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:47:31,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:47:31,159 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 15:47:32,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:47:35,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 15:47:36,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:47:40,799 INFO [train.py:1046] (1/4) Epoch 38, batch 1550, loss[loss=0.1696, simple_loss=0.2407, pruned_loss=0.04923, over 23676.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2366, pruned_loss=0.03885, over 4716752.85 frames. ], batch size: 256, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:47:40,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:40,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:47:40,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:40,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 15:47:41,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1320660.0, ans=0.125 2023-10-03 15:47:43,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 15:47:43,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 15:47:43,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:47:44,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 15:47:45,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 15:47:45,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1320660.0, ans=0.125 2023-10-03 15:47:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:48,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:49,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:47:49,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:47:51,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:47:51,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:54,107 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 15:47:54,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:47:54,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:47:55,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:47:57,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:47:57,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 15:47:59,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:59,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 15:48:00,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 15:48:00,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 15:48:00,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:48:01,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1320726.6666666667, ans=0.0 2023-10-03 15:48:03,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:08,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:48:11,708 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.894e+02 2.092e+02 2.413e+02 3.361e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 15:48:11,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 15:48:11,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 15:48:14,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1320793.3333333333, ans=0.0 2023-10-03 15:48:18,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:22,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:48:24,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:48:24,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:48:24,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. 
Duration: 0.41 2023-10-03 15:48:31,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:48:31,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:33,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1320860.0, ans=0.125 2023-10-03 15:48:34,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:48:37,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.61 vs. limit=22.5 2023-10-03 15:48:37,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:48:37,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:37,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 15:48:38,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:48:40,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:48:40,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:41,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. 
Duration: 0.57775 2023-10-03 15:48:41,650 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 15:48:41,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:48:48,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 15:48:49,303 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=12.0 2023-10-03 15:48:52,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:48:54,719 INFO [train.py:1046] (1/4) Epoch 38, batch 1600, loss[loss=0.1501, simple_loss=0.2345, pruned_loss=0.03285, over 24308.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2381, pruned_loss=0.03925, over 4714608.53 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:48:54,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:54,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 15:48:56,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:48:57,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:48:57,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 15:48:57,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:48:57,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:49:02,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:02,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 15:49:03,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1320993.3333333333, ans=0.2 2023-10-03 15:49:04,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 15:49:05,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 15:49:08,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:49:09,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 15:49:11,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:49:12,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:49:17,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 15:49:19,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 15:49:21,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:49:21,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 15:49:22,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:22,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 15:49:27,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 15:49:36,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:49:36,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1321126.6666666667, ans=0.0 2023-10-03 15:49:37,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 15:49:37,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:49:38,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:49:38,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 15:49:40,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 15:49:44,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 15:49:46,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:49:46,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1321193.3333333333, ans=0.125 2023-10-03 15:49:47,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:47,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:48,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:49:51,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:49:51,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:49:53,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:49:59,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:50:01,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:50:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 15:50:02,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:50:03,588 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-10-03 15:50:05,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 15:50:08,132 INFO [train.py:1046] (1/4) Epoch 38, batch 1650, loss[loss=0.1522, simple_loss=0.2239, pruned_loss=0.04031, over 18804.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2389, pruned_loss=0.03965, over 4702901.10 frames. ], batch size: 40, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:50:11,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:11,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:50:12,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:50:12,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 15:50:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-10-03 15:50:12,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 15:50:12,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 15:50:13,277 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:50:15,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:50:17,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:50:17,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:50:17,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:50:19,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:19,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1321326.6666666667, ans=0.2 2023-10-03 15:50:22,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 15:50:24,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:50:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 15:50:24,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:50:24,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:50:25,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 15:50:25,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 15:50:31,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:50:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:50:41,111 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.937e+02 2.128e+02 2.357e+02 3.873e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 15:50:43,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 15:50:44,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1321460.0, ans=0.0 2023-10-03 15:50:45,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:46,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 15:50:49,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:50:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:50:51,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:50:52,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:50:53,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:50:53,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1321526.6666666667, ans=0.0 2023-10-03 15:50:54,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:55,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:55,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:55,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:50:55,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:50:57,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:50:57,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:51:02,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:51:02,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 15:51:05,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:51:06,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 15:51:06,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 15:51:06,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 15:51:06,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:08,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:51:08,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:51:09,186 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.72 vs. limit=22.5 2023-10-03 15:51:09,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:51:09,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 15:51:12,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:51:15,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:51:15,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:51:18,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 15:51:22,530 INFO [train.py:1046] (1/4) Epoch 38, batch 1700, loss[loss=0.1482, simple_loss=0.2137, pruned_loss=0.04137, over 23608.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2386, pruned_loss=0.03989, over 4704821.36 frames. ], batch size: 256, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:51:22,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:51:22,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:51:22,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 15:51:22,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1321660.0, ans=0.1 2023-10-03 15:51:23,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 15:51:23,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:51:23,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:51:25,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:51:25,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:51:25,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 15:51:28,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:51:38,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:51:39,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1321726.6666666667, ans=0.125 2023-10-03 15:51:40,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:51:46,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:51:46,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 15:51:46,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:51:47,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:51:50,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 15:51:51,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:51:51,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:53,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.12 vs. limit=15.0 2023-10-03 15:51:53,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:51:55,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:51:56,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 15:51:56,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 15:51:58,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:52:00,030 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:52:01,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 15:52:01,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1321793.3333333333, ans=0.0 2023-10-03 15:52:02,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:52:07,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1321860.0, ans=0.07 2023-10-03 15:52:11,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:11,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:13,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:52:14,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:52:14,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 15:52:14,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:52:17,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:52:17,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 15:52:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:52:17,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:17,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:18,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:20,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:20,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:52:22,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:22,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:52:22,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:22,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1321926.6666666667, ans=0.0 2023-10-03 15:52:26,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 15:52:26,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1321926.6666666667, ans=0.125 2023-10-03 15:52:27,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 15:52:29,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:31,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:52:34,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 15:52:34,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1321926.6666666667, ans=0.125 2023-10-03 15:52:37,193 INFO [train.py:1046] (1/4) Epoch 38, batch 1750, loss[loss=0.1605, simple_loss=0.2428, pruned_loss=0.0391, over 23340.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2371, pruned_loss=0.03972, over 4697332.16 frames. ], batch size: 119, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:52:40,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:41,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:41,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:52:43,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-03 15:52:43,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:46,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:52:46,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:47,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1321993.3333333333, ans=10.0 2023-10-03 15:52:48,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1321993.3333333333, ans=0.1 2023-10-03 15:52:50,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 15:52:53,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:54,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 15:52:56,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:56,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 15:52:59,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1322060.0, ans=0.125 2023-10-03 15:53:00,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 15:53:00,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 15:53:00,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1322060.0, ans=0.0 2023-10-03 15:53:03,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:53:03,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 15:53:08,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=1322126.6666666667, ans=0.025 2023-10-03 15:53:09,931 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.023e+02 2.283e+02 3.172e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-03 15:53:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:53:15,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:53:15,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 15:53:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:19,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:53:19,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:53:21,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:24,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:53:25,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:53:26,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 15:53:27,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-10-03 15:53:29,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:53:32,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 15:53:32,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:53:34,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 15:53:35,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:53:38,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:53:38,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:53:38,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:39,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1322260.0, ans=0.0 2023-10-03 15:53:41,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:53:43,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1322260.0, ans=0.1 2023-10-03 15:53:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:53:45,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:53:47,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:53:48,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 15:53:48,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 15:53:48,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:53:48,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:53:48,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:53:50,050 INFO [train.py:1046] (1/4) Epoch 38, batch 1800, loss[loss=0.1577, simple_loss=0.2469, pruned_loss=0.03422, over 24638.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2368, pruned_loss=0.03921, over 4719851.52 frames. ], batch size: 73, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:53:50,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:53:50,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:53:53,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:53:53,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:56,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:53:56,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1322326.6666666667, ans=0.125 2023-10-03 15:53:58,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 15:54:02,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 15:54:02,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:54:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:08,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:09,355 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.40 vs. limit=15.0 2023-10-03 15:54:10,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:11,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:54:12,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:54:12,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 15:54:12,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 15:54:13,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1322393.3333333333, ans=0.125 2023-10-03 15:54:15,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1322393.3333333333, ans=0.125 2023-10-03 15:54:16,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:19,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 15:54:22,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 15:54:24,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 15:54:24,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:25,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:25,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:54:25,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:54:32,848 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. 
Duration: 0.896 2023-10-03 15:54:33,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=1322526.6666666667, ans=0.02 2023-10-03 15:54:34,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:54:36,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:37,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 15:54:37,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 15:54:37,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:54:39,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:54:41,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:54:43,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.74 vs. limit=10.0 2023-10-03 15:54:46,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 15:54:46,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1322526.6666666667, ans=0.0 2023-10-03 15:54:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 15:54:52,495 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:54:53,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 15:54:53,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:54:53,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:55,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:54:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 15:54:59,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:54:59,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:02,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 15:55:02,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:03,819 INFO [train.py:1046] (1/4) Epoch 38, batch 1850, loss[loss=0.1626, simple_loss=0.2383, pruned_loss=0.0434, over 23757.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2375, pruned_loss=0.03931, over 4730325.50 frames. 
], batch size: 150, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:55:03,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:03,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:55:05,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:55:05,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:55:07,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:55:08,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:55:08,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:12,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:55:12,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:55:17,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:55:17,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. 
Duration: 0.89 2023-10-03 15:55:21,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 15:55:22,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1322726.6666666667, ans=0.2 2023-10-03 15:55:22,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1322726.6666666667, ans=0.2 2023-10-03 15:55:22,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1322726.6666666667, ans=0.1 2023-10-03 15:55:23,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 15:55:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:27,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 15:55:27,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 15:55:32,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1322793.3333333333, ans=0.125 2023-10-03 15:55:36,366 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.902e+02 2.104e+02 2.357e+02 3.020e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 15:55:38,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 15:55:39,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 15:55:42,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:55:42,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:55:46,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 15:55:47,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:47,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:55:48,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:55:50,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:55:53,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:56,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:55:57,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:57,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-03 15:55:57,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:59,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:56:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:56:02,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1322926.6666666667, ans=15.0 2023-10-03 15:56:04,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 15:56:04,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:56:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:56:09,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:56:09,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 15:56:09,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 15:56:11,448 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 15:56:12,806 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-03 15:56:15,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:56:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:56:15,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:56:15,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:15,565 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 15:56:15,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:56:15,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:17,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:56:18,356 INFO [train.py:1046] (1/4) Epoch 38, batch 1900, loss[loss=0.1519, simple_loss=0.2265, pruned_loss=0.03863, over 24462.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.238, pruned_loss=0.03963, over 4733566.37 frames. ], batch size: 58, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:56:18,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:56:18,916 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.56 vs. 
limit=22.5 2023-10-03 15:56:19,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:56:19,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 15:56:21,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:21,258 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 15:56:21,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:56:22,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:56:28,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:56:30,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:56:31,435 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 15:56:31,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 15:56:32,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:56:33,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:56:33,018 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 15:56:34,341 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 15:56:36,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1323060.0, ans=0.2 2023-10-03 15:56:37,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 15:56:38,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:56:42,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 15:56:43,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 15:56:48,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1323126.6666666667, ans=0.125 2023-10-03 15:56:51,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1323126.6666666667, ans=0.2 2023-10-03 15:56:52,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 15:56:55,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 15:56:55,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:56:57,037 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 15:56:57,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 15:56:57,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 15:56:58,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 15:56:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:02,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 15:57:04,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1323193.3333333333, ans=0.0 2023-10-03 15:57:05,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:57:07,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:57:07,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 15:57:08,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:57:13,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. 
Duration: 0.82 2023-10-03 15:57:13,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:57:18,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:57:18,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:57:20,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:57:21,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:57:22,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:57:22,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 15:57:24,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:57:27,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:57:27,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:57:28,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:57:28,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 15:57:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:57:31,652 INFO [train.py:1046] (1/4) Epoch 38, batch 1950, loss[loss=0.1723, simple_loss=0.2383, pruned_loss=0.05314, over 23826.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2386, pruned_loss=0.03997, over 4722057.74 frames. ], batch size: 164, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:57:31,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:57:34,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:57:37,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:57:37,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:37,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:57:39,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1323326.6666666667, ans=0.04949747468305833 2023-10-03 15:57:40,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 15:57:40,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:57:40,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 15:57:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:45,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:57:46,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:57:46,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:48,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:57:52,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:57:52,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:57:52,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:57:52,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:56,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:59,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:57:59,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:57:59,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:57:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 15:58:01,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:58:01,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:58:02,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:04,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:58:05,335 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 1.976e+02 2.296e+02 2.556e+02 3.551e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 15:58:06,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:58:10,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:58:13,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:58:15,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:58:15,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-10-03 15:58:15,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:58:18,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:58:19,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:58:20,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:58:28,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:28,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:33,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:37,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:58:37,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:37,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 15:58:37,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 15:58:39,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:58:41,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 15:58:44,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:58:45,752 INFO [train.py:1046] (1/4) Epoch 38, batch 2000, loss[loss=0.1804, simple_loss=0.2462, pruned_loss=0.05733, over 22725.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2391, pruned_loss=0.04007, over 4713818.35 frames. ], batch size: 322, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 15:58:47,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:58:47,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:58:48,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1323660.0, ans=0.1 2023-10-03 15:58:49,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:58:51,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:58:53,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:54,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. 
Duration: 0.94 2023-10-03 15:58:56,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:58:58,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:59:00,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 15:59:02,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:59:02,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:59:04,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:59:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 15:59:07,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:08,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:08,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:10,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 15:59:10,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 15:59:12,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 15:59:12,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:59:16,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:59:18,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:59:18,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:59:20,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:59:21,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 15:59:24,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 15:59:24,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:59:24,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 15:59:30,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1323860.0, ans=0.125 2023-10-03 15:59:31,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:59:31,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:59:33,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:59:34,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:59:35,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:36,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:59:36,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:37,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:40,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:59:40,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-10-03 15:59:42,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1323860.0, ans=0.1 2023-10-03 15:59:45,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1323926.6666666667, ans=0.125 2023-10-03 15:59:46,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:59:47,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:50,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:51,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:59:53,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:56,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:59:56,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:57,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:59:57,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 15:59:59,295 INFO [train.py:1046] (1/4) Epoch 38, batch 2050, loss[loss=0.145, simple_loss=0.2395, pruned_loss=0.02531, over 24642.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2376, pruned_loss=0.03941, over 4719754.44 frames. ], batch size: 73, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:00:00,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:01,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:01,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1323993.3333333333, ans=0.1 2023-10-03 16:00:04,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:00:05,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:09,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:00:10,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:00:11,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:12,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:00:14,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-10-03 16:00:15,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:00:16,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:00:16,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:00:21,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.25 vs. limit=22.5 2023-10-03 16:00:26,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:00:26,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:27,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 16:00:28,840 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:00:29,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:29,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 16:00:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:00:32,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:00:34,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1324126.6666666667, ans=0.1 2023-10-03 16:00:35,355 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.904e+02 2.086e+02 2.285e+02 3.176e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 16:00:35,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:00:36,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:00:37,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.67 vs. limit=22.5 2023-10-03 16:00:38,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:00:39,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:00:39,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:00:40,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:00:44,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:00:45,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1324193.3333333333, ans=0.125 2023-10-03 16:00:46,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:00:48,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1324193.3333333333, ans=0.1 2023-10-03 16:00:50,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:00:50,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:00:53,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:00:56,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1324193.3333333333, ans=0.125 2023-10-03 16:01:00,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:01:00,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 16:01:05,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:01:06,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 16:01:07,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:01:09,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 16:01:13,247 INFO [train.py:1046] (1/4) Epoch 38, batch 2100, loss[loss=0.1635, simple_loss=0.2339, pruned_loss=0.04657, over 23897.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2364, pruned_loss=0.03935, over 4709224.98 frames. ], batch size: 212, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:01:13,356 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 16:01:13,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:15,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:01:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:01:16,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:01:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 16:01:16,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 16:01:18,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 16:01:23,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:01:23,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:01:26,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:26,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:01:27,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 16:01:27,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:01:28,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 16:01:28,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 16:01:31,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:01:31,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:01:31,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 16:01:32,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 16:01:33,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1324393.3333333333, ans=0.2 2023-10-03 16:01:37,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 16:01:37,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:01:40,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:01:41,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:01:43,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:01:45,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 16:01:45,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:01:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 16:01:48,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 16:01:48,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1324460.0, ans=0.2 2023-10-03 16:01:49,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:01:49,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 16:01:49,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 16:01:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 16:01:52,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:01:53,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:01:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:01:57,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:01:58,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:01:59,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:01:59,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 16:01:59,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:01:59,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1324526.6666666667, ans=0.0 2023-10-03 16:02:00,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:02:00,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:00,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 16:02:02,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 16:02:03,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 16:02:06,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:02:09,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:02:09,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 16:02:10,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1324526.6666666667, ans=0.04949747468305833 2023-10-03 16:02:14,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.51 vs. limit=15.0 2023-10-03 16:02:16,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:02:17,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:02:19,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:02:19,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:02:19,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 16:02:19,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:02:22,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:22,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:02:23,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:02:23,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:25,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 16:02:26,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 16:02:26,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:02:28,744 INFO [train.py:1046] (1/4) Epoch 38, batch 2150, loss[loss=0.1475, simple_loss=0.2311, pruned_loss=0.03195, over 24466.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2359, pruned_loss=0.03929, over 4706003.47 frames. ], batch size: 63, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:02:30,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:02:30,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:02:30,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:02:31,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:02:36,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 16:02:38,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:02:40,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:41,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:02:41,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:42,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 16:02:45,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:46,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:02:46,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:02:46,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1324726.6666666667, ans=0.1 2023-10-03 16:02:50,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:50,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 16:02:56,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:02:56,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:02:57,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:57,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:02:59,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:02:59,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:03:00,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:03:00,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:03:00,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:03:02,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 16:03:04,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.852e+02 2.034e+02 2.219e+02 3.109e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 16:03:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:03:05,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:05,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:05,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1324793.3333333333, ans=0.0 2023-10-03 16:03:06,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 16:03:08,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:03:10,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:10,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:03:12,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:12,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 16:03:12,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:03:15,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:03:17,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:17,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:03:18,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:03:19,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:19,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:03:19,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 16:03:23,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 16:03:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:03:24,550 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 16:03:24,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1324860.0, ans=0.125 2023-10-03 16:03:25,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:25,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:03:26,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 16:03:27,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:03:27,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 16:03:27,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 16:03:27,295 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. 
Duration: 0.896 2023-10-03 16:03:27,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 16:03:28,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:30,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:03:30,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:03:31,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:33,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:03:34,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:34,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:42,157 INFO [train.py:1046] (1/4) Epoch 38, batch 2200, loss[loss=0.1468, simple_loss=0.2335, pruned_loss=0.03, over 24662.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2358, pruned_loss=0.03896, over 4711603.59 frames. ], batch size: 65, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:03:42,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:03:42,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. 
Duration: 0.83 2023-10-03 16:03:46,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:03:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:49,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:03:49,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:50,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-10-03 16:03:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:03:53,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:54,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:54,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 16:03:58,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 16:04:01,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:04:06,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. 
Duration: 0.77 2023-10-03 16:04:06,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1325060.0, ans=0.125 2023-10-03 16:04:10,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:11,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:04:13,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:04:15,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1325126.6666666667, ans=0.125 2023-10-03 16:04:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:04:18,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 16:04:18,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1325126.6666666667, ans=0.125 2023-10-03 16:04:21,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:04:21,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:04:21,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1325126.6666666667, ans=0.1 2023-10-03 16:04:22,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 16:04:24,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:04:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:04:27,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:04:28,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:31,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 16:04:32,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:32,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 16:04:36,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:36,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:04:36,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:04:37,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:04:37,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1325193.3333333333, ans=0.1 2023-10-03 16:04:39,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:04:39,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:39,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:39,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:04:40,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:04:42,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:04:45,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 16:04:47,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:04:48,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 16:04:50,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1325260.0, ans=0.125 2023-10-03 16:04:51,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 16:04:53,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:04:53,187 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 16:04:54,042 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.17 vs. limit=10.0 2023-10-03 16:04:54,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:04:55,923 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 16:04:57,207 INFO [train.py:1046] (1/4) Epoch 38, batch 2250, loss[loss=0.1528, simple_loss=0.2336, pruned_loss=0.03606, over 23493.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2369, pruned_loss=0.03937, over 4702722.42 frames. ], batch size: 134, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:04:57,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:58,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:05:00,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:05:00,351 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. 
Duration: 0.9045625 2023-10-03 16:05:01,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:05:04,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:05:07,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:05:10,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:05:15,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:15,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:05:16,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:05:19,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 16:05:19,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:05:19,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:05:21,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 16:05:23,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:05:23,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:25,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:05:28,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:05:28,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:05:30,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:05:31,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 16:05:32,779 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.873e+02 2.067e+02 2.203e+02 2.954e+02, threshold=4.134e+02, percent-clipped=0.0 2023-10-03 16:05:32,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:34,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1325460.0, ans=0.125 2023-10-03 16:05:36,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:05:36,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1325460.0, ans=0.0 2023-10-03 16:05:38,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:05:41,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:05:41,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:05:41,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:05:42,474 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.67 vs. limit=22.5 2023-10-03 16:05:44,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1325526.6666666667, ans=6.0 2023-10-03 16:05:44,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:05:46,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:05:50,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:05:54,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-10-03 16:05:55,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1325593.3333333333, ans=0.0 2023-10-03 16:05:55,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1325593.3333333333, ans=0.2 2023-10-03 16:05:59,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:05:59,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:05:59,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:06:05,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:06:06,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:06:06,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 16:06:06,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:08,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:06:09,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 16:06:11,259 INFO [train.py:1046] (1/4) Epoch 38, batch 2300, loss[loss=0.1452, simple_loss=0.2186, pruned_loss=0.03587, over 15573.00 frames. 
], tot_loss[loss=0.1582, simple_loss=0.2374, pruned_loss=0.03955, over 4705175.84 frames. ], batch size: 33, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:06:12,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:06:14,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:19,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:19,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:06:22,103 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 16:06:23,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:29,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:06:29,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:06:29,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:06:29,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1325726.6666666667, ans=0.0 2023-10-03 16:06:30,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:06:30,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 16:06:31,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:06:33,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:06:33,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:06:34,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1325726.6666666667, ans=0.1 2023-10-03 16:06:39,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:06:41,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:06:44,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:06:44,317 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:06:50,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:06:50,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:53,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 16:06:56,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:07:00,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:07:00,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:07:01,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:07:01,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 16:07:05,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:07:05,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:05,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:05,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:07:05,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:07:05,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 16:07:05,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 16:07:06,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 16:07:06,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:07:06,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:06,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 16:07:07,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.20 vs. limit=22.5 2023-10-03 16:07:11,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1325926.6666666667, ans=0.1 2023-10-03 16:07:13,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1325926.6666666667, ans=0.125 2023-10-03 16:07:15,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:07:18,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:07:22,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:07:22,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 16:07:22,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:07:24,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:07:24,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:07:26,602 INFO [train.py:1046] (1/4) Epoch 38, batch 2350, loss[loss=0.1527, simple_loss=0.2402, pruned_loss=0.03264, over 24662.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2382, pruned_loss=0.03977, over 4699537.28 frames. ], batch size: 68, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:07:26,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:07:27,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 16:07:31,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=22.5 2023-10-03 16:07:33,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1325993.3333333333, ans=0.04949747468305833 2023-10-03 16:07:35,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:07:35,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 16:07:39,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. 
Duration: 0.98 2023-10-03 16:07:43,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:46,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:46,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:46,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:07:47,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:07:47,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 16:07:51,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:07:55,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 16:07:57,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:08:02,027 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.917e+02 2.142e+02 2.413e+02 3.614e+02, threshold=4.285e+02, percent-clipped=0.0 2023-10-03 16:08:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:08:02,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 16:08:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:08:04,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 16:08:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:08:07,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:08:07,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:08:09,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:08:10,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:08:13,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 16:08:13,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:08:15,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1326193.3333333333, ans=0.125 2023-10-03 16:08:16,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:08:16,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:08:17,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.34 vs. limit=15.0 2023-10-03 16:08:18,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 16:08:18,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1326193.3333333333, ans=0.1 2023-10-03 16:08:19,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:08:23,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 16:08:23,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:08:23,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1326193.3333333333, ans=0.125 2023-10-03 16:08:27,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 16:08:28,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 16:08:30,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:08:30,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:08:30,744 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. 
Duration: 0.924 2023-10-03 16:08:30,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1326260.0, ans=0.1 2023-10-03 16:08:31,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 16:08:34,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 16:08:36,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:08:40,331 INFO [train.py:1046] (1/4) Epoch 38, batch 2400, loss[loss=0.1696, simple_loss=0.2589, pruned_loss=0.04019, over 24070.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.238, pruned_loss=0.03965, over 4703464.79 frames. ], batch size: 80, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:08:40,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:08:40,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1326326.6666666667, ans=0.125 2023-10-03 16:08:42,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1326326.6666666667, ans=0.125 2023-10-03 16:08:43,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:08:46,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:08:47,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-10-03 16:08:47,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 16:08:48,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1326326.6666666667, ans=0.2 2023-10-03 16:08:54,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:08:54,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:08:56,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 16:08:58,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:08:59,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:08:59,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 16:08:59,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1326393.3333333333, ans=0.125 2023-10-03 16:09:04,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:06,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1326393.3333333333, ans=0.125 2023-10-03 16:09:07,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-03 16:09:09,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:09:14,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 16:09:15,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:09:17,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:22,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:09:22,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 16:09:22,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:09:25,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.34 vs. limit=15.0 2023-10-03 16:09:28,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1326526.6666666667, ans=0.5 2023-10-03 16:09:29,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:32,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:09:35,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:09:36,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.82 vs. limit=15.0 2023-10-03 16:09:36,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:09:36,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:09:36,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:09:36,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:37,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:09:37,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:09:41,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:09:43,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:09:43,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. 
Duration: 0.74 2023-10-03 16:09:44,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 16:09:47,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:09:47,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:49,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 16:09:49,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 16:09:49,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 16:09:49,257 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 16:09:51,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 16:09:53,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:09:54,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:55,237 INFO [train.py:1046] (1/4) Epoch 38, batch 2450, loss[loss=0.1577, simple_loss=0.2326, pruned_loss=0.0414, over 23421.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2362, pruned_loss=0.03964, over 4674907.96 frames. 
], batch size: 134, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:09:55,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:09:55,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1326660.0, ans=0.125 2023-10-03 16:09:56,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 16:09:57,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:58,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:10:01,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:10:01,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:10:06,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:06,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:06,580 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.34 vs. limit=12.0 2023-10-03 16:10:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 16:10:11,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:10:11,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:16,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:10:16,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:10:16,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:10:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 16:10:20,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:10:22,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1326726.6666666667, ans=0.125 2023-10-03 16:10:23,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:10:26,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:10:26,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 16:10:28,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:29,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:10:31,034 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.921e+02 2.165e+02 2.566e+02 3.578e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 16:10:31,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 16:10:32,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:10:38,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:40,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:41,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:10:41,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:10:42,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:42,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 16:10:43,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1326860.0, ans=0.1 2023-10-03 16:10:44,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 16:10:47,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:47,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:10:50,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:10:50,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:10:55,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:10:55,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 16:10:57,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:10:58,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:10:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 16:10:59,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:11:01,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:11:03,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:11:05,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:11:05,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:11:07,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1326926.6666666667, ans=0.0 2023-10-03 16:11:10,267 INFO [train.py:1046] (1/4) Epoch 38, batch 2500, loss[loss=0.1598, simple_loss=0.2425, pruned_loss=0.03858, over 23278.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2354, pruned_loss=0.03929, over 4667722.55 frames. ], batch size: 105, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:11:10,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 16:11:11,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:11:15,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1326993.3333333333, ans=0.0 2023-10-03 16:11:16,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:11:27,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 16:11:27,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:11:27,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:11:27,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 16:11:28,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1327060.0, ans=0.125 2023-10-03 16:11:35,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:11:35,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:11:36,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:11:37,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:11:38,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 16:11:39,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:41,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:11:43,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. 
Duration: 0.83 2023-10-03 16:11:43,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:43,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 16:11:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:11:47,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:11:49,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:11:51,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:11:51,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 16:11:53,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:11:54,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:55,346 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.73 vs. limit=15.0 2023-10-03 16:11:58,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:02,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:12:04,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:12:04,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1327193.3333333333, ans=0.0 2023-10-03 16:12:08,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:12:09,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 16:12:09,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:12:09,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:12:12,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:12:12,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:12:12,819 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 16:12:12,819 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 16:12:14,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 16:12:17,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:12:20,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 16:12:20,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 16:12:20,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:12:20,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 16:12:22,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 16:12:23,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1327326.6666666667, ans=0.2 2023-10-03 16:12:24,751 INFO [train.py:1046] (1/4) Epoch 38, batch 2550, loss[loss=0.1474, simple_loss=0.2381, pruned_loss=0.0283, over 24311.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2359, pruned_loss=0.03944, over 4680787.24 frames. ], batch size: 74, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:12:25,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1327326.6666666667, ans=0.125 2023-10-03 16:12:27,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:12:29,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:12:29,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 16:12:30,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:12:32,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 16:12:32,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:12:37,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 16:12:38,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:12:40,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:42,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:12:42,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 16:12:43,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:12:44,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:12:44,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:12:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 16:12:47,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 16:12:47,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:12:47,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:47,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 16:12:51,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.12 vs. limit=6.0 2023-10-03 16:12:53,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1327460.0, ans=0.125 2023-10-03 16:12:55,952 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-10-03 16:12:59,652 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.861e+02 2.049e+02 2.382e+02 3.363e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-03 16:13:03,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:13:07,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:07,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:13:07,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:13:08,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:13:14,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:13:17,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:13:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:13:17,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:13:17,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:13:19,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:13:22,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:22,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:27,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:13:27,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-03 16:13:27,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:13:27,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:29,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:13:29,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:13:31,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:13:35,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:13:38,558 INFO [train.py:1046] (1/4) Epoch 38, batch 2600, loss[loss=0.1592, simple_loss=0.2403, pruned_loss=0.039, over 23308.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2366, pruned_loss=0.03952, over 4686365.81 frames. ], batch size: 105, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:13:38,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:13:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 16:13:43,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 16:13:44,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:13:44,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. 
Duration: 0.896 2023-10-03 16:13:46,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 16:13:46,110 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 16:13:48,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:49,783 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.32 vs. limit=15.0 2023-10-03 16:13:50,457 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 16:13:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 16:13:51,975 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 16:13:54,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:13:56,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 16:13:56,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 16:13:58,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:13:58,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 16:14:02,132 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-10-03 16:14:02,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 16:14:08,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:09,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:09,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:14:09,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 16:14:12,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:14:16,188 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.22 vs. limit=22.5 2023-10-03 16:14:17,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 16:14:21,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:21,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1327860.0, ans=0.2 2023-10-03 16:14:23,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:24,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. 
Duration: 0.88 2023-10-03 16:14:24,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:14:24,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:14:25,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 16:14:27,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1327860.0, ans=0.1 2023-10-03 16:14:28,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:14:28,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:14:31,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:14:34,553 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 16:14:36,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:14:36,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:14:39,538 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:14:40,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:14:41,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:14:41,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 16:14:43,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:44,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:14:44,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:14:46,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1327926.6666666667, ans=0.0 2023-10-03 16:14:51,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 16:14:52,590 INFO [train.py:1046] (1/4) Epoch 38, batch 2650, loss[loss=0.1548, simple_loss=0.2469, pruned_loss=0.03134, over 24651.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2373, pruned_loss=0.04003, over 4678492.49 frames. ], batch size: 68, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:14:52,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:54,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:14:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. 
Duration: 0.84 2023-10-03 16:14:58,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:59,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:14:59,883 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 16:14:59,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:02,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:15:04,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:15:06,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:15:08,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:15:10,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 16:15:10,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:15:10,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:15:12,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.32 vs. 
limit=12.0 2023-10-03 16:15:14,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 16:15:15,798 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 16:15:17,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:15:18,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-10-03 16:15:20,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 16:15:20,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:20,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1328126.6666666667, ans=0.125 2023-10-03 16:15:22,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 16:15:24,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:26,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:15:26,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:26,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:15:29,040 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.957e+02 2.182e+02 2.477e+02 3.538e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 16:15:29,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1328126.6666666667, ans=0.0 2023-10-03 16:15:30,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 16:15:30,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 16:15:30,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1328126.6666666667, ans=0.125 2023-10-03 16:15:33,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:15:34,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 16:15:36,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:36,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:38,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:15:38,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:15:38,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:15:41,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:41,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:15:41,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1328193.3333333333, ans=0.1 2023-10-03 16:15:44,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:15:44,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.82 vs. limit=15.0 2023-10-03 16:15:45,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:15:45,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:15:48,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:48,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:15:49,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:15:50,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:15:50,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:15:53,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:54,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:15:54,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:54,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 16:15:58,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:16:00,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:00,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:01,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:03,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:16:03,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:16:06,461 INFO [train.py:1046] (1/4) Epoch 38, batch 2700, loss[loss=0.1707, simple_loss=0.2487, pruned_loss=0.04636, over 23511.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2384, pruned_loss=0.04052, over 4682520.02 frames. ], batch size: 93, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:16:06,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:16:06,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 16:16:09,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:16:11,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 16:16:13,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:16:13,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:13,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:16,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:16:16,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:16:16,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 16:16:16,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:16:16,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 16:16:18,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:16:20,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:16:20,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:16:21,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:24,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:16:24,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 16:16:25,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:16:31,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:16:31,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:16:36,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 16:16:37,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:16:37,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:16:37,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:16:40,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:16:41,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1328460.0, ans=0.1 2023-10-03 16:16:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:16:43,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:16:43,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:16:47,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:47,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 16:16:53,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1328526.6666666667, ans=0.125 2023-10-03 16:16:53,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1328526.6666666667, ans=0.1 2023-10-03 16:16:57,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:16:58,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:17:01,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:17:01,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:04,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:17:06,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:06,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:17:07,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:08,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:17:10,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:17:11,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:17:13,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1328593.3333333333, ans=0.125 2023-10-03 16:17:14,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:17:14,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:17:17,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 16:17:18,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:20,112 INFO [train.py:1046] (1/4) Epoch 38, batch 2750, loss[loss=0.1305, simple_loss=0.1948, pruned_loss=0.03308, over 23407.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2381, pruned_loss=0.03981, over 4708794.57 frames. ], batch size: 285, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:17:20,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1328660.0, ans=0.2 2023-10-03 16:17:21,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:17:21,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-10-03 16:17:24,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 16:17:24,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:26,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:26,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:29,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:29,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:17:29,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:31,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:17:31,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:17:33,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:17:33,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:33,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-10-03 16:17:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:17:34,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:38,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 16:17:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:17:40,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:41,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:17:41,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:17:42,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:44,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:17:44,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:44,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:17:44,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1328726.6666666667, ans=0.025 2023-10-03 16:17:48,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:17:49,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1328793.3333333333, ans=0.1 2023-10-03 16:17:51,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:17:51,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:17:52,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:54,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:17:58,232 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.929e+02 2.191e+02 2.513e+02 4.361e+02, threshold=4.383e+02, percent-clipped=0.0 2023-10-03 16:17:58,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:59,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:18:01,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 16:18:01,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1328793.3333333333, ans=0.1 2023-10-03 16:18:04,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:18:04,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:18:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:18:11,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:18:12,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:18:12,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 16:18:15,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:18,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 16:18:25,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:18:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:18:27,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. 
Duration: 0.52 2023-10-03 16:18:29,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:18:31,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:18:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 16:18:31,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:18:34,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 16:18:34,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:34,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:18:35,424 INFO [train.py:1046] (1/4) Epoch 38, batch 2800, loss[loss=0.1569, simple_loss=0.2378, pruned_loss=0.03796, over 24361.00 frames. ], tot_loss[loss=0.158, simple_loss=0.237, pruned_loss=0.03952, over 4703244.38 frames. ], batch size: 77, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:18:35,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 16:18:36,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:18:36,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 16:18:38,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:18:39,508 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 16:18:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 16:18:40,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:43,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:18:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:18:46,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:18:47,121 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.60 vs. limit=15.0 2023-10-03 16:18:51,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 16:18:51,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 16:18:52,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 16:18:54,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:18:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:18:55,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:18:58,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:18:58,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:58,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:18:59,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:19:06,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1329126.6666666667, ans=0.125 2023-10-03 16:19:08,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:19:10,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:19:12,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:12,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:19:14,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:19:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 16:19:20,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:22,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:19:22,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:19:26,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:26,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:30,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:19:32,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:19:32,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:32,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 16:19:33,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:19:34,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:19:35,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:19:35,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 16:19:35,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:19:35,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:19:35,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:19:38,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 16:19:39,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:39,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:19:40,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:19:41,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-10-03 16:19:47,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:19:47,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:19:47,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:19:47,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1329326.6666666667, ans=0.0 2023-10-03 16:19:48,412 INFO [train.py:1046] (1/4) Epoch 38, batch 2850, loss[loss=0.1662, simple_loss=0.2558, pruned_loss=0.03829, over 24311.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.236, pruned_loss=0.03914, over 4692192.33 frames. ], batch size: 74, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:19:48,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:19:52,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:19:54,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:19:54,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:54,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1329326.6666666667, ans=0.125 2023-10-03 16:19:58,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:19:58,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:59,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:20:01,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 16:20:07,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 16:20:07,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:10,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 16:20:11,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:14,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 16:20:14,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 16:20:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:20:17,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1329460.0, ans=0.0 2023-10-03 16:20:21,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1329460.0, ans=0.125 2023-10-03 16:20:26,108 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.929e+02 2.175e+02 2.437e+02 3.531e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 16:20:26,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:20:27,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:20:27,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1329460.0, ans=0.0 2023-10-03 16:20:27,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1329460.0, ans=0.1 2023-10-03 16:20:28,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:20:29,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.28 vs. limit=22.5 2023-10-03 16:20:30,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:20:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 16:20:30,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:20:31,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:20:33,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 16:20:35,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:20:35,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:20:36,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:20:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:38,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:20:39,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:20:40,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:42,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:20:45,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:20:45,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:46,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:47,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:20:50,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:20:52,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 16:20:53,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 16:20:55,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:20:55,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:20:55,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 16:20:57,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:20:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:20:58,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:20:58,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:20:58,590 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 16:20:58,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 16:20:58,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:20:59,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:02,753 INFO [train.py:1046] (1/4) Epoch 38, batch 2900, loss[loss=0.1398, simple_loss=0.2229, pruned_loss=0.02838, over 24312.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.236, pruned_loss=0.03908, over 4684929.21 frames. ], batch size: 61, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:21:04,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:21:04,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:21:06,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:21:06,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 16:21:09,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:21:10,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 16:21:11,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 16:21:13,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:21:13,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:21:14,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:21:15,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:21:19,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:21:19,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:21:23,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:21:23,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 16:21:25,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:21:26,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:21:28,129 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:21:29,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 16:21:29,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1329726.6666666667, ans=0.125 2023-10-03 16:21:30,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 16:21:34,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:21:34,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 16:21:34,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:21:35,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:21:35,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:21:37,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:21:39,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:43,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:21:44,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:21:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 16:21:46,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 16:21:46,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:21:49,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1329860.0, ans=0.125 2023-10-03 16:21:50,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:21:50,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1329860.0, ans=0.1 2023-10-03 16:21:51,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 16:21:53,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:21:58,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:22:04,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1329926.6666666667, ans=0.125 2023-10-03 16:22:06,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 16:22:07,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:22:08,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 16:22:11,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:11,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 16:22:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:22:13,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:22:15,794 INFO [train.py:1046] (1/4) Epoch 38, batch 2950, loss[loss=0.1569, simple_loss=0.2335, pruned_loss=0.04013, over 23752.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2369, pruned_loss=0.03895, over 4701184.01 frames. ], batch size: 179, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:22:18,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:22:19,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 16:22:21,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:22:21,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:22:22,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:22:24,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:22:26,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 16:22:26,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 16:22:27,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:22:27,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:22:34,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=15.0 2023-10-03 16:22:35,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:22:37,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:22:38,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:22:38,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:22:42,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:22:42,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:22:44,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:44,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:45,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:22:47,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 16:22:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 16:22:51,272 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 16:22:51,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:22:52,582 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.960e+02 2.141e+02 2.460e+02 3.177e+02, threshold=4.282e+02, percent-clipped=0.0 2023-10-03 16:22:52,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1330126.6666666667, ans=0.0 2023-10-03 16:22:53,882 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 16:22:53,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. 
Duration: 0.93 2023-10-03 16:22:55,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:22:57,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:22:57,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 16:22:57,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:22:59,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 16:23:01,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:23:01,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:23:04,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:23:05,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:23:05,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:07,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 16:23:07,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:23:09,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 16:23:14,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:14,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:23:14,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 16:23:14,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:23:16,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 16:23:18,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:23:20,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:23:20,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:23:20,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1330260.0, ans=0.0 2023-10-03 16:23:21,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:21,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-03 16:23:23,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:23:24,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:24,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:23:24,798 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:23:26,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:23:27,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:23:28,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:23:29,358 INFO [train.py:1046] (1/4) Epoch 38, batch 3000, loss[loss=0.1556, simple_loss=0.2469, pruned_loss=0.03214, over 24479.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.03905, over 4705253.35 frames. ], batch size: 66, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:23:29,359 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 16:23:36,756 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.2879, 2.3642, 3.2169, 2.1223], device='cuda:1') 2023-10-03 16:23:41,538 INFO [train.py:1078] (1/4) Epoch 38, validation: loss=0.3508, simple_loss=0.2758, pruned_loss=0.2129, over 1125622.00 frames. 
2023-10-03 16:23:41,538 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 16:23:41,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:41,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 16:23:43,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:45,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:23:45,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:23:48,756 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 16:23:48,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 16:23:51,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1330326.6666666667, ans=0.2 2023-10-03 16:23:52,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:23:52,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 16:23:53,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1330326.6666666667, ans=0.0 2023-10-03 16:23:54,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 16:23:54,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:23:55,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1330393.3333333333, ans=0.0 2023-10-03 16:24:01,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:24:01,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1330393.3333333333, ans=0.1 2023-10-03 16:24:02,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1330393.3333333333, ans=0.09899494936611666 2023-10-03 16:24:09,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:24:09,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1330393.3333333333, ans=0.125 2023-10-03 16:24:14,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 16:24:16,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 16:24:17,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:24:17,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:24:17,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:24:20,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:24:20,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 16:24:23,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 16:24:23,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:24:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:24:27,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:24:27,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:24:27,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:27,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:24:31,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:24:31,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:24:31,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:24:33,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:24:33,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1330526.6666666667, ans=0.125 2023-10-03 16:24:35,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 16:24:37,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:24:37,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:24:38,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:24:41,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:41,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:42,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. 
Duration: 0.488875 2023-10-03 16:24:42,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 16:24:42,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:24:42,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 16:24:44,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:24:45,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 16:24:49,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:24:49,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:24:50,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 16:24:50,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 16:24:50,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:24:52,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:24:54,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:24:54,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:24:54,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:24:54,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:24:56,285 INFO [train.py:1046] (1/4) Epoch 38, batch 3050, loss[loss=0.1378, simple_loss=0.2193, pruned_loss=0.02821, over 24593.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.0395, over 4714631.74 frames. ], batch size: 60, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:24:58,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 16:24:59,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:25:00,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:02,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:25:04,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:07,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 16:25:14,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-03 16:25:14,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 16:25:15,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:18,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:25:20,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:21,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:21,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:24,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:25:25,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:25:25,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:26,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:26,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:28,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:25:30,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:33,291 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.918e+02 2.115e+02 2.470e+02 3.368e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-03 16:25:33,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:33,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 16:25:33,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1330793.3333333333, ans=0.0 2023-10-03 16:25:34,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:34,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:25:38,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:25:38,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:25:38,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:25:40,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 16:25:42,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-10-03 16:25:43,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1330860.0, ans=0.2 2023-10-03 16:25:45,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:45,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:25:45,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1330860.0, ans=0.0 2023-10-03 16:25:45,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1330860.0, ans=0.125 2023-10-03 16:25:50,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.68 vs. limit=12.0 2023-10-03 16:25:52,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:54,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:25:54,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:55,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:25:55,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 16:25:55,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:25:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 16:25:58,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1330926.6666666667, ans=0.125 2023-10-03 16:25:59,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:25:59,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:00,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 16:26:00,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1330926.6666666667, ans=0.0 2023-10-03 16:26:03,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:26:07,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:26:10,304 INFO [train.py:1046] (1/4) Epoch 38, batch 3100, loss[loss=0.1669, simple_loss=0.2525, pruned_loss=0.04065, over 24648.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2384, pruned_loss=0.03945, over 4721861.26 frames. ], batch size: 68, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:26:10,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 16:26:12,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:26:13,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 16:26:16,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 16:26:17,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 16:26:19,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:26:19,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1330993.3333333333, ans=0.125 2023-10-03 16:26:20,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:26:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:20,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1330993.3333333333, ans=0.125 2023-10-03 16:26:23,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 16:26:25,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1331060.0, ans=0.125 2023-10-03 16:26:26,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:26:31,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 16:26:35,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:26:35,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:26:35,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:26:37,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 16:26:40,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:26:40,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 16:26:40,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:26:43,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:44,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 16:26:44,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:26:49,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:26:50,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 16:26:50,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 16:26:53,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:53,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:56,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:26:56,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:57,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:26:59,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:26:59,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:27:00,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:27:00,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:27:00,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:00,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 16:27:06,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:27:06,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 16:27:08,829 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.20 vs. limit=12.0 2023-10-03 16:27:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:27:09,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 16:27:10,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:10,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:11,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 16:27:17,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1331260.0, ans=0.0 2023-10-03 16:27:20,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. 
Duration: 0.96 2023-10-03 16:27:23,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:23,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:24,355 INFO [train.py:1046] (1/4) Epoch 38, batch 3150, loss[loss=0.1541, simple_loss=0.2349, pruned_loss=0.03669, over 24479.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.03914, over 4715807.22 frames. ], batch size: 63, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:27:25,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:27:25,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:27:27,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 16:27:28,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:28,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 16:27:30,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 16:27:30,705 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.64 vs. limit=12.0 2023-10-03 16:27:33,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:27:34,598 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 16:27:38,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 16:27:38,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:27:40,022 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 16:27:41,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 16:27:41,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 16:27:41,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1331393.3333333333, ans=0.1 2023-10-03 16:27:43,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 16:27:43,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 16:27:43,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:43,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:27:44,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:27:44,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 16:27:48,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:48,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:48,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:49,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:27:53,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1331460.0, ans=0.125 2023-10-03 16:27:54,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 16:27:54,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:27:56,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1331460.0, ans=0.1 2023-10-03 16:27:57,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:27:58,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:58,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-10-03 16:28:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 16:28:02,634 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.924e+02 2.139e+02 2.464e+02 3.251e+02, threshold=4.278e+02, percent-clipped=0.0 2023-10-03 16:28:02,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:28:02,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:28:02,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:28:04,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:28:04,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:28:06,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:28:06,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:28:07,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 16:28:08,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:28:08,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:28:10,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:28:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:28:10,963 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.84 vs. limit=22.5 2023-10-03 16:28:11,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 16:28:11,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:14,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 16:28:14,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:15,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-10-03 16:28:16,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 16:28:17,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 16:28:19,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:28:19,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:28:20,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 16:28:22,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 16:28:23,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:28:23,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1331593.3333333333, ans=0.2 2023-10-03 16:28:25,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:28:25,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:25,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:28:25,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1331593.3333333333, ans=0.0 2023-10-03 16:28:26,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1331593.3333333333, ans=0.0 2023-10-03 16:28:31,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:28:32,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:28:33,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 16:28:38,472 INFO [train.py:1046] (1/4) Epoch 38, batch 3200, loss[loss=0.1769, simple_loss=0.2639, pruned_loss=0.04494, over 24349.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2366, pruned_loss=0.03866, over 4720725.69 frames. ], batch size: 77, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:28:39,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:28:39,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:28:44,228 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:28:45,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:46,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:28:46,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 16:28:48,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:50,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:28:55,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:29:02,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:29:11,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 16:29:11,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:29:11,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1331793.3333333333, ans=0.0 2023-10-03 16:29:11,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1331793.3333333333, ans=0.125 2023-10-03 16:29:13,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.01 vs. limit=15.0 2023-10-03 16:29:17,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 16:29:18,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:29:21,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:29:21,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:29:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 16:29:26,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 16:29:26,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 16:29:28,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 16:29:30,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 16:29:33,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:29:39,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:29:39,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:29:39,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:29:40,811 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 16:29:40,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:29:42,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1331926.6666666667, ans=0.2 2023-10-03 16:29:44,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:29:45,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1331926.6666666667, ans=0.0 2023-10-03 16:29:46,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 16:29:48,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 16:29:48,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 16:29:50,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 16:29:53,179 INFO [train.py:1046] (1/4) Epoch 38, batch 3250, loss[loss=0.175, simple_loss=0.2477, pruned_loss=0.05119, over 23775.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2362, pruned_loss=0.03839, over 4719774.89 frames. ], batch size: 164, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:29:53,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:29:53,762 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.46 vs. limit=15.0 2023-10-03 16:29:54,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:29:54,947 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 16:29:56,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:29:56,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:29:56,407 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 16:30:00,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:30:03,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:30:12,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:30:12,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 16:30:13,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:13,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:30:13,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:30:15,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:30:15,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:30:18,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:30:19,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:30:19,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:19,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:30:22,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:24,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:30:26,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:26,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:26,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1332126.6666666667, ans=0.125 2023-10-03 16:30:27,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:29,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:30:29,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:30:33,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.003e+02 2.220e+02 2.605e+02 4.440e+02, threshold=4.440e+02, percent-clipped=1.0 2023-10-03 16:30:34,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 16:30:34,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:30:34,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:30:35,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:37,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:30:42,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:30:50,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:30:50,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:50,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 16:30:50,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 16:30:50,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:30:51,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:53,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 16:30:53,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 16:30:53,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:30:55,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:56,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:30:58,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 16:30:58,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:31:00,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:31:00,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:31:01,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-10-03 16:31:01,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:01,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1332260.0, ans=0.0 2023-10-03 16:31:03,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:31:03,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 16:31:06,740 INFO [train.py:1046] (1/4) Epoch 38, batch 3300, loss[loss=0.1527, simple_loss=0.2302, pruned_loss=0.03757, over 23304.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.237, pruned_loss=0.03869, over 4721228.33 frames. ], batch size: 119, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:31:06,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:31:06,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 16:31:08,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 16:31:10,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 16:31:10,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:15,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:31:15,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:31:17,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:19,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:31:19,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:31:20,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1332393.3333333333, ans=0.05 2023-10-03 16:31:21,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:24,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:31:26,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 16:31:28,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:31:28,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:31:28,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1332393.3333333333, ans=0.0 2023-10-03 16:31:31,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:32,088 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 16:31:32,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1332393.3333333333, ans=0.05 2023-10-03 16:31:33,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:31:33,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:31:34,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:31:34,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:31:34,901 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 16:31:37,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:38,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:31:41,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:31:41,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 16:31:43,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 16:31:43,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:44,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:31:46,340 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 16:31:47,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 16:31:49,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:31:52,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 16:31:53,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:31:55,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:31:56,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:31:58,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:31:59,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:59,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:59,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:32:01,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:32:01,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:01,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:32:03,180 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 16:32:04,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 16:32:06,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:32:07,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:32:07,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:08,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:32:08,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:10,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:32:11,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:11,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:32:11,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1332593.3333333333, ans=0.0 2023-10-03 16:32:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:14,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:32:16,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 16:32:18,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:18,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:20,875 INFO [train.py:1046] (1/4) Epoch 38, batch 3350, loss[loss=0.2071, simple_loss=0.2761, pruned_loss=0.06901, over 19688.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2378, pruned_loss=0.03889, over 4720229.12 frames. 
], batch size: 389, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:32:20,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:32:22,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:32:23,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:23,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:23,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:28,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:32:29,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:29,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:32:33,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:35,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:32:37,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:32:37,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:32:38,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 16:32:39,956 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 16:32:39,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:42,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 16:32:42,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 16:32:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:32:44,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:32:44,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:32:44,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 16:32:44,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.50 vs. limit=15.0 2023-10-03 16:32:45,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 16:32:45,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:32:47,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:49,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:50,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:50,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:32:53,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:32:54,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:54,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:32:55,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=22.5 2023-10-03 16:32:58,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:59,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 16:33:00,141 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 1.992e+02 2.171e+02 2.466e+02 3.497e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-03 16:33:00,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:33:00,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:03,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:05,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 16:33:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:33:06,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 16:33:06,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:33:07,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 16:33:08,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.71 vs. limit=15.0 2023-10-03 16:33:09,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:33:10,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:33:12,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1332860.0, ans=0.125 2023-10-03 16:33:16,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:18,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 16:33:18,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:33:20,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:33:22,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:33:23,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1332926.6666666667, ans=0.125 2023-10-03 16:33:27,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:33:28,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.12 vs. limit=22.5 2023-10-03 16:33:29,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 16:33:29,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 16:33:30,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:33:31,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:33:33,597 INFO [train.py:1046] (1/4) Epoch 38, batch 3400, loss[loss=0.1424, simple_loss=0.2203, pruned_loss=0.03223, over 24471.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.238, pruned_loss=0.03917, over 4721700.27 frames. ], batch size: 58, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:33:33,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 16:33:33,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:35,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 16:33:36,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:33:36,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:33:37,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1332993.3333333333, ans=0.0 2023-10-03 16:33:38,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:33:38,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 16:33:39,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 16:33:43,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 16:33:43,944 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 16:33:43,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:33:48,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:33:48,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:33:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:33:48,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:33:55,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:33:56,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 16:34:01,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:34:01,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 16:34:02,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:34:04,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:34:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:34:13,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 16:34:18,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1333193.3333333333, ans=0.0 2023-10-03 16:34:19,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:34:21,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:34:21,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 16:34:21,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:34:21,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:34:22,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:34:22,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 16:34:25,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:34:27,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1333193.3333333333, ans=0.125 2023-10-03 16:34:29,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:34:29,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:34:34,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:34:35,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 16:34:41,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:34:45,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 16:34:47,549 INFO [train.py:1046] (1/4) Epoch 38, batch 3450, loss[loss=0.1546, simple_loss=0.2331, pruned_loss=0.03803, over 24463.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.03917, over 4730075.32 frames. ], batch size: 58, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:34:51,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 16:34:52,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:34:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:34:54,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 16:34:54,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:34:58,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:35:03,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:35:03,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:05,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:35:05,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:07,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:13,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 16:35:16,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1333393.3333333333, ans=0.125 2023-10-03 16:35:17,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. 
Duration: 0.78 2023-10-03 16:35:17,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:35:17,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:35:19,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:24,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 16:35:26,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:35:28,780 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.894e+02 2.065e+02 2.341e+02 2.920e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-03 16:35:28,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:35:28,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:35:29,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1333460.0, ans=0.2 2023-10-03 16:35:30,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:35:31,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:35:33,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. 
Duration: 0.85 2023-10-03 16:35:33,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:35:33,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1333526.6666666667, ans=0.0 2023-10-03 16:35:34,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:35,769 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.62 vs. limit=12.0 2023-10-03 16:35:37,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:35:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 16:35:43,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:35:49,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:35:51,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:54,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:35:58,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:58,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:35:59,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:35:59,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:36:02,549 INFO [train.py:1046] (1/4) Epoch 38, batch 3500, loss[loss=0.158, simple_loss=0.242, pruned_loss=0.03695, over 24446.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2365, pruned_loss=0.03876, over 4731345.95 frames. ], batch size: 69, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:36:04,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:36:06,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:36:07,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 16:36:08,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:36:11,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:36:12,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1333660.0, ans=0.125 2023-10-03 16:36:14,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:36:14,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-10-03 16:36:20,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:36:21,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:36:23,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:36:23,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:36:24,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:36:24,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:36:25,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 16:36:28,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:29,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:36:29,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:36:34,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:36:36,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 16:36:36,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:36:38,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:36:41,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:36:42,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:44,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:36:44,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:36:45,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 16:36:45,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 16:36:47,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 16:36:48,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:36:48,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:36:50,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:36:50,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:36:53,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:36:54,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:36:59,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:00,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 16:37:00,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 16:37:00,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:03,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:37:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:37:06,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:37:08,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1333926.6666666667, ans=0.125 2023-10-03 16:37:09,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 16:37:09,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:37:10,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:37:12,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 16:37:13,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 16:37:14,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1333926.6666666667, ans=0.125 2023-10-03 16:37:16,912 INFO [train.py:1046] (1/4) Epoch 38, batch 3550, loss[loss=0.1636, simple_loss=0.2347, pruned_loss=0.04627, over 23768.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03847, over 4739722.45 frames. ], batch size: 212, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:37:16,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:17,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:37:18,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:37:18,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:19,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=11.65 vs. limit=15.0 2023-10-03 16:37:22,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:37:30,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:31,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1334060.0, ans=10.0 2023-10-03 16:37:32,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 16:37:33,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1334060.0, ans=0.1 2023-10-03 16:37:36,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:36,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:37:37,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:37,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:37:37,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:37:40,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:41,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:37:41,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:43,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:37:43,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:37:43,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1334060.0, ans=0.125 2023-10-03 16:37:47,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:37:47,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:49,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:37:49,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:50,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 16:37:50,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 16:37:50,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:50,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:52,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:37:57,017 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.030e+02 2.255e+02 2.585e+02 3.418e+02, threshold=4.510e+02, percent-clipped=0.0 2023-10-03 16:37:58,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:58,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:59,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:01,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 16:38:02,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:38:04,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 16:38:04,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 16:38:04,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1334193.3333333333, ans=0.2 2023-10-03 16:38:06,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:38:07,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:38:10,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 16:38:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:17,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:17,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 16:38:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:21,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:38:22,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-10-03 16:38:22,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1334260.0, ans=0.05 2023-10-03 16:38:22,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1334260.0, ans=0.2 2023-10-03 16:38:28,058 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:38:29,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 16:38:29,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:38:29,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:38:30,613 INFO [train.py:1046] (1/4) Epoch 38, batch 3600, loss[loss=0.1451, simple_loss=0.2301, pruned_loss=0.03, over 24658.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.03832, over 4742442.55 frames. ], batch size: 65, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:38:30,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:31,398 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.25 vs. limit=12.0 2023-10-03 16:38:32,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:32,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 16:38:35,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:38:36,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:38,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:38:39,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:38:39,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:39,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 16:38:42,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:38:42,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:45,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:38:48,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:38:49,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:38:49,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:38:49,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1334393.3333333333, ans=0.0 2023-10-03 16:38:50,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 16:38:50,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:38:53,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:54,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:38:56,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:56,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1334393.3333333333, ans=0.125 2023-10-03 16:38:57,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:38:57,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:38:58,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 16:39:02,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1334460.0, ans=0.1 2023-10-03 16:39:07,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:39:07,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:39:08,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 16:39:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:39:18,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:21,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:27,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:39:28,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:39:28,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 16:39:29,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 16:39:31,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 16:39:32,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:39:32,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 16:39:34,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 16:39:34,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:39:34,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:39:34,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:39:36,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 16:39:36,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 16:39:40,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:40,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 16:39:44,419 INFO [train.py:1046] (1/4) Epoch 38, batch 3650, loss[loss=0.16, simple_loss=0.2471, pruned_loss=0.03642, over 23452.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2369, pruned_loss=0.03865, over 4740941.92 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:39:46,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 16:39:48,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 16:39:54,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 16:39:54,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 16:39:54,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1334660.0, ans=0.5 2023-10-03 16:39:57,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:39:57,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:39:57,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:40:00,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:40:00,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:40:01,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 16:40:01,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:40:01,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:03,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-10-03 16:40:04,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:40:04,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:40:04,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:07,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:40:09,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 16:40:11,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 16:40:12,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:40:14,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 16:40:15,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:40:15,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:40:18,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1334793.3333333333, ans=0.125 2023-10-03 16:40:22,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 16:40:22,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1334793.3333333333, ans=0.125 2023-10-03 16:40:25,395 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.902e+02 2.048e+02 2.259e+02 3.256e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 16:40:25,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:25,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:40:26,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:40:26,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:40:28,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:40:31,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:33,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:40:33,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:40:34,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 16:40:35,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:37,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:40:41,358 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 16:40:42,107 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.27 vs. limit=15.0 2023-10-03 16:40:43,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1334926.6666666667, ans=0.0 2023-10-03 16:40:44,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:40:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:40:46,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:40:46,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:40:47,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:40:49,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:40:51,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1334926.6666666667, ans=0.1 2023-10-03 16:40:52,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 16:40:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:40:56,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:40:58,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:59,337 INFO [train.py:1046] (1/4) Epoch 38, batch 3700, loss[loss=0.1329, simple_loss=0.2118, pruned_loss=0.02698, over 24438.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2383, pruned_loss=0.0391, over 4729799.73 frames. ], batch size: 58, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:40:59,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:41:02,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:41:02,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 16:41:02,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:41:03,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-03 16:41:03,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:41:07,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:41:10,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:41:10,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:12,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:41:13,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:41:13,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:41:14,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:16,342 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 16:41:23,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:41:23,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:41:26,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 16:41:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 16:41:27,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:41:31,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:31,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 16:41:33,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:34,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:41:39,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:39,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:41:41,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:41:45,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:41:45,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 16:41:46,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:41:46,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 16:41:50,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:41:50,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:41:53,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:41:54,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 16:41:54,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1335193.3333333333, ans=0.1 2023-10-03 16:41:56,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:41:56,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:41:56,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:41:56,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:42:01,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 16:42:01,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1335260.0, ans=0.125 2023-10-03 16:42:02,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 16:42:04,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 16:42:05,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:42:05,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:07,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:42:08,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:42:11,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:42:11,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:42:12,976 INFO [train.py:1046] (1/4) Epoch 38, batch 3750, loss[loss=0.1732, simple_loss=0.2598, pruned_loss=0.0433, over 23962.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2393, pruned_loss=0.03961, over 4729014.12 frames. ], batch size: 80, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:42:13,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:42:14,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 16:42:15,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 16:42:18,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:42:18,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 16:42:18,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:42:20,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:20,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:21,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:42:24,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:42:31,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:42:31,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:42:32,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:42:36,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:42:36,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 16:42:38,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:42:38,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:42:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:42:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 16:42:45,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 16:42:45,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:42:45,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:42:47,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:42:53,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:42:54,483 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.979e+02 2.269e+02 2.767e+02 4.342e+02, threshold=4.539e+02, percent-clipped=2.0 2023-10-03 16:42:54,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:42:57,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 16:43:01,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:02,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1335526.6666666667, ans=0.07 2023-10-03 16:43:04,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:43:04,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:43:08,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:43:11,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:43:13,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:43:14,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 16:43:15,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:43:17,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:43:25,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:43:26,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1335660.0, ans=0.0 2023-10-03 16:43:27,555 INFO [train.py:1046] (1/4) Epoch 38, batch 3800, loss[loss=0.1753, simple_loss=0.242, pruned_loss=0.05435, over 23727.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2384, pruned_loss=0.03972, over 4711880.68 frames. ], batch size: 179, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:43:29,780 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:43:30,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:43:30,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:43:32,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 16:43:33,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:35,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:43:36,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:43:37,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 16:43:37,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:43:39,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:43:41,347 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.49 vs. limit=6.0 2023-10-03 16:43:42,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:42,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:43:43,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:43:45,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 16:43:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 16:43:49,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:43:51,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:43:53,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:43:56,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:43:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:43:58,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:43:58,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1335793.3333333333, ans=0.125 2023-10-03 16:44:00,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:00,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:44:01,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1335793.3333333333, ans=0.125 2023-10-03 16:44:04,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:44:04,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. 
Duration: 0.7 2023-10-03 16:44:07,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:44:13,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:44:18,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1335860.0, ans=0.1 2023-10-03 16:44:19,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:44:21,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 16:44:23,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 16:44:23,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:44:26,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:44:27,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 16:44:32,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 16:44:32,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. 
Duration: 0.65 2023-10-03 16:44:34,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:35,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:44:40,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:44:41,695 INFO [train.py:1046] (1/4) Epoch 38, batch 3850, loss[loss=0.133, simple_loss=0.1947, pruned_loss=0.0356, over 23406.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2372, pruned_loss=0.03936, over 4719768.26 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:44:41,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:44:46,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:44:46,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 16:44:47,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:44:49,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:52,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:44:53,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:44:55,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1336060.0, ans=0.07 2023-10-03 16:44:56,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:44:57,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 16:45:03,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:04,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:45:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:07,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:45:10,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:10,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:45:10,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:10,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:45:13,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:45:16,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:16,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:17,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:45:19,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 16:45:19,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 16:45:20,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:20,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:22,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:22,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:23,444 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.919e+02 2.123e+02 2.422e+02 3.840e+02, threshold=4.245e+02, percent-clipped=0.0 2023-10-03 16:45:23,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 16:45:26,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. 
Duration: 0.67 2023-10-03 16:45:27,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:29,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 16:45:33,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:45:37,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:39,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:43,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:43,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 16:45:46,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 16:45:47,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:49,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:50,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:45:50,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 16:45:52,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:53,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:53,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:45:53,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 16:45:53,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:55,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 16:45:55,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:56,391 INFO [train.py:1046] (1/4) Epoch 38, batch 3900, loss[loss=0.1442, simple_loss=0.2152, pruned_loss=0.03658, over 23649.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2366, pruned_loss=0.03911, over 4735250.83 frames. ], batch size: 232, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:45:56,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:59,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:45:59,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:45:59,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:46:00,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:46:00,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:46:01,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:46:01,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 16:46:02,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:04,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1336326.6666666667, ans=0.07 2023-10-03 16:46:07,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:46:09,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:46:09,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:46:09,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:46:12,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-03 16:46:12,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:46:15,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 16:46:16,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:46:17,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 16:46:17,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:19,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 16:46:21,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 16:46:26,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:46:26,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:46:26,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:46:28,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 16:46:32,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:46:35,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:46:37,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:46:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:46:39,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:46:45,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:46:45,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:46:45,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1336526.6666666667, ans=0.125 2023-10-03 16:46:50,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1336526.6666666667, ans=0.035 2023-10-03 16:46:51,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:46:53,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 16:46:55,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1336593.3333333333, ans=0.125 2023-10-03 16:47:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:47:03,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:47:03,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 16:47:05,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 16:47:05,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:47:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 16:47:08,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:47:08,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 16:47:11,657 INFO [train.py:1046] (1/4) Epoch 38, batch 3950, loss[loss=0.1512, simple_loss=0.2132, pruned_loss=0.04457, over 19219.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2358, pruned_loss=0.03902, over 4706641.43 frames. ], batch size: 388, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:47:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:47:17,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 16:47:19,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:47:20,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:47:20,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:47:22,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1336660.0, ans=0.125 2023-10-03 16:47:24,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1336660.0, ans=0.125 2023-10-03 16:47:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 16:47:26,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:47:26,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 16:47:28,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 16:47:28,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:47:32,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:47:32,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:47:32,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:47:33,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 16:47:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:47:37,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:47:37,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:47:38,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:47:38,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:47:49,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:47:49,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 16:47:53,692 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.916e+02 2.036e+02 2.336e+02 4.528e+02, threshold=4.072e+02, percent-clipped=1.0 2023-10-03 16:47:54,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1336793.3333333333, ans=0.2 2023-10-03 16:47:55,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 16:48:00,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 16:48:00,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 16:48:01,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:48:01,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:48:09,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:48:09,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:48:09,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:48:10,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:48:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-10-03 16:48:11,532 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.25 vs. limit=15.0 2023-10-03 16:48:15,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:48:15,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:48:17,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1336926.6666666667, ans=0.125 2023-10-03 16:48:18,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 16:48:21,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1336926.6666666667, ans=0.0 2023-10-03 16:48:25,528 INFO [train.py:1046] (1/4) Epoch 38, batch 4000, loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.03907, over 23741.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2365, pruned_loss=0.03909, over 4711347.72 frames. ], batch size: 232, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:48:27,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:34,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:38,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:48:40,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 16:48:40,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:41,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 16:48:42,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:48:43,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 16:48:43,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:48:43,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 16:48:45,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1337060.0, ans=0.125 2023-10-03 16:48:46,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:48:49,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:48:49,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:48:49,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:48:50,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:48:50,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 16:48:52,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:48:53,676 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 16:48:53,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:48:53,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1337126.6666666667, ans=0.0 2023-10-03 16:48:55,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:48:58,342 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 16:48:59,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:48:59,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:49:02,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1337126.6666666667, ans=0.5 2023-10-03 16:49:05,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1337126.6666666667, ans=0.125 2023-10-03 16:49:06,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-10-03 16:49:07,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:49:11,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:49:11,163 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 16:49:12,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:49:12,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 16:49:12,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:49:14,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:49:15,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:49:16,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1337193.3333333333, ans=0.125 2023-10-03 16:49:17,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:49:17,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:49:17,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:49:18,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 16:49:18,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:49:22,145 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 16:49:25,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1337260.0, ans=0.125 2023-10-03 16:49:28,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:49:29,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 16:49:32,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:49:33,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:49:33,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:49:34,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:49:37,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:49:39,161 INFO [train.py:1046] (1/4) Epoch 38, batch 4050, loss[loss=0.1505, simple_loss=0.2375, pruned_loss=0.03174, over 24464.00 frames. 
], tot_loss[loss=0.1575, simple_loss=0.2368, pruned_loss=0.03904, over 4720633.29 frames. ], batch size: 63, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:49:41,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:49:41,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 16:49:44,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:49:44,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:49:45,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:49:47,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:49:47,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:49:51,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:49:54,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:49:54,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:49:56,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 16:49:56,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1337393.3333333333, ans=0.1 2023-10-03 16:49:57,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:50:01,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:50:04,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:50:07,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 16:50:08,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 16:50:08,842 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 16:50:10,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:50:16,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 16:50:18,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:50:20,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 16:50:22,140 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.926e+02 2.112e+02 2.341e+02 3.122e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 16:50:24,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:50:24,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:50:24,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:50:28,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:50:31,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 16:50:31,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1337526.6666666667, ans=0.0 2023-10-03 16:50:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:50:34,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:50:34,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 16:50:38,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 16:50:40,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1337593.3333333333, ans=0.0 2023-10-03 16:50:44,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 16:50:47,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:50:47,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:50:50,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 16:50:50,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 16:50:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:50:51,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1337593.3333333333, ans=0.125 2023-10-03 16:50:52,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:50:53,512 INFO [train.py:1046] (1/4) Epoch 38, batch 4100, loss[loss=0.1681, simple_loss=0.2483, pruned_loss=0.04395, over 24054.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.238, pruned_loss=0.03953, over 4712586.38 frames. ], batch size: 86, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:50:53,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:50:54,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:51:01,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 16:51:02,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 16:51:03,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 16:51:05,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 16:51:05,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:05,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:06,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:06,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:51:06,791 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 16:51:09,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:51:10,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 16:51:10,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:12,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:51:16,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:51:18,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:51:18,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:51:20,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 16:51:20,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:20,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:51:20,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:51:20,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:51:21,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 16:51:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:51:26,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 16:51:26,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:51:29,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:51:29,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 16:51:29,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:51:30,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:51:30,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:51:32,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 16:51:33,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1337793.3333333333, ans=10.0 2023-10-03 16:51:33,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1337793.3333333333, ans=0.125 2023-10-03 16:51:34,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:51:34,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 16:51:39,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 16:51:39,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:39,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:51:41,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:51:46,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:51:49,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1337860.0, ans=0.1 2023-10-03 16:51:50,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.56 vs. limit=6.0 2023-10-03 16:51:51,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:51:51,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:59,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:51:59,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 16:52:03,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:52:05,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:52:07,681 INFO [train.py:1046] (1/4) Epoch 38, batch 4150, loss[loss=0.1894, simple_loss=0.2593, pruned_loss=0.05976, over 19800.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2378, pruned_loss=0.03925, over 4713190.51 frames. ], batch size: 388, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:52:09,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:52:10,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:52:10,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:52:10,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:52:13,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 16:52:13,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:52:13,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 16:52:13,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. 
Duration: 0.9 2023-10-03 16:52:14,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 16:52:16,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:52:17,680 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.31 vs. limit=15.0 2023-10-03 16:52:20,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:52:21,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:52:24,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:52:25,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:52:26,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:52:29,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:52:29,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:52:30,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 16:52:34,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:52:38,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:52:39,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 16:52:42,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 16:52:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:52:44,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 16:52:44,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:52:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:52:44,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1338126.6666666667, ans=0.0 2023-10-03 16:52:47,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:52:49,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:52:50,424 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.901e+02 2.086e+02 2.272e+02 3.701e+02, threshold=4.173e+02, percent-clipped=0.0 2023-10-03 16:52:50,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 16:52:55,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:52:57,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:52:57,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1338193.3333333333, ans=0.07 2023-10-03 16:52:58,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 16:52:58,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:53:00,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 16:53:01,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:53:03,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:53:04,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.58 vs. limit=15.0 2023-10-03 16:53:04,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 16:53:05,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 16:53:05,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:05,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:53:08,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:53:11,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 16:53:11,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:11,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:53:11,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:53:11,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1338260.0, ans=0.125 2023-10-03 16:53:12,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 16:53:12,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:53:12,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-03 16:53:12,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:53:14,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:14,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 16:53:15,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:53:22,139 INFO [train.py:1046] (1/4) Epoch 38, batch 4200, loss[loss=0.1618, simple_loss=0.2336, pruned_loss=0.045, over 23713.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2373, pruned_loss=0.03911, over 4709902.48 frames. ], batch size: 164, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:53:22,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:53:23,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 16:53:25,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:53:27,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:53:29,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:53:29,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:53:29,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:53:33,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 16:53:35,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 16:53:35,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:37,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:53:39,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:53:42,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:53:43,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:53:43,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:45,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 16:53:45,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:53:46,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 16:53:47,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:53:47,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:53:49,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:53:51,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 16:53:51,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:54,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:53:56,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:53:58,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:53:59,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:54:02,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:54:02,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 16:54:03,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:54:05,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:54:08,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:54:10,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:54:15,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:54:18,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 16:54:21,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:54:24,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:54:25,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:28,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 16:54:31,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:54:35,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:54:35,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 16:54:36,549 INFO [train.py:1046] (1/4) Epoch 38, batch 4250, loss[loss=0.153, simple_loss=0.2248, pruned_loss=0.04063, over 23497.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2362, pruned_loss=0.03876, over 4704903.65 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:54:36,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1338660.0, ans=0.0 2023-10-03 16:54:39,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:39,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1338660.0, ans=0.2 2023-10-03 16:54:45,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:54:46,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 16:54:46,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:54:49,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:50,285 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.61 vs. limit=22.5 2023-10-03 16:54:53,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:54:56,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=1338726.6666666667, ans=15.0 2023-10-03 16:54:57,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:54:57,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:54:59,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:54:59,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:55:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:02,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:04,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:04,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1338726.6666666667, ans=0.0 2023-10-03 16:55:05,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:55:06,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:55:08,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 16:55:10,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 16:55:10,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:12,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:55:12,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:13,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:55:13,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:14,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:18,904 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.873e+02 2.084e+02 2.323e+02 3.046e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-03 16:55:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:55:19,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 16:55:22,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1338860.0, ans=0.125 2023-10-03 16:55:24,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:55:26,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:26,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 16:55:26,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:55:28,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 16:55:28,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:55:30,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:55:31,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:31,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:55:31,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1338860.0, ans=0.0 2023-10-03 16:55:33,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. 
Duration: 0.96 2023-10-03 16:55:33,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1338860.0, ans=0.07 2023-10-03 16:55:36,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:55:36,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:55:37,395 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.15 vs. limit=15.0 2023-10-03 16:55:39,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:42,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:43,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:55:43,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:55:45,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:55:47,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:55:48,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:55:48,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-10-03 16:55:50,860 INFO [train.py:1046] (1/4) Epoch 38, batch 4300, loss[loss=0.1582, simple_loss=0.2367, pruned_loss=0.03984, over 23298.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2365, pruned_loss=0.03855, over 4718487.74 frames. ], batch size: 105, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:55:52,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:55:57,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:55:57,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:56:01,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:56:07,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:56:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 16:56:08,014 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.39 vs. limit=22.5 2023-10-03 16:56:08,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:56:11,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:56:11,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 16:56:11,606 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 16:56:16,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:56:17,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:56:19,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 16:56:21,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:56:21,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 16:56:21,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 16:56:23,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:56:23,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1339126.6666666667, ans=0.125 2023-10-03 16:56:26,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:56:26,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:56:26,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 16:56:26,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1339126.6666666667, ans=0.1 2023-10-03 16:56:28,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:56:29,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:56:29,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 16:56:30,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 16:56:32,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:56:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:35,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:56:35,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.94 vs. limit=15.0 2023-10-03 16:56:36,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:36,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 16:56:36,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 16:56:36,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 16:56:36,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 16:56:38,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:56:38,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 16:56:38,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 16:56:42,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:56:44,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 16:56:44,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:56:46,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:56:46,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:56:49,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-10-03 16:56:50,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:56:50,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:56:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:56:52,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:56:55,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:56:56,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:56:58,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:57:00,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 16:57:00,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1339260.0, ans=0.05 2023-10-03 16:57:00,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1339260.0, ans=0.125 2023-10-03 16:57:01,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1339260.0, ans=0.07 2023-10-03 16:57:01,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1339260.0, ans=0.0 2023-10-03 16:57:04,235 INFO [train.py:1046] (1/4) Epoch 38, batch 4350, loss[loss=0.1655, simple_loss=0.2401, pruned_loss=0.0454, over 23731.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2369, pruned_loss=0.03851, over 4731757.91 frames. ], batch size: 179, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:57:06,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 16:57:07,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:57:07,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1339326.6666666667, ans=0.1 2023-10-03 16:57:10,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:57:13,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:57:16,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 16:57:16,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:57:20,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:57:23,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:57:26,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:57:26,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:57:29,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:57:31,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:57:31,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:57:37,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 16:57:37,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:57:37,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:57:40,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1339460.0, ans=0.125 2023-10-03 16:57:41,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:57:44,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 16:57:44,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1339460.0, ans=0.2 2023-10-03 16:57:47,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:57:48,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.931e+02 2.144e+02 2.418e+02 3.398e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 16:57:48,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:57:54,517 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 16:57:54,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:57:54,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1339526.6666666667, ans=0.0 2023-10-03 16:57:56,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:57:56,720 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.83 vs. 
limit=22.5 2023-10-03 16:57:57,373 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 16:57:58,795 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 16:57:58,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:57:58,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:00,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:58:00,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:02,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:58:02,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:58:04,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 16:58:04,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:04,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:58:04,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 16:58:04,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 16:58:05,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1339593.3333333333, ans=0.0 2023-10-03 16:58:06,244 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 16:58:06,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 16:58:08,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 16:58:10,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:58:10,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:58:10,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:11,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1339593.3333333333, ans=0.1 2023-10-03 16:58:12,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:58:14,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 16:58:16,394 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. 
Duration: 0.768 2023-10-03 16:58:16,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:17,637 INFO [train.py:1046] (1/4) Epoch 38, batch 4400, loss[loss=0.1792, simple_loss=0.2488, pruned_loss=0.05483, over 22710.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2378, pruned_loss=0.03882, over 4731582.05 frames. ], batch size: 322, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:58:20,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:58:20,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:21,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:58:23,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 16:58:23,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 16:58:25,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 16:58:25,256 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 16:58:26,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:58:26,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 16:58:29,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 16:58:30,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:32,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:32,698 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 16:58:35,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:35,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 16:58:36,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 16:58:39,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 16:58:39,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 16:58:41,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 16:58:41,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:42,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 16:58:42,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:43,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:58:44,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1339726.6666666667, ans=0.0 2023-10-03 16:58:45,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 16:58:45,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 16:58:45,811 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.91 vs. limit=15.0 2023-10-03 16:58:46,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:46,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:58:46,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:49,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:49,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:49,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. 
Duration: 0.86 2023-10-03 16:58:52,707 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 16:58:54,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:59,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:59:02,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 16:59:06,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:59:08,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:59:12,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:59:12,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 16:59:12,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:59:13,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:59:13,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:59:13,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 16:59:16,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 16:59:16,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1339926.6666666667, ans=0.0 2023-10-03 16:59:19,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 16:59:20,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 16:59:20,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:59:20,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 16:59:22,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:59:25,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:59:27,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 16:59:30,601 INFO [train.py:1046] (1/4) Epoch 38, batch 4450, loss[loss=0.1895, simple_loss=0.2623, pruned_loss=0.05837, over 23359.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2392, pruned_loss=0.03938, over 4719309.30 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:59:32,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 16:59:35,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:36,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:59:37,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1339993.3333333333, ans=0.1 2023-10-03 16:59:37,555 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.13 vs. limit=6.0 2023-10-03 16:59:42,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:59:44,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:59:47,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:50,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:59:52,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:59:52,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:59:54,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-10-03 16:59:54,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:59:55,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:55,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:59:55,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:59:58,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:00:02,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:02,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:03,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:00:03,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:00:04,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-10-03 17:00:05,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:00:06,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1340126.6666666667, ans=0.025 2023-10-03 17:00:09,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 17:00:10,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 17:00:10,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 17:00:10,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:00:12,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:00:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 17:00:14,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1340193.3333333333, ans=0.2 2023-10-03 17:00:15,718 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.913e+02 2.140e+02 2.381e+02 4.249e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-03 17:00:19,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:00:22,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:22,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. 
Duration: 0.8 2023-10-03 17:00:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:22,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:00:22,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:00:22,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:00:24,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:27,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:00:27,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1340193.3333333333, ans=0.0 2023-10-03 17:00:28,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 17:00:29,681 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.55 vs. limit=15.0 2023-10-03 17:00:31,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:00:32,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:00:34,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:00:36,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:36,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 17:00:37,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:00:42,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 17:00:42,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:00:45,000 INFO [train.py:1046] (1/4) Epoch 38, batch 4500, loss[loss=0.16, simple_loss=0.2295, pruned_loss=0.04525, over 23838.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2389, pruned_loss=0.03944, over 4706252.74 frames. ], batch size: 212, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:00:47,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:00:48,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 17:00:48,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-10-03 17:00:48,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1340326.6666666667, ans=0.2 2023-10-03 17:00:51,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:00:55,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:55,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:00:56,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:00:56,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:00:58,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:00:58,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:01:09,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:01:09,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:01:12,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:01:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 17:01:13,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:01:22,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:01:22,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1340460.0, ans=0.0 2023-10-03 17:01:22,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1340460.0, ans=0.2 2023-10-03 17:01:26,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:01:30,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:01:33,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:01:33,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 17:01:35,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:35,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:01:37,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:01:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:01:38,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:01:38,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 17:01:38,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:01:38,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:43,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=15.0 2023-10-03 17:01:44,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:01:44,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:01:47,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:50,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:01:50,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:01:51,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-10-03 17:01:52,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=12.0 2023-10-03 17:01:53,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 17:01:53,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 17:01:56,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 17:01:58,551 INFO [train.py:1046] (1/4) Epoch 38, batch 4550, loss[loss=0.164, simple_loss=0.2538, pruned_loss=0.03715, over 24683.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.238, pruned_loss=0.03922, over 4702687.96 frames. ], batch size: 73, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:01:59,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 17:02:01,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:02:03,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:02:04,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:02:06,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:11,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:02:12,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:02:14,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:14,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:02:14,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:15,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:15,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:02:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:02:23,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 17:02:23,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1340726.6666666667, ans=0.0 2023-10-03 17:02:24,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 17:02:24,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 17:02:24,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1340726.6666666667, ans=0.2 2023-10-03 17:02:25,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.62 vs. limit=22.5 2023-10-03 17:02:26,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 17:02:30,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 17:02:30,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:02:33,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 17:02:34,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:02:35,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:35,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:36,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:02:38,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 17:02:40,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:02:41,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1340793.3333333333, ans=0.2 2023-10-03 17:02:42,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:42,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:02:44,104 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 1.957e+02 2.138e+02 2.462e+02 4.431e+02, threshold=4.276e+02, percent-clipped=1.0 2023-10-03 17:02:44,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:47,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 17:02:47,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 17:02:47,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:02:48,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 17:02:49,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1340860.0, ans=0.125 2023-10-03 17:02:50,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 17:02:50,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 17:02:50,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1340860.0, ans=0.2 2023-10-03 17:02:51,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:51,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:02:51,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:51,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1340860.0, ans=0.125 2023-10-03 17:02:53,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:02:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:02:54,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 17:02:55,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:02:55,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:02:57,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 17:02:57,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 17:02:59,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 17:03:01,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:03:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:03:03,197 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.01 vs. limit=10.0 2023-10-03 17:03:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:03:05,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:03:05,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:03:07,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:03:10,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:03:12,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:12,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:03:14,325 INFO [train.py:1046] (1/4) Epoch 38, batch 4600, loss[loss=0.1393, simple_loss=0.218, pruned_loss=0.03032, over 20971.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2366, pruned_loss=0.03856, over 4700416.79 frames. ], batch size: 46, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:03:15,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:03:17,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:03:17,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:18,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 17:03:20,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:03:23,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:03:24,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:26,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:34,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 17:03:36,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:03:40,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:42,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:03:42,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:47,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 17:03:47,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:03:48,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:03:52,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:53,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:03:54,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:04:00,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 17:04:02,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:04:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:04:05,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:07,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 17:04:09,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:09,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 17:04:11,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:11,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:13,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:14,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:04:14,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1341260.0, ans=0.125 2023-10-03 17:04:15,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:16,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. 
Duration: 0.87 2023-10-03 17:04:16,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 17:04:16,195 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:04:17,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 17:04:17,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:18,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:04:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:20,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:28,953 INFO [train.py:1046] (1/4) Epoch 38, batch 4650, loss[loss=0.1859, simple_loss=0.2502, pruned_loss=0.06079, over 19348.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2369, pruned_loss=0.03869, over 4702328.38 frames. ], batch size: 388, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:04:29,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:04:29,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1341326.6666666667, ans=0.125 2023-10-03 17:04:32,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:04:32,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:32,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:04:32,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:32,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:04:33,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:36,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 17:04:37,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1341326.6666666667, ans=0.125 2023-10-03 17:04:37,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.51 vs. limit=15.0 2023-10-03 17:04:40,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:04:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 17:04:41,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:04:43,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 17:04:43,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:04:44,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 17:04:44,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 17:04:44,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:45,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:04:48,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:04:49,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:49,890 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 17:04:51,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. 
Duration: 0.97 2023-10-03 17:04:53,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1341393.3333333333, ans=0.1 2023-10-03 17:04:57,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:57,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:04:58,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 17:05:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:05:02,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.29 vs. limit=15.0 2023-10-03 17:05:03,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:05:06,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:09,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1341460.0, ans=0.95 2023-10-03 17:05:10,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:05:13,570 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.819e+02 2.027e+02 2.299e+02 3.451e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 17:05:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:05:13,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:05:13,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:05:15,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 17:05:15,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 17:05:16,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 17:05:16,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 17:05:16,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1341526.6666666667, ans=0.125 2023-10-03 17:05:18,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:25,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 17:05:25,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:05:25,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 17:05:25,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:25,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1341526.6666666667, ans=0.0 2023-10-03 17:05:26,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:05:26,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:05:28,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:05:31,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:05:31,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:05:32,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:05:36,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:36,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 17:05:36,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:05:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 17:05:40,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:05:40,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 17:05:43,355 INFO [train.py:1046] (1/4) Epoch 38, batch 4700, loss[loss=0.1678, simple_loss=0.2472, pruned_loss=0.04415, over 23572.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03884, over 4701756.05 frames. ], batch size: 106, lr: 2.66e-03, grad_scale: 8.0 2023-10-03 17:05:43,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1341660.0, ans=0.09899494936611666 2023-10-03 17:05:47,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:48,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:48,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:05:49,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:05:51,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 17:05:57,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 17:05:57,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 17:06:00,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:00,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:06:01,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:06:03,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:07,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1341726.6666666667, ans=0.125 2023-10-03 17:06:08,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1341726.6666666667, ans=0.125 2023-10-03 17:06:09,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:06:10,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1341726.6666666667, ans=0.125 2023-10-03 17:06:11,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-03 17:06:13,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:06:19,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 17:06:19,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:06:22,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:26,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 17:06:27,413 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-10-03 17:06:29,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:06:32,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:06:32,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 17:06:34,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:34,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:06:37,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:06:37,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:06:37,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 17:06:38,849 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 17:06:40,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:06:42,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:42,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:42,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 17:06:43,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:46,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 17:06:49,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:06:50,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:06:53,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:06:55,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:06:55,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 17:06:56,621 INFO [train.py:1046] (1/4) Epoch 38, batch 4750, loss[loss=0.1457, simple_loss=0.2302, pruned_loss=0.03062, over 24330.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2375, pruned_loss=0.03897, over 4703004.66 frames. ], batch size: 61, lr: 2.66e-03, grad_scale: 8.0 2023-10-03 17:06:56,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:06:59,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 17:07:00,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:07:00,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:07:02,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:09,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 17:07:12,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:07:14,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. 
Duration: 0.96 2023-10-03 17:07:16,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:17,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1342060.0, ans=0.05 2023-10-03 17:07:20,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:07:20,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:07:20,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:07:21,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 17:07:21,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 17:07:28,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 17:07:31,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:07:34,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:07:36,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:07:36,954 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. 
Duration: 0.512 2023-10-03 17:07:36,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:07:38,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:07:39,791 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.879e+02 2.097e+02 2.401e+02 3.213e+02, threshold=4.194e+02, percent-clipped=0.0 2023-10-03 17:07:41,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:07:42,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1342193.3333333333, ans=0.125 2023-10-03 17:07:43,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 17:07:43,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 17:07:45,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:07:45,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:07:45,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:07:46,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 17:07:46,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 17:07:49,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 17:07:51,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:07:53,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:07:53,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 17:07:55,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:07:58,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:07:59,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:07:59,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:08:00,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1342260.0, ans=0.125 2023-10-03 17:08:01,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:08:01,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 17:08:02,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 17:08:04,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 17:08:06,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:08:06,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:07,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 17:08:10,171 INFO [train.py:1046] (1/4) Epoch 38, batch 4800, loss[loss=0.1683, simple_loss=0.2416, pruned_loss=0.04747, over 23779.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.238, pruned_loss=0.03939, over 4696277.63 frames. ], batch size: 164, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:08:13,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:18,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:08:19,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:08:21,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:21,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 17:08:22,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:08:22,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:08:22,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:08:26,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:08:28,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:29,291 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.18 vs. limit=6.0 2023-10-03 17:08:29,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:08:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:30,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-03 17:08:30,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:31,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:08:34,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:35,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:36,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:38,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:08:39,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 17:08:40,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1342460.0, ans=0.125 2023-10-03 17:08:41,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:43,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 17:08:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 17:08:43,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:08:43,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:08:43,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:08:43,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:08:44,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:08:47,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:08:47,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:08:50,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:53,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:08:54,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:00,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 17:09:02,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:09:02,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:09:02,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:09:03,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:09:07,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:09:08,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-03 17:09:09,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:09:09,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:09,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:09:10,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:09:10,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:09:15,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:15,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:09:15,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:09:15,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1342593.3333333333, ans=0.125 2023-10-03 17:09:18,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 17:09:18,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1342593.3333333333, ans=0.0 2023-10-03 17:09:19,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 17:09:19,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:09:19,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:09:21,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:09:21,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:24,434 INFO [train.py:1046] (1/4) Epoch 38, batch 4850, loss[loss=0.1607, simple_loss=0.2465, pruned_loss=0.03745, over 24358.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2379, pruned_loss=0.03945, over 4702580.65 frames. ], batch size: 77, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:09:24,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:09:31,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 17:09:33,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:37,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:09:39,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:09:40,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:43,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:45,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:09:46,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:09:46,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 17:09:52,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:09:53,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:09:53,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 17:09:55,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:09:55,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 17:09:57,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:09:57,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:09:58,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1342793.3333333333, ans=0.1 2023-10-03 17:10:02,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:02,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 17:10:02,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 17:10:03,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:10:08,523 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.910e+02 2.157e+02 2.584e+02 3.262e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-03 17:10:11,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:10:12,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-10-03 17:10:14,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:10:14,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:10:16,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:10:16,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1342860.0, ans=0.125 2023-10-03 17:10:17,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 17:10:17,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:20,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 17:10:20,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:20,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:10:21,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 17:10:23,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1342926.6666666667, ans=0.125 2023-10-03 17:10:27,301 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.77 vs. 
limit=15.0 2023-10-03 17:10:29,055 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-10-03 17:10:29,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:34,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:10:34,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:10:38,180 INFO [train.py:1046] (1/4) Epoch 38, batch 4900, loss[loss=0.1402, simple_loss=0.1985, pruned_loss=0.04091, over 22732.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2369, pruned_loss=0.03939, over 4704574.57 frames. ], batch size: 322, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:10:39,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1342993.3333333333, ans=0.5 2023-10-03 17:10:41,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 17:10:41,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:10:45,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:10:47,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:47,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 17:10:49,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 17:10:51,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1343060.0, ans=0.1 2023-10-03 17:10:52,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 17:10:57,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 17:10:58,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 17:10:58,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:10:59,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:59,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:10:59,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:10:59,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:11:00,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 17:11:03,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-10-03 17:11:03,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:11:03,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.92 vs. limit=15.0 2023-10-03 17:11:04,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:11:05,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:11:10,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:11:10,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:11:11,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1343126.6666666667, ans=0.09899494936611666 2023-10-03 17:11:12,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:12,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 17:11:13,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:11:15,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:11:15,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 17:11:15,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 17:11:17,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1343126.6666666667, ans=0.125 2023-10-03 17:11:18,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 17:11:19,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:11:21,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:11:21,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:11:21,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:11:23,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 17:11:24,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:11:24,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 17:11:27,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:11:27,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:11:29,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:11:33,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 17:11:34,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:11:34,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 17:11:34,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 17:11:43,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:11:43,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:11:45,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 17:11:45,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:11:45,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:11:47,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:11:47,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1343260.0, ans=0.1 2023-10-03 17:11:49,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:11:49,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:11:51,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:11:51,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 17:11:52,779 INFO [train.py:1046] (1/4) Epoch 38, batch 4950, loss[loss=0.1616, simple_loss=0.2329, pruned_loss=0.04516, over 23896.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2363, pruned_loss=0.03928, over 4693353.00 frames. ], batch size: 180, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:11:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:11:54,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1343326.6666666667, ans=0.1 2023-10-03 17:11:56,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:11:56,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:11:56,740 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.98 vs. 
limit=22.5 2023-10-03 17:11:59,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 17:12:00,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 17:12:00,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:12:01,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 17:12:01,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:01,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:12:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:12:03,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:04,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:04,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:12:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:12:08,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:12:10,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:10,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:12:13,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:12:18,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:18,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:12:19,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:20,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:22,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:12:24,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 17:12:24,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 17:12:26,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:28,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 17:12:28,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:12:30,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:12:30,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:12:32,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:12:34,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:37,004 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.915e+02 2.064e+02 2.439e+02 4.072e+02, threshold=4.129e+02, percent-clipped=0.0 2023-10-03 17:12:37,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:12:38,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:12:38,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:40,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:41,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 17:12:41,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 17:12:41,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:12:43,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1343526.6666666667, ans=0.0 2023-10-03 17:12:45,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:12:46,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:12:46,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:12:46,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:48,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:12:48,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:12:51,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:12:51,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:12:53,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:54,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-10-03 17:12:57,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:12:59,318 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:13:03,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 17:13:03,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:13:07,167 INFO [train.py:1046] (1/4) Epoch 38, batch 5000, loss[loss=0.1647, simple_loss=0.2292, pruned_loss=0.05007, over 19445.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2356, pruned_loss=0.03861, over 4697248.77 frames. ], batch size: 388, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:13:11,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:13:11,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:13:11,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 17:13:13,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1343660.0, ans=0.1 2023-10-03 17:13:14,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 17:13:17,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:13:19,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 17:13:19,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:13:19,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:13:20,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 17:13:22,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:22,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:13:23,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 17:13:23,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:24,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:13:24,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 17:13:26,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. 
Duration: 0.87 2023-10-03 17:13:27,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1343726.6666666667, ans=0.1 2023-10-03 17:13:28,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:13:28,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 17:13:28,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:13:29,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:29,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:13:29,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 17:13:29,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 17:13:31,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1343726.6666666667, ans=0.125 2023-10-03 17:13:32,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 17:13:32,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:32,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:13:33,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 17:13:33,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:13:35,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:36,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:38,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 17:13:38,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1343793.3333333333, ans=0.2 2023-10-03 17:13:39,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 17:13:39,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:13:40,375 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.27 vs. limit=15.0 2023-10-03 17:13:40,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1343793.3333333333, ans=0.125 2023-10-03 17:13:41,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 17:13:42,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1343793.3333333333, ans=0.0 2023-10-03 17:13:45,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1343793.3333333333, ans=0.05 2023-10-03 17:13:46,161 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 17:13:48,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:13:49,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:49,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:13:52,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 17:13:52,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:53,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:13:53,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:13:55,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 17:13:56,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 17:13:58,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:13:59,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:03,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 17:14:06,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1343926.6666666667, ans=0.125 2023-10-03 17:14:09,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:17,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:14:19,519 INFO [train.py:1046] (1/4) Epoch 38, batch 5050, loss[loss=0.2093, simple_loss=0.2697, pruned_loss=0.07441, over 19220.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2363, pruned_loss=0.03889, over 4700107.86 frames. ], batch size: 388, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:14:19,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:19,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:14:19,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:14:19,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 17:14:19,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1343993.3333333333, ans=0.0 2023-10-03 17:14:21,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:14:21,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:23,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1343993.3333333333, ans=10.0 2023-10-03 17:14:26,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:26,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 17:14:26,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:14:28,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:14:30,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:14:31,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 17:14:31,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:31,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:14:34,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:14:35,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:14:35,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:14:45,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1344060.0, ans=0.125 2023-10-03 17:14:46,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 17:14:46,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:14:47,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:14:47,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 17:14:49,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:14:49,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1344126.6666666667, ans=0.1 2023-10-03 17:14:50,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:14:50,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:14:50,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:14:50,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 17:14:52,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 17:14:52,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:14:55,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:14:59,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:15:00,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 17:15:01,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1344126.6666666667, ans=10.0 2023-10-03 17:15:02,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:15:03,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 17:15:05,037 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.875e+02 2.025e+02 2.172e+02 3.153e+02, threshold=4.049e+02, percent-clipped=0.0 2023-10-03 17:15:05,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 17:15:05,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:15:06,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:07,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:15:08,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:15:10,017 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.96 vs. limit=22.5 2023-10-03 17:15:10,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:15:10,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:12,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:15:12,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:15:12,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 17:15:13,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 17:15:16,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:15:19,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:15:20,379 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 17:15:20,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:15:21,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:15:22,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:22,532 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 17:15:24,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1344260.0, ans=0.1 2023-10-03 17:15:26,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:15:26,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 17:15:26,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:30,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 17:15:30,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:30,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 17:15:33,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 17:15:35,075 INFO [train.py:1046] (1/4) Epoch 38, batch 5100, loss[loss=0.1518, simple_loss=0.2295, pruned_loss=0.03711, over 23647.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.0392, over 4701039.30 frames. ], batch size: 149, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:15:36,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:15:36,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:15:36,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1344326.6666666667, ans=0.1 2023-10-03 17:15:37,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:15:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 17:15:43,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:15:44,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-10-03 17:15:44,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 17:15:46,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:15:47,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:15:47,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1344393.3333333333, ans=0.125 2023-10-03 17:15:47,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1344393.3333333333, ans=0.125 2023-10-03 17:15:48,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:15:50,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 17:15:50,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 17:15:53,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:53,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:15:59,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:16:02,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-10-03 17:16:02,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:16:04,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1344460.0, ans=0.1 2023-10-03 17:16:06,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:16:06,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 17:16:08,128 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.64 vs. limit=15.0 2023-10-03 17:16:09,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:10,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:10,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 17:16:13,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 17:16:14,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 17:16:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. 
Duration: 0.97 2023-10-03 17:16:17,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:16:25,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:16:26,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=1344526.6666666667, ans=0.025 2023-10-03 17:16:27,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 17:16:27,637 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 17:16:28,965 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 17:16:29,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1344526.6666666667, ans=0.0 2023-10-03 17:16:30,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 17:16:30,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:30,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1344526.6666666667, ans=0.1 2023-10-03 17:16:31,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 17:16:35,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-10-03 17:16:38,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 17:16:39,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:16:42,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 17:16:43,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:16:43,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 17:16:44,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=12.0 2023-10-03 17:16:48,155 INFO [train.py:1046] (1/4) Epoch 38, batch 5150, loss[loss=0.166, simple_loss=0.2507, pruned_loss=0.04065, over 24487.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2383, pruned_loss=0.03998, over 4691571.54 frames. ], batch size: 66, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:16:49,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:16:49,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:16:49,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:16:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 17:16:49,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:16:51,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:16:52,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 17:16:52,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 17:16:53,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 17:16:53,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:16:53,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 17:16:55,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:16:57,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 17:16:58,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:16:59,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:17:03,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1344726.6666666667, ans=0.125 2023-10-03 17:17:04,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:17:04,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 17:17:06,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:06,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:17:09,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:17:09,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:17:09,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:10,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:17:10,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:17:10,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 17:17:13,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 17:17:13,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:17:13,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1344726.6666666667, ans=0.0 2023-10-03 17:17:14,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:17:16,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 17:17:16,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:17:20,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1344793.3333333333, ans=0.125 2023-10-03 17:17:21,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:17:23,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 17:17:29,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:17:31,772 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.952e+02 2.144e+02 2.442e+02 5.340e+02, threshold=4.289e+02, percent-clipped=2.0 2023-10-03 17:17:33,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:17:33,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:36,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:17:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:17:40,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 17:17:43,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1344860.0, ans=0.2 2023-10-03 17:17:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:17:45,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:17:45,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:17:48,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:17:49,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:17:51,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 17:17:55,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:17:57,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:17:58,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:58,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:17:59,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:17:59,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:17:59,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:17:59,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:18:01,108 INFO [train.py:1046] (1/4) Epoch 38, batch 5200, loss[loss=0.1564, simple_loss=0.244, pruned_loss=0.03437, over 24513.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2389, pruned_loss=0.04017, over 4691672.62 frames. ], batch size: 66, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:18:03,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:18:05,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:18:09,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:18:12,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 17:18:14,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:18:15,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:16,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:19,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:18:19,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 17:18:22,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:18:22,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1345060.0, ans=0.1 2023-10-03 17:18:23,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:18:27,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 17:18:28,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 17:18:28,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:18:30,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 17:18:30,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 17:18:31,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 17:18:32,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:18:32,919 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 17:18:32,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:34,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:18:34,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:18:36,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 17:18:37,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:18:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:18:45,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 17:18:45,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 17:18:45,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 17:18:49,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 17:18:50,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:18:54,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:18:54,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:18:55,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1345193.3333333333, ans=0.125 2023-10-03 17:18:56,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 17:18:58,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:58,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:18:58,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:18:59,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:19:00,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:19:02,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:19:03,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:19:06,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:06,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:10,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:19:11,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 17:19:11,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:19:11,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:19:14,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:19:15,790 INFO [train.py:1046] (1/4) Epoch 38, batch 5250, loss[loss=0.1455, simple_loss=0.2235, pruned_loss=0.03372, over 23746.00 frames. ], tot_loss[loss=0.159, simple_loss=0.238, pruned_loss=0.04, over 4710739.35 frames. ], batch size: 149, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:19:15,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:19:15,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:19:19,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:19:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:21,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:19:22,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.82 vs. limit=15.0 2023-10-03 17:19:23,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:19:27,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:19:30,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:19:32,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 17:19:33,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:19:35,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 17:19:35,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:37,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:50,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1345460.0, ans=0.125 2023-10-03 17:19:57,843 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.916e+02 2.104e+02 2.391e+02 3.735e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 17:20:15,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.15 vs. limit=10.0 2023-10-03 17:20:17,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1345593.3333333333, ans=0.2 2023-10-03 17:20:24,124 INFO [train.py:1046] (1/4) Epoch 38, batch 5300, loss[loss=0.1441, simple_loss=0.2226, pruned_loss=0.03284, over 22324.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2372, pruned_loss=0.03981, over 4713133.25 frames. 
], batch size: 49, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:20:26,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1345660.0, ans=0.1 2023-10-03 17:20:30,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1345660.0, ans=0.2 2023-10-03 17:20:38,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:20:38,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 17:20:38,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 17:20:38,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:38,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:38,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:38,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:38,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:20:39,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:39,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:20:39,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:20:39,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 17:20:39,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 17:20:39,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 17:20:39,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:20:39,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 17:20:39,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 17:20:39,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:40,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:40,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:20:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:20:40,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:20:41,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:20:41,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:41,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:41,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:20:41,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:41,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:20:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:41,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:20:41,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 17:20:41,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:20:42,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:42,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 17:20:42,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 17:20:42,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:20:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:20:42,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 17:20:42,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 17:20:42,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:20:43,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:20:43,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:20:43,954 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 17:20:44,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. 
Duration: 0.96 2023-10-03 17:20:44,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:20:44,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:44,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 17:20:44,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 17:20:44,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 17:20:44,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:20:50,428 INFO [train.py:1046] (1/4) Epoch 39, batch 0, loss[loss=0.157, simple_loss=0.2405, pruned_loss=0.03676, over 23331.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2405, pruned_loss=0.03676, over 23331.00 frames. ], batch size: 93, lr: 2.63e-03, grad_scale: 32.0 2023-10-03 17:20:50,428 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 17:21:02,125 INFO [train.py:1078] (1/4) Epoch 39, validation: loss=0.3329, simple_loss=0.2734, pruned_loss=0.1962, over 1125622.00 frames. 2023-10-03 17:21:02,126 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 17:21:02,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1345740.0, ans=0.125 2023-10-03 17:21:05,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-10-03 17:21:06,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:21:08,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:21:12,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:12,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:21:12,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:13,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 17:21:15,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 17:21:16,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:17,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:19,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1345806.6666666667, ans=0.125 2023-10-03 17:21:19,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1345806.6666666667, ans=0.2 2023-10-03 17:21:20,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:21:20,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1345806.6666666667, ans=0.125 2023-10-03 17:21:21,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:22,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:21:22,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:21:24,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 17:21:24,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:21:28,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.92 vs. limit=12.0 2023-10-03 17:21:32,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:21:32,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:35,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 17:21:38,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:21:38,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 17:21:41,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:21:44,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:21:47,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1345940.0, ans=0.125 2023-10-03 17:21:48,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:21:50,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1345940.0, ans=0.125 2023-10-03 17:21:54,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 17:21:58,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 17:21:58,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:21:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:21:59,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:21:59,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:21:59,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1346006.6666666667, ans=0.2 2023-10-03 17:22:02,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 17:22:04,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:22:06,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:22:09,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:22:13,190 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 17:22:13,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:22:14,581 INFO [train.py:1046] (1/4) Epoch 39, batch 50, loss[loss=0.1694, simple_loss=0.2553, pruned_loss=0.04178, over 23755.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.24, pruned_loss=0.04048, over 1050299.42 frames. ], batch size: 85, lr: 2.63e-03, grad_scale: 32.0 2023-10-03 17:22:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:22:18,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:22:18,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-10-03 17:22:20,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:22:20,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:22:22,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:22:24,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:22:27,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:22:29,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 17:22:29,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:35,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:22:37,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 17:22:38,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 17:22:41,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:22:41,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 17:22:41,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:42,530 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.895e+02 2.125e+02 2.422e+02 4.892e+02, threshold=4.250e+02, percent-clipped=3.0 2023-10-03 17:22:42,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:22:44,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:22:44,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:22:44,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:50,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:22:52,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:22:52,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:22:53,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 17:22:55,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:22:56,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 17:22:56,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 17:22:56,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:22:57,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 17:23:05,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:07,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:23:07,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:09,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:23:09,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:23:11,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 17:23:11,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 17:23:13,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:13,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 17:23:15,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:23:15,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:23:15,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 17:23:15,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 17:23:17,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 17:23:17,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:18,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:23:18,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 17:23:18,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 17:23:20,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:20,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:23:22,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 17:23:22,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:23:24,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:23:24,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1346340.0, ans=0.0 2023-10-03 17:23:26,985 INFO [train.py:1046] (1/4) Epoch 39, batch 100, loss[loss=0.1576, simple_loss=0.2486, pruned_loss=0.03335, over 24472.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2401, pruned_loss=0.03982, over 1867157.08 frames. ], batch size: 66, lr: 2.63e-03, grad_scale: 16.0 2023-10-03 17:23:27,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:23:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:23:31,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 17:23:31,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:35,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:23:36,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:23:36,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 17:23:36,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:23:36,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:23:37,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 17:23:38,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1346406.6666666667, ans=0.2 2023-10-03 17:23:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:23:40,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:41,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.06 vs. limit=15.0 2023-10-03 17:23:41,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:41,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:23:44,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 17:23:46,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:47,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:23:48,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:23:49,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1346473.3333333333, ans=0.125 2023-10-03 17:23:50,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:23:53,224 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 17:23:54,515 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 17:23:55,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:23:55,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:23:58,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:24:00,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:24:01,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:08,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:24:08,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1346540.0, ans=0.0 2023-10-03 17:24:09,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 17:24:11,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 17:24:15,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:24:16,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:24:18,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:19,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:24,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:24:25,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:24:29,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:29,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:24:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:24:30,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:24:30,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:32,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 17:24:32,325 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 17:24:32,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:34,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:24:34,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:34,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:35,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 17:24:35,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:24:36,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:24:36,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:24:36,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:24:37,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:38,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:24:40,692 INFO [train.py:1046] (1/4) Epoch 39, batch 150, loss[loss=0.1692, simple_loss=0.2608, pruned_loss=0.03883, over 24299.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2411, pruned_loss=0.04018, over 2497812.45 frames. ], batch size: 74, lr: 2.63e-03, grad_scale: 16.0 2023-10-03 17:24:40,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:24:44,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:46,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:24:46,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:24:47,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:48,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1346740.0, ans=0.0 2023-10-03 17:24:50,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:24:50,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:54,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:24:55,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:56,550 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.54 vs. limit=10.0 2023-10-03 17:24:58,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 17:24:58,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 17:24:58,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 17:25:01,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:25:01,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:25:02,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:25:04,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:25:04,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:25:04,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:04,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:04,307 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 17:25:06,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:07,379 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.926e+02 2.144e+02 2.350e+02 3.601e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-03 17:25:13,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:25:16,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:25:18,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 17:25:20,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:25:21,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:25:22,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:25:23,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 17:25:24,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:25:25,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:25:26,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:26,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 17:25:31,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:33,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:33,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:25:33,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:25:34,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:36,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 17:25:39,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 17:25:40,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.48 vs. limit=15.0 2023-10-03 17:25:41,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:25:43,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:25:44,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:25:44,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 17:25:44,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:25:45,820 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 17:25:48,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:51,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:51,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:25:51,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 17:25:52,893 INFO [train.py:1046] (1/4) Epoch 39, batch 200, loss[loss=0.1748, simple_loss=0.2475, pruned_loss=0.05104, over 23595.00 frames. 
], tot_loss[loss=0.1611, simple_loss=0.2414, pruned_loss=0.04038, over 2995774.71 frames. ], batch size: 256, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:25:52,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:25:53,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:55,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 17:25:56,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.27 vs. limit=22.5 2023-10-03 17:25:57,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:25:59,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:59,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:03,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:26:03,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:26:03,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:26:26,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 17:26:26,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:26:27,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:26:27,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:26:29,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 17:26:29,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:26:31,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:33,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:26:34,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:26:34,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:26:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 17:26:36,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:26:36,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:26:42,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:26:46,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:26:54,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:55,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:26:57,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.23 vs. limit=10.0 2023-10-03 17:26:59,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1347340.0, ans=0.125 2023-10-03 17:27:01,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:03,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 17:27:05,267 INFO [train.py:1046] (1/4) Epoch 39, batch 250, loss[loss=0.1551, simple_loss=0.2198, pruned_loss=0.04518, over 23814.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2409, pruned_loss=0.04025, over 3369254.71 frames. ], batch size: 195, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:27:05,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:27:05,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 17:27:05,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:27:05,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:27:05,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 17:27:06,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:27:06,896 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 17:27:09,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:10,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:27:12,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:13,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:27:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:27:14,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:17,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:27:19,871 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:27:20,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:27:25,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1347473.3333333333, ans=0.125 2023-10-03 17:27:29,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.48 vs. limit=12.0 2023-10-03 17:27:30,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:27:30,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1347473.3333333333, ans=0.0 2023-10-03 17:27:32,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:27:32,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:27:32,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.09 vs. limit=22.5 2023-10-03 17:27:33,343 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.891e+02 2.094e+02 2.550e+02 3.805e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-03 17:27:34,994 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:27:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 17:27:40,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:27:40,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:27:41,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:27:42,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1347540.0, ans=0.125 2023-10-03 17:27:43,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:27:43,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:27:43,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:27:47,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:27:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 17:27:49,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:27:50,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:27:52,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 17:27:52,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:27:52,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:27:52,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:27:53,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:27:55,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:27:56,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:27:56,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1347606.6666666667, ans=0.125 2023-10-03 17:27:58,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:00,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:28:03,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:28:06,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 17:28:08,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1347673.3333333333, ans=0.125 2023-10-03 17:28:10,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:13,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:28:16,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 17:28:18,526 INFO [train.py:1046] (1/4) Epoch 39, batch 300, loss[loss=0.1518, simple_loss=0.2427, pruned_loss=0.03047, over 24438.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2381, pruned_loss=0.03991, over 3651865.40 frames. ], batch size: 69, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:28:18,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:28:18,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:28:20,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 17:28:21,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:28:23,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:28:23,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-10-03 17:28:28,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:28:28,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:28:31,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:28:31,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 17:28:34,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:34,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:28:35,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 17:28:35,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:28:39,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:28:43,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:28:44,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 17:28:47,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. 
Duration: 0.81 2023-10-03 17:28:49,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:28:51,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:28:52,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:28:52,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 17:28:52,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:28:54,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:28:55,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:28:55,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:28:58,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:28:58,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 17:28:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:29:01,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:29:03,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 17:29:05,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:08,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:29:10,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1347940.0, ans=0.0 2023-10-03 17:29:11,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:29:11,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 17:29:14,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1347940.0, ans=0.125 2023-10-03 17:29:16,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:16,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:29:19,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:20,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:29:20,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. 
Duration: 0.64 2023-10-03 17:29:20,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:29:22,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 17:29:25,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:25,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:26,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:29:28,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:28,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:31,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:29:31,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 17:29:32,489 INFO [train.py:1046] (1/4) Epoch 39, batch 350, loss[loss=0.1337, simple_loss=0.1877, pruned_loss=0.03984, over 19231.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2367, pruned_loss=0.03912, over 3881678.42 frames. 
], batch size: 388, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:29:34,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:39,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:29:42,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:44,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:47,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 17:29:49,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:29:49,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 17:29:52,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:52,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 17:29:52,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1348140.0, ans=0.0 2023-10-03 17:29:54,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:29:54,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1348140.0, ans=0.0 2023-10-03 17:29:56,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 17:29:58,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:29:59,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:59,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:30:00,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.879e+02 2.125e+02 2.432e+02 3.754e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-03 17:30:02,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:02,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:02,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:30:02,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:02,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:30:03,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 17:30:04,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:30:09,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1348206.6666666667, ans=0.2 2023-10-03 17:30:09,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1348206.6666666667, ans=0.07 2023-10-03 17:30:12,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:30:13,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:30:13,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:30:14,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:19,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 17:30:19,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:30:21,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten.whitening_limit, batch_count=1348273.3333333333, ans=15.0 2023-10-03 17:30:22,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:30:22,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:24,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:30:26,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 17:30:28,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 17:30:31,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 17:30:31,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:34,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:30:34,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 17:30:35,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:36,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 17:30:37,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1348340.0, ans=0.125 2023-10-03 17:30:37,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1348340.0, ans=0.2 2023-10-03 17:30:38,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:39,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:39,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:41,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:45,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:30:47,671 INFO [train.py:1046] (1/4) Epoch 39, batch 400, loss[loss=0.1551, simple_loss=0.2448, pruned_loss=0.0327, over 24326.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2367, pruned_loss=0.03869, over 4084255.78 frames. ], batch size: 74, lr: 2.62e-03, grad_scale: 32.0 2023-10-03 17:30:49,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:30:49,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 17:30:49,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:30:50,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:30:51,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:30:53,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:30:56,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:56,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:30:59,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 17:31:00,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 17:31:00,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:31:02,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 17:31:02,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1348473.3333333333, ans=0.1 2023-10-03 17:31:03,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:31:07,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 17:31:07,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:07,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 17:31:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:31:09,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:31:09,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:10,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:31:13,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 17:31:14,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 17:31:19,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:31:21,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:31:22,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 17:31:22,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-10-03 17:31:23,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.76 vs. limit=22.5 2023-10-03 17:31:25,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:31:28,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:31:32,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 17:31:36,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:31:38,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 17:31:39,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:41,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:31:41,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 17:31:45,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:31:46,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:31:49,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:31:52,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:31:52,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 17:31:54,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:31:55,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 17:31:58,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:31:58,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:32:01,541 INFO [train.py:1046] (1/4) Epoch 39, batch 450, loss[loss=0.154, simple_loss=0.2478, pruned_loss=0.03007, over 24484.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03868, over 4223596.43 frames. ], batch size: 69, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:32:01,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 17:32:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:32:04,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:32:04,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. 
Duration: 0.77775 2023-10-03 17:32:06,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 17:32:06,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:32:07,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:32:07,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:32:07,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 17:32:07,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:32:07,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1348740.0, ans=0.2 2023-10-03 17:32:10,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:32:12,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:32:21,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:23,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 17:32:24,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1348806.6666666667, ans=0.125 2023-10-03 17:32:25,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 17:32:27,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 17:32:29,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:32:31,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.906e+02 2.093e+02 2.336e+02 3.263e+02, threshold=4.186e+02, percent-clipped=0.0 2023-10-03 17:32:31,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:34,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:32:37,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:32:37,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:32:40,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 17:32:40,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 17:32:43,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. 
Duration: 0.96 2023-10-03 17:32:43,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:32:44,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:32:44,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:32:45,996 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 17:32:46,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 17:32:47,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:48,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:32:50,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 17:32:53,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:32:53,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:32:55,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 17:32:56,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. 
Duration: 0.96 2023-10-03 17:32:58,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:33:00,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:33:00,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:33:01,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 17:33:04,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:33:05,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 17:33:05,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 17:33:07,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:33:10,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:33:12,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1349006.6666666667, ans=0.2 2023-10-03 17:33:13,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:33:14,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 17:33:16,023 INFO [train.py:1046] (1/4) Epoch 39, batch 500, loss[loss=0.1655, simple_loss=0.2384, pruned_loss=0.04632, over 23475.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2384, pruned_loss=0.03944, over 4327883.81 frames. ], batch size: 285, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:33:16,059 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 17:33:16,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1349073.3333333333, ans=0.125 2023-10-03 17:33:18,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:33:20,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:33:20,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:33:20,821 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 17:33:21,906 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.17 vs. limit=15.0 2023-10-03 17:33:23,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 17:33:24,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:33:26,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 17:33:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:33:32,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:33:35,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:33:35,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:33:35,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:33:46,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:46,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:33:47,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:33:47,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:47,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 17:33:47,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:33:50,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 17:33:51,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:33:51,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:33:51,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 17:33:58,086 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 17:33:58,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:00,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:00,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1349273.3333333333, ans=0.125 2023-10-03 17:34:01,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:34:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. 
Duration: 0.62 2023-10-03 17:34:06,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.77 vs. limit=12.0 2023-10-03 17:34:07,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:34:08,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:11,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:17,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:19,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 17:34:19,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:19,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:24,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 17:34:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:34:28,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:34:30,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1349406.6666666667, ans=0.07 2023-10-03 17:34:31,318 INFO [train.py:1046] (1/4) Epoch 39, batch 550, loss[loss=0.176, simple_loss=0.25, pruned_loss=0.05098, over 23897.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2396, pruned_loss=0.03982, over 4422756.42 frames. ], batch size: 195, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:34:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 17:34:35,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 17:34:35,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:35,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 17:34:36,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:34:36,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:38,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:39,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:39,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 17:34:39,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1349406.6666666667, ans=0.0 2023-10-03 17:34:41,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:34:43,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:45,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 17:34:45,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:34:48,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:34:48,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:51,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:34:51,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:51,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1349473.3333333333, ans=0.125 2023-10-03 17:34:56,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 17:34:58,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. 
Duration: 0.97 2023-10-03 17:35:01,313 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.945e+02 2.178e+02 2.458e+02 4.129e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-03 17:35:01,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:35:04,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:35:04,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1349540.0, ans=0.125 2023-10-03 17:35:05,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:35:07,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:35:09,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:09,836 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 17:35:11,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:35:12,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 17:35:14,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-10-03 17:35:15,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 17:35:17,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:35:17,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:35:18,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:18,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 17:35:19,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 17:35:21,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:21,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:35:21,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:35:21,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:35:25,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.97 vs. limit=15.0 2023-10-03 17:35:25,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:35:25,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 17:35:28,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:35:28,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:30,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 17:35:30,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:35:32,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:32,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1349673.3333333333, ans=0.95 2023-10-03 17:35:32,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1349673.3333333333, ans=0.2 2023-10-03 17:35:33,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:35:33,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:34,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:35:35,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. 
Duration: 0.4 2023-10-03 17:35:40,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 17:35:43,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 17:35:44,809 INFO [train.py:1046] (1/4) Epoch 39, batch 600, loss[loss=0.1454, simple_loss=0.2339, pruned_loss=0.02843, over 24512.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2393, pruned_loss=0.03983, over 4483953.25 frames. ], batch size: 63, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:35:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:35:44,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:35:46,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:50,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:35:53,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:35:55,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 17:35:57,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:35:59,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 17:36:01,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:03,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 17:36:04,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:36:06,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1349806.6666666667, ans=0.125 2023-10-03 17:36:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 17:36:13,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:36:13,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:14,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:36:15,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1349873.3333333333, ans=0.2 2023-10-03 17:36:16,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.45 vs. limit=15.0 2023-10-03 17:36:22,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:36:22,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:36:22,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:36:22,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1349873.3333333333, ans=0.1 2023-10-03 17:36:25,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1349873.3333333333, ans=0.125 2023-10-03 17:36:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:36:31,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:36:31,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:36:31,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:40,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 17:36:44,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:36:44,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:36:49,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 17:36:49,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 17:36:51,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 17:36:53,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:36:53,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:36:59,593 INFO [train.py:1046] (1/4) Epoch 39, batch 650, loss[loss=0.168, simple_loss=0.2381, pruned_loss=0.04893, over 23729.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2381, pruned_loss=0.03971, over 4513315.37 frames. ], batch size: 179, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:36:59,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 17:36:59,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:37:02,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:37:03,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:37:07,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:08,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 17:37:08,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:37:11,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:37:11,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:16,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:20,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 17:37:21,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:37:22,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:25,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1350140.0, ans=0.125 2023-10-03 17:37:26,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:37:28,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 17:37:29,367 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.906e+02 2.088e+02 2.290e+02 3.410e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 17:37:30,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:30,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:37:32,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:37:32,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1350206.6666666667, ans=0.0 2023-10-03 17:37:34,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:34,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:37:38,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:37:38,541 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 17:37:38,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:38,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:37:41,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:41,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1350206.6666666667, ans=0.125 2023-10-03 17:37:42,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:37:42,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:37:42,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:37:42,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 17:37:45,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:37:45,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:37:47,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:37:47,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:37:47,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:37:49,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 17:37:50,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 17:37:50,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:50,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 17:37:50,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1350273.3333333333, ans=0.0 2023-10-03 17:37:51,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:37:51,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:37:53,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:59,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:38:01,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:38:02,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:38:03,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 17:38:04,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:38:11,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 17:38:12,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:38:13,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:38:13,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:38:14,246 INFO [train.py:1046] (1/4) Epoch 39, batch 700, loss[loss=0.1389, simple_loss=0.1887, pruned_loss=0.04458, over 19113.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2365, pruned_loss=0.03941, over 4548127.01 frames. ], batch size: 388, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:38:19,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 17:38:19,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 17:38:21,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 17:38:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:24,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:38:25,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 17:38:29,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 17:38:32,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:38:34,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:35,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:38:36,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:38:38,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:41,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 17:38:41,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:38:43,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 17:38:47,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 17:38:50,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:38:51,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:38:51,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 17:38:55,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:38:57,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 17:38:59,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1350606.6666666667, ans=0.125 2023-10-03 17:39:00,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:02,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:39:02,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 17:39:06,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:39:08,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:11,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:15,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:39:17,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 17:39:20,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. 
Duration: 0.89 2023-10-03 17:39:20,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 17:39:22,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1350673.3333333333, ans=0.1 2023-10-03 17:39:23,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:24,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:39:26,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:39:28,850 INFO [train.py:1046] (1/4) Epoch 39, batch 750, loss[loss=0.137, simple_loss=0.2176, pruned_loss=0.02825, over 24434.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2363, pruned_loss=0.039, over 4589387.86 frames. ], batch size: 58, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:39:28,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:28,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 17:39:32,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 17:39:34,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 17:39:34,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. 
Duration: 0.85 2023-10-03 17:39:35,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 17:39:36,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1350740.0, ans=6.0 2023-10-03 17:39:36,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 17:39:36,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:39:36,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 17:39:38,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:39,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:39:41,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:39:43,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:43,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:39:44,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:39:47,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 17:39:47,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:39:48,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:39:50,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:39:51,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:51,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 17:39:53,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:39:53,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:56,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:57,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:39:58,642 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.002e+02 2.322e+02 2.663e+02 4.203e+02, threshold=4.644e+02, percent-clipped=1.0 2023-10-03 17:39:58,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 17:39:58,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:40:00,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 17:40:00,799 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 17:40:02,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 17:40:02,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:40:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:40:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:40:03,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1350873.3333333333, ans=0.1 2023-10-03 17:40:10,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:40:10,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:40:13,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:40:13,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 17:40:15,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 17:40:15,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:40:17,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 17:40:17,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:40:21,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:40:21,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 17:40:21,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:24,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:40:25,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:40:26,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:40:28,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:40:31,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. 
Duration: 0.75 2023-10-03 17:40:32,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:40:33,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:40:36,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:40:37,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:38,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:38,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:40:43,535 INFO [train.py:1046] (1/4) Epoch 39, batch 800, loss[loss=0.1661, simple_loss=0.2432, pruned_loss=0.04452, over 23577.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2367, pruned_loss=0.03905, over 4608318.23 frames. ], batch size: 256, lr: 2.62e-03, grad_scale: 32.0 2023-10-03 17:40:46,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:46,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:47,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:40:47,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:40:50,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:50,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:40:52,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:57,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:40:57,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:40:59,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 17:41:01,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:01,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:41:02,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:41:02,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:41:02,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 17:41:02,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:41:02,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 17:41:05,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:07,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:09,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:41:09,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:41:10,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1351140.0, ans=0.2 2023-10-03 17:41:12,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:13,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:41:18,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:41:18,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 17:41:21,395 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-03 17:41:22,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 17:41:22,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:41:22,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:41:24,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:25,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:41:30,171 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 17:41:30,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 17:41:32,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:41:33,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:41:36,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1351273.3333333333, ans=0.07 2023-10-03 17:41:37,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:41:39,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:41:40,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 17:41:40,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:41:43,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 17:41:49,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:41:50,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1351340.0, ans=0.125 2023-10-03 17:41:52,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:41:53,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 17:41:55,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:41:55,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:56,650 INFO [train.py:1046] (1/4) Epoch 39, batch 850, loss[loss=0.1421, simple_loss=0.2261, pruned_loss=0.02903, over 24494.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.03917, over 4648266.49 frames. ], batch size: 63, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:41:56,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. 
Duration: 0.94 2023-10-03 17:41:56,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:41:58,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:59,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:01,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:42:02,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:42:04,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 17:42:05,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 17:42:05,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 17:42:05,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.52 vs. limit=6.0 2023-10-03 17:42:08,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:42:08,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:42:09,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:42:09,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:42:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:42:13,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1351473.3333333333, ans=0.0 2023-10-03 17:42:14,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:42:14,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 17:42:18,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 17:42:20,700 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.82 vs. limit=15.0 2023-10-03 17:42:22,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:42:24,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-03 17:42:26,738 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.921e+02 2.098e+02 2.491e+02 3.402e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 17:42:28,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 17:42:31,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 17:42:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 17:42:34,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:42:34,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:42:34,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 17:42:37,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:39,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:39,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 17:42:40,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:42:41,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:42:43,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:42:43,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:42:44,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:42:46,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:42:46,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 17:42:48,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:42:48,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:42:49,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:42:49,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:42:50,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:52,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1351606.6666666667, ans=0.0 2023-10-03 17:42:55,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:42:56,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:42:57,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:42:59,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:00,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:43:08,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:43:09,833 INFO [train.py:1046] (1/4) Epoch 39, batch 900, loss[loss=0.1369, simple_loss=0.2185, pruned_loss=0.02769, over 24444.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2371, pruned_loss=0.03847, over 4689550.42 frames. ], batch size: 58, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:43:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:43:11,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 17:43:11,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:43:11,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:43:13,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. 
Duration: 0.77 2023-10-03 17:43:19,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:43:20,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:21,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 17:43:23,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:43:25,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 17:43:26,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 17:43:27,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:43:27,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:43:27,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:43:29,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:43:29,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1351806.6666666667, ans=0.125 2023-10-03 17:43:39,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 17:43:39,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:39,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:43:43,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:43:47,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 17:43:50,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:43:53,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:43:54,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:43:54,827 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 17:43:54,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 17:43:55,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1351940.0, ans=10.0 2023-10-03 17:44:00,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1351940.0, ans=0.09899494936611666 2023-10-03 17:44:01,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 17:44:01,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:44:01,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:44:08,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:08,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:09,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1352006.6666666667, ans=0.125 2023-10-03 17:44:11,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 17:44:11,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:44:14,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 17:44:15,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:44:15,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:17,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:44:17,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:44:21,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 17:44:22,015 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 17:44:22,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1352006.6666666667, ans=0.1 2023-10-03 17:44:23,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 17:44:23,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 17:44:24,780 INFO [train.py:1046] (1/4) Epoch 39, batch 950, loss[loss=0.1641, simple_loss=0.2531, pruned_loss=0.03752, over 24672.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2381, pruned_loss=0.03926, over 4680353.50 frames. ], batch size: 68, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:44:26,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:27,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1352073.3333333333, ans=0.125 2023-10-03 17:44:30,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 17:44:33,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:44:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:44:36,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:36,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:44:39,902 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 17:44:43,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:45,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:44:45,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:44:45,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:44:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 17:44:46,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:44:47,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:49,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 17:44:49,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 17:44:53,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:53,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:53,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:54,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1352206.6666666667, ans=0.1 2023-10-03 17:44:55,211 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.980e+02 2.279e+02 2.726e+02 3.992e+02, threshold=4.557e+02, percent-clipped=0.0 2023-10-03 17:44:55,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 17:44:56,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:44:58,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:45:00,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:45:02,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:45:02,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:45:08,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 17:45:10,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 17:45:10,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:45:11,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:11,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:11,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:45:15,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 17:45:15,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:45:19,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:19,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:19,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 17:45:19,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:45:19,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:45:20,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 17:45:20,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1352273.3333333333, ans=0.0 2023-10-03 17:45:24,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:45:26,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:45:30,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:45:33,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 17:45:33,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 17:45:38,061 INFO [train.py:1046] (1/4) Epoch 39, batch 1000, loss[loss=0.1507, simple_loss=0.2287, pruned_loss=0.03633, over 21486.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2376, pruned_loss=0.03928, over 4684460.40 frames. ], batch size: 47, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:45:38,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:41,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-03 17:45:41,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:45:44,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:45:47,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 17:45:47,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 17:45:51,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:45:51,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:45:53,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:54,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1352473.3333333333, ans=0.125 2023-10-03 17:45:55,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 17:46:00,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 17:46:02,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 17:46:02,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:46:04,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 17:46:07,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 17:46:07,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 17:46:09,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:10,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:18,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:46:20,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:46:20,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:20,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:20,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 17:46:20,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:21,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 17:46:21,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:46:23,167 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 17:46:26,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 17:46:27,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 17:46:28,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 17:46:29,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.72 vs. limit=15.0 2023-10-03 17:46:31,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:46:36,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1352673.3333333333, ans=0.07 2023-10-03 17:46:37,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:37,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:46:39,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:40,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 17:46:42,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 17:46:43,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:46:43,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 17:46:44,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 17:46:46,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:46:46,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:48,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:46:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:46:53,104 INFO [train.py:1046] (1/4) Epoch 39, batch 1050, loss[loss=0.1523, simple_loss=0.2426, pruned_loss=0.03105, over 24032.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2379, pruned_loss=0.0387, over 4697704.32 frames. ], batch size: 86, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:46:53,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:55,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 17:46:56,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1352740.0, ans=0.0 2023-10-03 17:46:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:46:57,909 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.64 vs. limit=15.0 2023-10-03 17:46:58,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:46:58,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:47:00,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:47:01,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:47:04,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:47:06,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:47:07,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:47:08,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 17:47:09,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1352806.6666666667, ans=0.2 2023-10-03 17:47:10,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:47:10,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 17:47:12,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:47:13,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 17:47:13,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:47:13,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 17:47:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:47:20,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:47:21,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:47:21,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:47:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. 
Duration: 0.81 2023-10-03 17:47:24,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 17:47:25,571 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.919e+02 2.049e+02 2.517e+02 3.582e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-03 17:47:25,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:47:27,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 17:47:28,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 17:47:30,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:47:34,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 17:47:35,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 17:47:36,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1352940.0, ans=0.1 2023-10-03 17:47:37,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:47:38,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:47:41,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 17:47:44,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1352940.0, ans=0.2 2023-10-03 17:47:46,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 17:47:48,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 17:47:48,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 17:47:48,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:47:50,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:47:51,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 17:47:53,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1353006.6666666667, ans=0.1 2023-10-03 17:47:55,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:47:57,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:47:57,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:47:57,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 17:47:57,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:01,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:01,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 17:48:03,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:48:04,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 17:48:04,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 17:48:04,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:48:06,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1353073.3333333333, ans=0.1 2023-10-03 17:48:07,647 INFO [train.py:1046] (1/4) Epoch 39, batch 1100, loss[loss=0.1548, simple_loss=0.2287, pruned_loss=0.04048, over 24461.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2366, pruned_loss=0.03867, over 4691778.15 frames. ], batch size: 58, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:48:07,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1353073.3333333333, ans=0.125 2023-10-03 17:48:09,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:48:10,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1353073.3333333333, ans=0.125 2023-10-03 17:48:14,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:48:18,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:48:18,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1353073.3333333333, ans=0.125 2023-10-03 17:48:20,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:48:21,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:48:21,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 17:48:21,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1353140.0, ans=0.125 2023-10-03 17:48:21,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1353140.0, ans=0.1 2023-10-03 17:48:22,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 17:48:23,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1353140.0, ans=0.125 2023-10-03 17:48:24,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:48:27,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:48:31,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:48:32,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 17:48:33,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 17:48:34,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:48:34,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:48:36,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:48:38,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:48:42,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:48:45,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. 
Duration: 0.98 2023-10-03 17:48:45,703 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 17:48:45,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1353206.6666666667, ans=0.0 2023-10-03 17:48:45,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1353206.6666666667, ans=0.0 2023-10-03 17:48:47,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:49,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:50,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:48:50,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:48:50,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1353273.3333333333, ans=0.2 2023-10-03 17:48:52,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 17:48:53,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:48:53,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:48:53,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:48:53,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:53,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 17:48:59,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:48:59,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 17:49:01,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:49:02,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1353273.3333333333, ans=0.125 2023-10-03 17:49:03,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1353273.3333333333, ans=0.125 2023-10-03 17:49:05,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:49:08,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 17:49:08,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:49:10,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:12,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 17:49:12,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:49:14,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 17:49:16,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:49:16,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:49:18,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 17:49:19,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:49:20,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 17:49:22,620 INFO [train.py:1046] (1/4) Epoch 39, batch 1150, loss[loss=0.1909, simple_loss=0.2534, pruned_loss=0.06417, over 19469.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03865, over 4708579.99 frames. ], batch size: 388, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:49:22,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:49:22,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:49:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 17:49:27,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1353406.6666666667, ans=0.1 2023-10-03 17:49:29,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:31,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:49:32,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:49:32,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:49:32,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 17:49:33,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:49:35,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 17:49:36,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:36,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:49:42,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-10-03 17:49:42,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1353473.3333333333, ans=0.1 2023-10-03 17:49:44,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:48,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:49,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:49:49,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 17:49:49,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:49:50,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:49:55,047 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.879e+02 2.025e+02 2.261e+02 4.283e+02, threshold=4.051e+02, percent-clipped=1.0 2023-10-03 17:49:55,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 17:49:56,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:57,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 17:50:03,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:50:10,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:50:10,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 17:50:11,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:11,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:18,355 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 17:50:21,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:27,763 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 17:50:31,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:50:33,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:50:33,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:50:34,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 17:50:35,964 INFO [train.py:1046] (1/4) Epoch 39, batch 1200, loss[loss=0.1454, simple_loss=0.2262, pruned_loss=0.0323, over 22393.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2379, pruned_loss=0.03887, over 4718075.03 frames. ], batch size: 49, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:50:38,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:50:39,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1353740.0, ans=0.125 2023-10-03 17:50:41,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:50:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:50:43,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:50:43,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:50:43,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:50:45,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:50:45,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1353740.0, ans=0.0 2023-10-03 17:50:48,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 17:50:50,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:50:50,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:54,003 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 17:50:57,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 17:51:00,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:51:02,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:51:04,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:51:04,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:51:04,394 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 17:51:05,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:51:08,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1353873.3333333333, ans=0.0 2023-10-03 17:51:12,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 17:51:12,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:51:13,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 17:51:15,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:51:18,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 17:51:23,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 17:51:23,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:51:25,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:51:25,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:51:26,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:51:28,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:51:28,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:51:28,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 17:51:28,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 17:51:29,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:51:29,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:51:29,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 17:51:32,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:51:32,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:51:35,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:51:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:51:39,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 17:51:42,378 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 17:51:43,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:51:46,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 17:51:48,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:51:49,622 INFO [train.py:1046] (1/4) Epoch 39, batch 1250, loss[loss=0.1777, simple_loss=0.2478, pruned_loss=0.0538, over 23768.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2396, pruned_loss=0.03963, over 4724877.25 frames. ], batch size: 179, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:51:49,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:51:55,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 17:51:59,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:51:59,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:00,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 17:52:03,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:52:04,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:52:09,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:52:09,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 17:52:10,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:52:10,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:52:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:52:13,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1354140.0, ans=0.125 2023-10-03 17:52:14,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1354140.0, ans=0.0 2023-10-03 17:52:16,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:52:16,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:52:16,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:52:17,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:52:17,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:52:17,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1354206.6666666667, ans=0.125 2023-10-03 17:52:21,937 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.935e+02 2.085e+02 2.357e+02 3.026e+02, threshold=4.170e+02, percent-clipped=0.0 2023-10-03 17:52:23,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:23,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:52:28,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 17:52:29,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:52:32,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:52:32,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1354273.3333333333, ans=0.125 2023-10-03 17:52:34,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 17:52:34,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:34,115 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 17:52:34,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:52:34,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:38,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:39,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:40,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:52:41,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 17:52:41,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 17:52:42,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 17:52:46,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:52:47,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 17:52:47,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:49,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 17:52:49,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:52:51,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1354340.0, ans=15.0 2023-10-03 17:52:52,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 17:52:52,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:52:52,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:52:52,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 17:52:54,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:52:55,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 17:52:59,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:53:00,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:53:01,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:53:03,235 INFO [train.py:1046] (1/4) Epoch 39, batch 1300, loss[loss=0.1535, simple_loss=0.2414, pruned_loss=0.03285, over 23996.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2391, pruned_loss=0.03943, over 4723250.80 frames. 
], batch size: 80, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:53:04,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:53:07,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:53:07,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 17:53:11,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:53:12,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:53:14,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:53:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:53:17,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:53:17,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 17:53:21,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:53:21,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 17:53:23,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 17:53:27,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:53:28,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1354473.3333333333, ans=0.05 2023-10-03 17:53:28,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.04 vs. limit=12.0 2023-10-03 17:53:29,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:53:31,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:53:32,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:53:34,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:53:34,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1354540.0, ans=0.125 2023-10-03 17:53:35,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 17:53:35,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1354540.0, ans=0.1 2023-10-03 17:53:36,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:53:36,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 17:53:40,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:53:40,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:53:42,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 17:53:42,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1354540.0, ans=0.2 2023-10-03 17:53:43,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:53:45,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:53:45,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1354606.6666666667, ans=0.125 2023-10-03 17:53:45,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1354606.6666666667, ans=0.0 2023-10-03 17:53:47,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:53:49,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 17:53:49,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:53:49,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 17:53:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:53:55,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:53:57,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:53:58,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 17:54:01,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 17:54:01,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 17:54:05,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:54:07,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 17:54:09,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 17:54:15,387 INFO [train.py:1046] (1/4) Epoch 39, batch 1350, loss[loss=0.1542, simple_loss=0.2302, pruned_loss=0.0391, over 23347.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.238, pruned_loss=0.03927, over 4705823.01 frames. ], batch size: 105, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:54:15,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 17:54:19,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:54:20,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:54:23,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:54:25,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:54:27,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:54:27,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:54:32,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:54:32,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. 
Duration: 0.79 2023-10-03 17:54:32,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1354806.6666666667, ans=0.2 2023-10-03 17:54:33,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:54:33,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:54:35,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.28 vs. limit=15.0 2023-10-03 17:54:36,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 17:54:37,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:54:39,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:54:39,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 17:54:41,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1354806.6666666667, ans=0.1 2023-10-03 17:54:42,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 17:54:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 17:54:46,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:54:46,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 17:54:46,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1354873.3333333333, ans=0.125 2023-10-03 17:54:48,895 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.867e+02 2.085e+02 2.496e+02 4.197e+02, threshold=4.170e+02, percent-clipped=1.0 2023-10-03 17:54:56,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:55:05,182 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.23 vs. limit=15.0 2023-10-03 17:55:05,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:55:05,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:05,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 17:55:07,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.96 vs. limit=15.0 2023-10-03 17:55:08,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:09,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. 
Duration: 0.83 2023-10-03 17:55:09,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:55:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:55:12,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:55:14,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 17:55:15,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:55:15,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1355006.6666666667, ans=0.2 2023-10-03 17:55:21,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 17:55:23,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 17:55:29,758 INFO [train.py:1046] (1/4) Epoch 39, batch 1400, loss[loss=0.1542, simple_loss=0.2245, pruned_loss=0.04195, over 23775.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2369, pruned_loss=0.03908, over 4707933.17 frames. ], batch size: 232, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:55:29,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 17:55:31,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 17:55:32,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1355073.3333333333, ans=0.2 2023-10-03 17:55:34,380 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:55:35,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:55:35,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:55:39,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 17:55:41,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 17:55:48,332 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:55:49,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:55:51,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:55:54,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:55:54,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:55:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 17:55:59,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 17:56:07,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.47 vs. limit=15.0 2023-10-03 17:56:07,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:09,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:10,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1355206.6666666667, ans=0.2 2023-10-03 17:56:11,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 17:56:13,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:56:14,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:56:14,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:56:15,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:56:17,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:56:17,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 17:56:18,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:56:20,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 17:56:20,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:56:24,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:25,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.50 vs. limit=15.0 2023-10-03 17:56:29,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:56:33,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1355340.0, ans=0.04949747468305833 2023-10-03 17:56:35,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 17:56:37,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:56:38,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:56:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. 
Duration: 0.4333125 2023-10-03 17:56:42,694 INFO [train.py:1046] (1/4) Epoch 39, batch 1450, loss[loss=0.1502, simple_loss=0.2257, pruned_loss=0.03741, over 23774.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03879, over 4728412.17 frames. ], batch size: 179, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:56:42,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:56:42,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:56:46,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:56:49,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:56:49,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:49,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 17:56:55,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:56:56,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1355473.3333333333, ans=0.125 2023-10-03 17:56:57,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:56:58,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 17:56:58,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 17:57:00,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:57:01,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 17:57:02,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:03,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:03,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 17:57:05,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:57:05,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:57:05,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:57:05,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:06,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:57:08,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:57:10,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:57:14,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:57:16,183 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.879e+02 2.005e+02 2.239e+02 6.029e+02, threshold=4.010e+02, percent-clipped=2.0 2023-10-03 17:57:16,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:57:17,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:19,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:19,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:57:19,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:19,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:21,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. 
Duration: 0.73 2023-10-03 17:57:26,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:57:26,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1355606.6666666667, ans=0.125 2023-10-03 17:57:29,208 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 17:57:30,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:57:32,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:57:33,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:57:35,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 17:57:38,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:38,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 17:57:39,535 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.84 vs. limit=12.0 2023-10-03 17:57:40,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 17:57:41,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 17:57:45,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:57:46,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:57:48,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 17:57:49,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 17:57:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 17:57:51,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:52,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:57:53,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-10-03 17:57:55,817 INFO [train.py:1046] (1/4) Epoch 39, batch 1500, loss[loss=0.1502, simple_loss=0.237, pruned_loss=0.03169, over 24670.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2368, pruned_loss=0.03864, over 4739831.62 frames. ], batch size: 65, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:58:02,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 17:58:02,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 17:58:02,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:58:05,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:58:05,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:58:06,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:58:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 17:58:08,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:58:09,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:58:09,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:58:11,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:58:13,575 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.32 vs. limit=22.5 2023-10-03 17:58:13,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:58:14,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 17:58:18,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:58:18,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 17:58:18,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1355806.6666666667, ans=0.0 2023-10-03 17:58:19,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:58:19,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:58:21,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:58:24,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 17:58:28,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 17:58:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:58:29,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 17:58:33,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:58:34,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 17:58:35,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:58:35,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:58:39,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 17:58:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:58:39,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:58:39,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 17:58:40,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:58:42,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1355940.0, ans=0.125 2023-10-03 17:58:46,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:58:46,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 17:58:46,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1355940.0, ans=0.0 2023-10-03 17:58:50,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 17:58:51,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:58:53,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1356006.6666666667, ans=0.1 2023-10-03 17:58:55,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 17:58:56,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:58:56,905 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 17:58:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:58:59,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:58:59,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 17:58:59,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:59:04,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 17:59:05,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,662 INFO [train.py:1046] (1/4) Epoch 39, batch 1550, loss[loss=0.1537, simple_loss=0.2385, pruned_loss=0.03439, over 24466.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.237, pruned_loss=0.03881, over 4739158.44 frames. 
], batch size: 63, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:59:10,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:59:10,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:59:10,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:12,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:59:13,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 17:59:15,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 17:59:15,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:59:15,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 17:59:16,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 17:59:16,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1356073.3333333333, ans=0.2 2023-10-03 17:59:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 17:59:19,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:59:19,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:59:20,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:21,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:23,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1356140.0, ans=0.125 2023-10-03 17:59:24,774 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 17:59:24,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:59:26,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:59:26,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:59:28,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:59:29,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. 
Duration: 0.71 2023-10-03 17:59:30,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:59:30,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 17:59:32,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 17:59:32,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 17:59:34,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:59:35,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:59:38,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:59:41,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 17:59:41,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 17:59:44,433 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.873e+02 2.058e+02 2.310e+02 4.421e+02, threshold=4.116e+02, percent-clipped=1.0 2023-10-03 17:59:47,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:59:51,042 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.74 vs. 
limit=12.0 2023-10-03 17:59:51,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:59:51,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:59:51,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:59:52,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 17:59:57,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:59:58,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:00,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:00:03,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:00:03,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:00:03,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 18:00:04,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 18:00:05,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1356273.3333333333, ans=0.1 2023-10-03 18:00:07,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:00:07,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:08,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 18:00:08,994 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 18:00:11,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:17,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 18:00:21,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:00:23,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:23,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 18:00:24,608 INFO [train.py:1046] (1/4) Epoch 39, batch 1600, loss[loss=0.1633, simple_loss=0.2481, pruned_loss=0.03922, over 24079.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2378, pruned_loss=0.03907, over 4738736.38 frames. 
], batch size: 80, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:00:24,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:00:26,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:00:26,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:00:26,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:00:27,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:00:28,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:30,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 18:00:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 18:00:31,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 18:00:32,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.06 vs. limit=15.0 2023-10-03 18:00:35,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:00:36,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 18:00:36,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:00:39,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:00:42,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:00:45,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 18:00:49,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:00:49,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 18:00:50,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:51,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 18:00:57,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 18:01:04,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:01:05,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. 
Duration: 0.56 2023-10-03 18:01:06,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1356540.0, ans=0.125 2023-10-03 18:01:06,945 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.05 vs. limit=22.5 2023-10-03 18:01:07,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:01:07,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:01:07,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:01:12,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 18:01:16,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:01:17,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:01:19,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:19,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:20,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 18:01:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:01:22,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:01:24,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:01:30,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:32,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:01:33,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 18:01:33,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:01:35,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 18:01:37,827 INFO [train.py:1046] (1/4) Epoch 39, batch 1650, loss[loss=0.1569, simple_loss=0.2356, pruned_loss=0.03906, over 24499.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2387, pruned_loss=0.03937, over 4725827.10 frames. ], batch size: 63, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:01:38,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1356740.0, ans=0.2 2023-10-03 18:01:41,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:01:42,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:01:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:01:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 18:01:42,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 18:01:42,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 18:01:42,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 18:01:47,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:49,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:01:49,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:01:49,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:01:52,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:01:52,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. 
Duration: 0.996375 2023-10-03 18:01:54,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:01:54,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:01:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:01:54,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:01:56,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 18:01:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 18:02:02,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:02:03,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:02:05,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1356806.6666666667, ans=0.125 2023-10-03 18:02:11,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 18:02:12,616 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.940e+02 2.128e+02 2.413e+02 3.924e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-03 18:02:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:02:12,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1356873.3333333333, ans=0.125 2023-10-03 18:02:13,347 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.52 vs. limit=15.0 2023-10-03 18:02:14,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 18:02:16,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=22.5 2023-10-03 18:02:18,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:20,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:02:20,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:02:22,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:22,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:02:22,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:25,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:02:25,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:27,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:02:27,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:02:28,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:02:28,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:02:31,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:02:33,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 18:02:35,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:02:35,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 18:02:36,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 18:02:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 18:02:36,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 18:02:37,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:02:37,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:38,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:38,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 18:02:43,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:44,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:02:45,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:47,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 18:02:51,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:51,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:02:51,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 18:02:52,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 18:02:52,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:02:52,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:02:53,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.13 vs. limit=15.0 2023-10-03 18:02:53,993 INFO [train.py:1046] (1/4) Epoch 39, batch 1700, loss[loss=0.1508, simple_loss=0.2283, pruned_loss=0.0366, over 24574.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2382, pruned_loss=0.03934, over 4720546.46 frames. ], batch size: 60, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:02:55,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:02:55,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:02:55,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 18:02:56,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:02:59,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1357073.3333333333, ans=0.05 2023-10-03 18:03:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:03:07,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:03:14,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:03:14,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:03:14,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:03:15,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:03:17,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 18:03:19,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:03:20,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:21,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:03:23,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:03:24,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 18:03:26,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 18:03:27,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 18:03:27,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1357206.6666666667, ans=0.2 2023-10-03 18:03:28,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 18:03:30,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:03:32,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.64 vs. limit=15.0 2023-10-03 18:03:37,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:03:39,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:03:39,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:03:40,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:03:40,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 18:03:42,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:03:42,917 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.25 vs. 
limit=15.0 2023-10-03 18:03:44,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:44,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 18:03:46,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:03:46,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:03:47,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:47,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:03:51,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:03:51,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:03:52,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:03:52,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:03:54,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:03:55,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1357340.0, ans=0.125 2023-10-03 18:03:58,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:03:59,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 18:03:59,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:02,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:04:02,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 18:04:06,629 INFO [train.py:1046] (1/4) Epoch 39, batch 1750, loss[loss=0.1652, simple_loss=0.2543, pruned_loss=0.03809, over 24445.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2363, pruned_loss=0.03908, over 4702629.24 frames. ], batch size: 69, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:04:09,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:12,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:04:12,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:04:14,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-03 18:04:14,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:04:17,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:04:17,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:20,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 18:04:24,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:04:26,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 18:04:26,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:04:26,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:04:29,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:04:31,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 18:04:32,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:04:32,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-10-03 18:04:34,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1357473.3333333333, ans=0.125 2023-10-03 18:04:39,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:04:41,282 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.897e+02 2.069e+02 2.382e+02 4.230e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 18:04:41,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1357540.0, ans=0.0 2023-10-03 18:04:44,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:04:44,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:47,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1357540.0, ans=0.0 2023-10-03 18:04:48,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:48,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:04:52,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:04:53,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:04:53,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:04:55,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 18:04:57,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:05:00,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 18:05:00,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1357606.6666666667, ans=0.125 2023-10-03 18:05:01,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:05:03,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:03,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:05:06,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:05:07,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 18:05:07,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:05:10,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:05:13,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:16,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:05:18,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:05:18,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 18:05:18,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:05:21,285 INFO [train.py:1046] (1/4) Epoch 39, batch 1800, loss[loss=0.1377, simple_loss=0.2163, pruned_loss=0.02953, over 24548.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2361, pruned_loss=0.03891, over 4709712.50 frames. ], batch size: 60, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:05:21,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:05:21,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:21,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:05:21,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 18:05:21,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:05:26,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:05:27,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:05:28,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:05:31,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:05:33,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:05:34,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:05:37,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:05:38,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:40,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:40,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:05:41,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 18:05:42,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 18:05:43,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:05:48,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:05:50,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 18:05:53,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 18:05:53,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 18:05:53,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:05:55,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:55,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:57,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:06:04,134 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 18:06:05,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 18:06:06,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:07,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 18:06:08,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 18:06:08,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:06:09,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:06:11,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:06:11,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1357940.0, ans=0.0 2023-10-03 18:06:11,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1357940.0, ans=0.0 2023-10-03 18:06:15,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 18:06:19,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1358006.6666666667, ans=0.1 2023-10-03 18:06:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:06:24,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. 
Duration: 0.5 2023-10-03 18:06:24,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:06:24,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:06:25,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:06:25,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 18:06:27,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:06:28,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:06:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 18:06:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:06:34,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:06:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:06:34,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:35,304 INFO [train.py:1046] (1/4) Epoch 39, batch 1850, loss[loss=0.1646, simple_loss=0.2315, pruned_loss=0.04879, over 23847.00 frames. 
], tot_loss[loss=0.1573, simple_loss=0.2367, pruned_loss=0.03894, over 4715319.38 frames. ], batch size: 212, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:06:36,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:38,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:06:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:06:40,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:06:43,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:06:43,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:06:50,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:06:51,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 18:06:54,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 18:06:57,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 18:07:00,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:07:00,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 18:07:00,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 18:07:08,965 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.977e+02 2.198e+02 2.562e+02 3.885e+02, threshold=4.397e+02, percent-clipped=0.0 2023-10-03 18:07:11,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:07:13,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 18:07:13,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1358206.6666666667, ans=0.1 2023-10-03 18:07:16,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:07:16,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:07:19,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 18:07:20,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:20,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:07:23,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 18:07:24,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:07:25,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:07:28,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:07:30,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:30,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:07:30,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:07:30,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1358273.3333333333, ans=0.125 2023-10-03 18:07:31,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:07:33,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:07:35,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.54 vs. limit=15.0 2023-10-03 18:07:36,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 18:07:37,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:07:40,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:07:40,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:07:40,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 18:07:40,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 18:07:43,176 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 18:07:43,257 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 18:07:44,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:07:44,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:07:44,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:07:44,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:46,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 18:07:46,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 18:07:46,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1358340.0, ans=0.125 2023-10-03 18:07:47,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:47,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:07:48,762 INFO [train.py:1046] (1/4) Epoch 39, batch 1900, loss[loss=0.1581, simple_loss=0.2428, pruned_loss=0.03669, over 24085.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2371, pruned_loss=0.03883, over 4716319.58 frames. ], batch size: 80, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:07:48,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:07:50,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:07:51,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 18:07:55,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:55,279 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 18:07:55,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:07:56,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:08:01,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.49 vs. limit=15.0 2023-10-03 18:08:03,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:08:06,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:08:07,392 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 18:08:08,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 18:08:10,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:08:10,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:08:10,266 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 18:08:10,291 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 18:08:11,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1358473.3333333333, ans=0.125 2023-10-03 18:08:13,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 18:08:14,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:08:17,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 18:08:18,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 18:08:28,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 18:08:30,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 18:08:30,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:08:32,581 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 18:08:32,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 18:08:33,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 18:08:35,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 18:08:35,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:08:39,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 18:08:41,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 18:08:44,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:08:44,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 18:08:46,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:08:48,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 18:08:49,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:08:55,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:08:55,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:08:55,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:08:56,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:08:58,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:08:59,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-10-03 18:09:00,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.02 vs. limit=15.0 2023-10-03 18:09:00,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:09:02,734 INFO [train.py:1046] (1/4) Epoch 39, batch 1950, loss[loss=0.1406, simple_loss=0.2181, pruned_loss=0.03159, over 24412.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2377, pruned_loss=0.03873, over 4717593.38 frames. ], batch size: 58, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:09:04,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:09:04,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:09:06,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:09:06,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:09:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:09:07,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:09:11,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:09:13,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 18:09:14,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:14,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:09:16,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 18:09:16,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 18:09:17,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:18,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:20,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:09:21,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:09:21,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:23,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:09:26,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:09:26,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 18:09:26,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:09:28,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:30,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.56 vs. limit=10.0 2023-10-03 18:09:30,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:34,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:09:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:09:34,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:09:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 18:09:35,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:09:35,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:09:35,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:09:37,504 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 1.953e+02 2.190e+02 2.399e+02 3.415e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-03 18:09:39,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:41,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:09:45,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:09:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:09:48,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:09:48,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 18:09:49,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:09:53,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:09:54,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:09:55,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 18:10:01,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1359006.6666666667, ans=0.125 2023-10-03 18:10:04,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:05,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:08,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:10,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:10:12,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:10:12,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:10:14,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 18:10:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:10:15,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:10:15,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 18:10:17,166 INFO [train.py:1046] (1/4) Epoch 39, batch 2000, loss[loss=0.1639, simple_loss=0.2334, pruned_loss=0.04718, over 23367.00 frames. 
], tot_loss[loss=0.1577, simple_loss=0.2374, pruned_loss=0.03899, over 4714262.20 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 32.0 2023-10-03 18:10:17,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:10:17,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1359073.3333333333, ans=0.95 2023-10-03 18:10:19,366 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.23 vs. limit=15.0 2023-10-03 18:10:20,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:10:20,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:10:21,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:10:22,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:10:24,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:28,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 18:10:28,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 18:10:31,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1359140.0, ans=0.125 2023-10-03 18:10:32,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:10:33,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 18:10:34,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1359140.0, ans=0.1 2023-10-03 18:10:35,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:10:35,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:10:38,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:10:39,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 18:10:41,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:43,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:43,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:45,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. 
Duration: 0.83 2023-10-03 18:10:46,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:10:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 18:10:48,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:10:48,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1359206.6666666667, ans=0.125 2023-10-03 18:10:50,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:10:51,577 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.52 vs. limit=10.0 2023-10-03 18:10:52,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 18:10:52,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:52,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:10:53,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:10:53,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 18:10:58,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. 
Duration: 0.97 2023-10-03 18:10:58,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:10:58,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:01,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:03,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:11:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:11:03,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1359273.3333333333, ans=0.0 2023-10-03 18:11:04,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:11:06,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:11:08,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:08,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:11:08,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:08,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:11:10,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:11:11,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1359273.3333333333, ans=0.125 2023-10-03 18:11:12,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 18:11:15,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:11:16,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:20,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:20,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:11:25,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:25,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1359340.0, ans=0.125 2023-10-03 18:11:26,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:11:26,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:28,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 18:11:28,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:11:29,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:29,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:30,758 INFO [train.py:1046] (1/4) Epoch 39, batch 2050, loss[loss=0.1525, simple_loss=0.2196, pruned_loss=0.04269, over 23804.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2368, pruned_loss=0.03893, over 4705569.54 frames. ], batch size: 164, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:11:34,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:11:35,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:41,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:11:43,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:11:44,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:44,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:11:45,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-10-03 18:11:45,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:11:48,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:48,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:11:57,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:11:57,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:58,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 18:11:59,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:12:01,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 18:12:01,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:12:06,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:12:07,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:12:08,629 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.929e+02 2.107e+02 2.301e+02 3.275e+02, threshold=4.215e+02, percent-clipped=0.0 2023-10-03 18:12:08,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:12:10,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:12:12,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:12:12,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:12:12,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:12:16,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:18,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:12:21,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:12:22,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:12:24,532 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.44 vs. 
limit=10.0 2023-10-03 18:12:25,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:12:28,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1359673.3333333333, ans=0.125 2023-10-03 18:12:31,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:12:32,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 18:12:37,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:12:39,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:12:39,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1359673.3333333333, ans=0.1 2023-10-03 18:12:41,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:12:42,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 18:12:45,288 INFO [train.py:1046] (1/4) Epoch 39, batch 2100, loss[loss=0.1601, simple_loss=0.2352, pruned_loss=0.04247, over 19011.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2357, pruned_loss=0.0386, over 4704450.30 frames. ], batch size: 41, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:12:46,595 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. 
Duration: 0.952 2023-10-03 18:12:46,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:12:46,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:46,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:12:48,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:12:48,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 18:12:48,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 18:12:50,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:12:53,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:12:54,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:12:57,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:12:57,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:12:57,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. 
Duration: 0.85 2023-10-03 18:12:58,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:12:59,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 18:12:59,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 18:13:01,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:01,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1359806.6666666667, ans=0.125 2023-10-03 18:13:02,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:13:02,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 18:13:02,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 18:13:09,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 18:13:09,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:13:11,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:13:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:13:14,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:13:15,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 18:13:15,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:15,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1359873.3333333333, ans=0.0 2023-10-03 18:13:16,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 18:13:18,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 18:13:19,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:19,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 18:13:19,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 18:13:20,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 18:13:22,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:13:23,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 18:13:26,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:13:28,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:13:29,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:31,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:31,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 18:13:31,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:31,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:33,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:33,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 18:13:34,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 18:13:35,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 18:13:41,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 18:13:41,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1359940.0, ans=0.125 2023-10-03 18:13:46,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:13:46,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 18:13:48,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.91 vs. limit=22.5 2023-10-03 18:13:49,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1360006.6666666667, ans=0.2 2023-10-03 18:13:52,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:53,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:13:54,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:13:54,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:13:54,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 18:13:54,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 18:13:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:57,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:13:59,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:13:59,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:00,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 18:14:02,217 INFO [train.py:1046] (1/4) Epoch 39, batch 2150, loss[loss=0.1504, simple_loss=0.2296, pruned_loss=0.03562, over 23568.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2346, pruned_loss=0.03837, over 4701594.55 frames. ], batch size: 135, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:14:02,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 18:14:02,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:05,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:05,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:14:05,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 18:14:06,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:14:10,688 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.23 vs. limit=15.0 2023-10-03 18:14:11,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 18:14:13,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:13,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:16,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:14:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:16,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:14:18,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:19,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:14:19,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 18:14:22,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1360140.0, ans=0.0 2023-10-03 18:14:23,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:23,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 18:14:28,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:29,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:14:31,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:31,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:31,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:32,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:14:32,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:32,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 18:14:33,068 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.64 vs. limit=22.5 2023-10-03 18:14:33,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:14:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 18:14:35,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:14:37,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:37,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:39,433 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.915e+02 2.126e+02 2.508e+02 4.642e+02, threshold=4.251e+02, percent-clipped=1.0 2023-10-03 18:14:39,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:14:42,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:14:44,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:46,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 18:14:47,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:47,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 18:14:47,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:14:50,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:50,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:53,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:53,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:14:53,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:54,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:54,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 18:14:56,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 18:14:56,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 18:14:57,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 18:14:57,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:58,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:14:58,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 18:14:58,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:14:58,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 18:15:00,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 18:15:00,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 18:15:00,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 18:15:02,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:15:02,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 18:15:03,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:04,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:15:06,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:06,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:14,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1360340.0, ans=0.125 2023-10-03 18:15:15,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:15:15,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 18:15:17,332 INFO [train.py:1046] (1/4) Epoch 39, batch 2200, loss[loss=0.1515, simple_loss=0.2398, pruned_loss=0.03162, over 24448.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2349, pruned_loss=0.03831, over 4704348.72 frames. ], batch size: 66, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:15:20,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:15:24,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:25,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 18:15:26,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:15:27,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:15:29,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:29,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:15:29,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 18:15:30,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1360473.3333333333, ans=0.05 2023-10-03 18:15:33,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 18:15:34,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:15:40,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 18:15:43,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:45,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:15:47,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 18:15:50,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:15:50,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 18:15:54,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:15:54,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:54,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1360540.0, ans=0.2 2023-10-03 18:15:56,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 18:15:57,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1360540.0, ans=0.04949747468305833 2023-10-03 18:15:58,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:16:00,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:02,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:16:03,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:04,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-10-03 18:16:06,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:06,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 18:16:08,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:08,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:16:08,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:10,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:16:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:12,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:12,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:12,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:16:14,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:16:15,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 18:16:20,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:16:20,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:16:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:16:22,859 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 18:16:24,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:16:25,645 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 18:16:25,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1360673.3333333333, ans=0.0 2023-10-03 18:16:26,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:16:27,000 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 18:16:29,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:16:29,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:16:31,496 INFO [train.py:1046] (1/4) Epoch 39, batch 2250, loss[loss=0.1657, simple_loss=0.2397, pruned_loss=0.04583, over 22803.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2353, pruned_loss=0.03847, over 4700922.82 frames. 
], batch size: 322, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:16:31,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:16:32,989 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 18:16:33,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1360740.0, ans=0.07 2023-10-03 18:16:34,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:16:35,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:16:36,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1360740.0, ans=0.0 2023-10-03 18:16:42,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:16:44,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:16:47,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:16:48,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:16:50,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:16:53,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. 
Duration: 0.93 2023-10-03 18:16:53,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:54,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:16:57,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 18:16:57,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:57,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:16:57,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1360806.6666666667, ans=0.2 2023-10-03 18:16:58,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:17:01,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.43 vs. limit=15.0 2023-10-03 18:17:03,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1360873.3333333333, ans=0.125 2023-10-03 18:17:06,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:17:07,582 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.878e+02 2.042e+02 2.352e+02 3.262e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-03 18:17:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:17:07,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:17:09,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 18:17:10,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.86 vs. limit=15.0 2023-10-03 18:17:10,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:17:11,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:17:16,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:17:18,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:17:19,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:17:19,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 18:17:22,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:17:24,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:17:27,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:17:29,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:17:34,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:17:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:17:35,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:17:37,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1361006.6666666667, ans=0.2 2023-10-03 18:17:39,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:17:41,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:17:41,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 18:17:41,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:17:43,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:17:44,279 INFO [train.py:1046] (1/4) Epoch 39, batch 2300, loss[loss=0.1505, simple_loss=0.2318, pruned_loss=0.03464, over 23572.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2363, pruned_loss=0.03852, over 4712682.85 frames. ], batch size: 149, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:17:44,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 18:17:44,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1361073.3333333333, ans=0.07 2023-10-03 18:17:46,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1361073.3333333333, ans=0.125 2023-10-03 18:17:47,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:17:49,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:55,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:55,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:17:57,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1361073.3333333333, ans=0.0 2023-10-03 18:17:58,206 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-10-03 18:17:59,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:05,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:18:06,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:18:06,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:06,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:06,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 18:18:08,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:18:09,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:18:09,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 18:18:12,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1361206.6666666667, ans=0.0 2023-10-03 18:18:12,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1361206.6666666667, ans=0.0 2023-10-03 18:18:15,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:18:18,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:18:22,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:18:26,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:18:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:30,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:18:30,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:18:35,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:18:36,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 18:18:36,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:18:36,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 18:18:40,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:18:40,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:42,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:18:42,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:18:42,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:18:42,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 18:18:42,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:18:43,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 18:18:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:18:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:18:43,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 18:18:49,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:18:52,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:18:53,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1361340.0, ans=0.0 2023-10-03 18:18:56,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:18:56,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:18:57,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:18:59,060 INFO [train.py:1046] (1/4) Epoch 39, batch 2350, loss[loss=0.1623, simple_loss=0.2552, pruned_loss=0.03468, over 24536.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2374, pruned_loss=0.03908, over 4702490.56 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:19:00,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:19:00,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:19:00,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 18:19:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 18:19:04,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:19:04,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 18:19:05,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.39 vs. limit=15.0 2023-10-03 18:19:10,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 18:19:12,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:19:15,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:15,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:16,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:19:16,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:19:18,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 18:19:20,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 18:19:26,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 18:19:27,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:19:31,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:19:31,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:19:33,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1361540.0, ans=0.2 2023-10-03 18:19:34,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:19:35,813 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.969e+02 2.119e+02 2.541e+02 4.388e+02, threshold=4.238e+02, percent-clipped=2.0 2023-10-03 18:19:35,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 18:19:37,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:19:37,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:19:39,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:19:39,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:19:42,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:19:43,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 18:19:43,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:19:45,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1361606.6666666667, ans=0.0 2023-10-03 18:19:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:46,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:19:49,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 18:19:49,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:19:51,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1361606.6666666667, ans=0.0 2023-10-03 18:19:54,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 18:19:54,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:19:58,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-10-03 18:20:01,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 18:20:02,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:20:02,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:20:02,770 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 18:20:02,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 18:20:05,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 18:20:07,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:20:10,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1361673.3333333333, ans=0.125 2023-10-03 18:20:11,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:20:13,066 INFO [train.py:1046] (1/4) Epoch 39, batch 2400, loss[loss=0.1314, simple_loss=0.186, pruned_loss=0.03838, over 19601.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.237, pruned_loss=0.03869, over 4700754.52 frames. ], batch size: 388, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:20:14,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 18:20:16,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:20:17,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 18:20:18,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 18:20:22,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:20:22,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:20:26,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 18:20:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:20:29,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:30,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 18:20:32,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1361806.6666666667, ans=0.1 2023-10-03 18:20:33,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:37,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-03 18:20:42,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:20:42,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1361873.3333333333, ans=0.1 2023-10-03 18:20:46,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 18:20:48,091 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=22.5 2023-10-03 18:20:48,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:20:49,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:56,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:20:57,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 18:20:58,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:21:01,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:21:02,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1361940.0, ans=0.1 2023-10-03 18:21:03,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:21:05,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:06,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:21:06,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:21:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:21:06,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:07,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:21:07,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:21:10,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1362006.6666666667, ans=0.125 2023-10-03 18:21:13,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:21:13,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:21:13,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 18:21:13,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1362006.6666666667, ans=0.0 2023-10-03 18:21:14,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 18:21:16,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:21:16,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:18,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 18:21:18,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 18:21:18,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 18:21:18,217 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 18:21:18,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1362006.6666666667, ans=0.2 2023-10-03 18:21:19,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. 
Duration: 0.57 2023-10-03 18:21:21,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:21:21,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:21:21,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:21:22,846 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 18:21:24,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:21:25,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:21:27,226 INFO [train.py:1046] (1/4) Epoch 39, batch 2450, loss[loss=0.1723, simple_loss=0.2614, pruned_loss=0.04155, over 24536.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2359, pruned_loss=0.03843, over 4693083.81 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:21:29,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1362073.3333333333, ans=0.0 2023-10-03 18:21:30,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:21:30,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:21:30,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1362073.3333333333, ans=0.125 2023-10-03 18:21:30,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1362073.3333333333, ans=0.1 2023-10-03 18:21:34,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:34,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:21:35,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 18:21:41,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:21:41,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:44,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:21:44,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:21:44,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 18:21:44,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1362140.0, ans=0.1 2023-10-03 18:21:45,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 18:21:50,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:53,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:21:53,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:21:58,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:21:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:01,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:22:03,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 18:22:03,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 18:22:04,682 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.889e+02 2.081e+02 2.399e+02 4.481e+02, threshold=4.162e+02, percent-clipped=1.0 2023-10-03 18:22:10,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:12,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:22:12,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:12,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:22:12,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:13,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:22:14,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 18:22:17,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:17,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:22:20,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:22:20,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:24,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:22:25,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 18:22:25,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:22:29,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:22:29,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 18:22:29,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:22:30,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:22:34,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:22:36,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:37,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 18:22:39,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1362406.6666666667, ans=0.0 2023-10-03 18:22:39,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1362406.6666666667, ans=0.04949747468305833 2023-10-03 18:22:40,305 INFO [train.py:1046] (1/4) Epoch 39, batch 2500, loss[loss=0.1542, simple_loss=0.2368, pruned_loss=0.0358, over 24303.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2356, pruned_loss=0.03831, over 4703081.82 frames. ], batch size: 61, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:22:40,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 18:22:41,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:22:48,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:22:49,166 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.07 vs. limit=12.0 2023-10-03 18:22:55,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:22:55,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:57,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:22:57,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-10-03 18:23:04,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:23:05,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:23:05,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:23:05,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:23:07,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 18:23:08,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:08,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:23:09,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 18:23:09,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:09,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 18:23:09,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:23:11,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1362540.0, ans=0.125 2023-10-03 18:23:14,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:23:14,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:23:16,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:23:18,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 18:23:18,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:23:20,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:23,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:27,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:27,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1362606.6666666667, ans=15.0 2023-10-03 18:23:30,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 18:23:35,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:23:36,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 18:23:36,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:23:36,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:23:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:23:38,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:23:38,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1362673.3333333333, ans=0.2 2023-10-03 18:23:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 18:23:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 18:23:41,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 18:23:42,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1362673.3333333333, ans=0.0 2023-10-03 18:23:43,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:23:44,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1362673.3333333333, ans=0.1 2023-10-03 18:23:45,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 18:23:45,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 18:23:46,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:23:46,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 18:23:50,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 18:23:55,079 INFO [train.py:1046] (1/4) Epoch 39, batch 2550, loss[loss=0.1643, simple_loss=0.248, pruned_loss=0.04026, over 24552.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2365, pruned_loss=0.03864, over 4701351.64 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:23:55,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:23:56,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:23:56,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:23:58,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 18:23:59,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 18:23:59,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:24:04,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 18:24:05,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:24:08,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:09,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:24:09,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 18:24:10,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:24:11,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:24:11,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:24:15,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:24:15,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. 
Duration: 0.94 2023-10-03 18:24:15,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:24:15,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:15,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 18:24:24,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1362873.3333333333, ans=0.125 2023-10-03 18:24:27,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:24:33,652 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.933e+02 2.149e+02 2.358e+02 3.387e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-03 18:24:33,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:24:33,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:33,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:24:35,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 18:24:38,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1362940.0, ans=0.125 2023-10-03 18:24:42,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:24:45,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:24:46,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:24:46,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:24:46,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:24:46,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:24:50,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:24:50,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:55,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:24:55,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-03 18:24:55,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:24:56,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:58,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:24:59,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:25:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:06,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:25:08,879 INFO [train.py:1046] (1/4) Epoch 39, batch 2600, loss[loss=0.1602, simple_loss=0.2401, pruned_loss=0.0402, over 23324.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2367, pruned_loss=0.03841, over 4723980.18 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:25:08,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:09,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1363073.3333333333, ans=0.1 2023-10-03 18:25:10,490 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 18:25:13,323 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. 
Duration: 0.896 2023-10-03 18:25:14,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:25:14,540 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 18:25:14,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 18:25:14,625 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 18:25:17,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:25:17,880 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 18:25:19,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 18:25:19,375 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 18:25:22,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:25:25,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 18:25:26,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 18:25:27,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:25:28,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-10-03 18:25:29,419 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 18:25:29,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 18:25:40,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:25:40,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:40,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:25:40,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 18:25:42,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:25:45,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.87 vs. limit=22.5 2023-10-03 18:25:47,799 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 18:25:51,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1363206.6666666667, ans=0.125 2023-10-03 18:25:52,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1363273.3333333333, ans=0.0 2023-10-03 18:25:53,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:25:53,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:25:53,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 18:25:55,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:25:55,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:25:55,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 18:25:57,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:25:57,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:25:58,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:02,941 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 18:26:02,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:02,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:26:09,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:26:10,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:26:11,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 18:26:12,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.52 vs. limit=15.0 2023-10-03 18:26:13,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:26:14,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:26:15,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:26:22,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 18:26:23,434 INFO [train.py:1046] (1/4) Epoch 39, batch 2650, loss[loss=0.1641, simple_loss=0.2554, pruned_loss=0.03637, over 24639.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2376, pruned_loss=0.03881, over 4722927.25 frames. ], batch size: 68, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:26:23,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:23,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:26:28,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. 
Duration: 0.84 2023-10-03 18:26:28,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:29,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:26:31,016 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 18:26:31,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:26:32,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:32,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:26:34,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:26:35,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1363406.6666666667, ans=0.125 2023-10-03 18:26:36,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1363406.6666666667, ans=0.1 2023-10-03 18:26:37,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:38,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 18:26:38,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 18:26:39,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:26:40,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 18:26:41,992 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 18:26:42,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.15 vs. limit=15.0 2023-10-03 18:26:44,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:26:49,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 18:26:49,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:26:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 18:26:52,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:26:52,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:26:52,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:26:53,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:26:55,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 18:26:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 18:26:58,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:26:58,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1363540.0, ans=0.0 2023-10-03 18:27:01,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 18:27:01,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:27:02,907 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 2.015e+02 2.271e+02 2.609e+02 3.479e+02, threshold=4.541e+02, percent-clipped=0.0 2023-10-03 18:27:03,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:04,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:27:04,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:27:06,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:27:07,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:27:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:27:10,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:27:11,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:27:13,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:27:14,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:15,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:27:16,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:16,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1363606.6666666667, ans=0.0 2023-10-03 18:27:17,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:27:19,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:27:20,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:22,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 18:27:22,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:22,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 18:27:24,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.53 vs. limit=15.0 2023-10-03 18:27:26,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:27:28,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:28,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.48 vs. limit=15.0 2023-10-03 18:27:29,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:29,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1363673.3333333333, ans=0.125 2023-10-03 18:27:31,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:32,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:27:32,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:27:36,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:27:36,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 18:27:38,192 INFO [train.py:1046] (1/4) Epoch 39, batch 2700, loss[loss=0.1474, simple_loss=0.2255, pruned_loss=0.0346, over 20206.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2381, pruned_loss=0.03931, over 4706208.10 frames. ], batch size: 44, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:27:39,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:27:41,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 18:27:43,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:27:43,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:44,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:45,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:27:45,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:46,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 18:27:46,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:27:47,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 18:27:48,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:27:49,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:27:50,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:27:50,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:54,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:27:54,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 18:27:56,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:28:00,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:28:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:28:07,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1363873.3333333333, ans=0.0 2023-10-03 18:28:08,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:28:08,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:28:09,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:28:09,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:28:12,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:15,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:28:15,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:28:15,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:28:17,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1363873.3333333333, ans=0.2 2023-10-03 18:28:19,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:28:19,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:28:19,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1363873.3333333333, ans=0.0 2023-10-03 18:28:19,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1363873.3333333333, ans=0.1 2023-10-03 18:28:22,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1363940.0, ans=0.125 2023-10-03 18:28:26,120 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:28:27,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.33 vs. limit=15.0 2023-10-03 18:28:28,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:28:28,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:28:33,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:28:33,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:36,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:28:37,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:38,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:28:39,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:28:40,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:40,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:28:43,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:28:44,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:44,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:48,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 18:28:49,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:52,251 INFO [train.py:1046] (1/4) Epoch 39, batch 2750, loss[loss=0.1749, simple_loss=0.2452, pruned_loss=0.05236, over 23729.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2369, pruned_loss=0.03898, over 4712515.49 frames. 
], batch size: 179, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:28:52,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:28:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 18:28:55,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 18:28:55,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:57,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:28:59,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:29:01,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:01,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:29:01,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:02,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1364073.3333333333, ans=0.125 2023-10-03 18:29:05,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:29:07,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:29:07,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:29:07,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:07,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 18:29:07,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:29:07,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:29:13,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 18:29:14,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:29:14,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:14,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:29:16,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:29:16,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:29:16,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1364140.0, ans=10.0 2023-10-03 18:29:19,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:29:19,649 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:29:20,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:20,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:22,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:29:22,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:29:22,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:29:23,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:23,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 18:29:28,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1364206.6666666667, ans=0.2 2023-10-03 18:29:28,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.46 vs. limit=22.5 2023-10-03 18:29:31,659 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.877e+02 2.030e+02 2.207e+02 3.015e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-03 18:29:31,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:33,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:29:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:37,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:29:37,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1364273.3333333333, ans=0.2 2023-10-03 18:29:39,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:29:45,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 18:29:45,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:29:45,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 18:29:47,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1364273.3333333333, ans=0.0 2023-10-03 18:29:49,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1364273.3333333333, ans=0.2 2023-10-03 18:29:50,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:52,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 18:29:52,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1364340.0, ans=0.1 2023-10-03 18:29:56,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1364340.0, ans=0.0 2023-10-03 18:29:57,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:29:59,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:29:59,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 18:29:59,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 18:30:01,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:30:02,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1364340.0, ans=15.0 2023-10-03 18:30:02,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 18:30:02,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:30:05,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 18:30:05,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:06,861 INFO [train.py:1046] (1/4) Epoch 39, batch 2800, loss[loss=0.1399, simple_loss=0.1916, pruned_loss=0.04411, over 19222.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2356, pruned_loss=0.03883, over 4706352.49 frames. ], batch size: 388, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:30:06,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:08,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 18:30:08,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:08,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:30:11,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:11,563 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 18:30:11,563 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 18:30:14,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:17,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:30:17,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:30:20,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:30:21,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 18:30:23,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 18:30:24,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 18:30:25,155 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.41 vs. limit=15.0 2023-10-03 18:30:25,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:30:25,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:30:25,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:30:30,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:30:30,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:30,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:30:32,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:30:38,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.80 vs. limit=22.5 2023-10-03 18:30:39,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:30:41,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:42,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:44,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 18:30:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:30:51,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:30:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 18:30:51,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:52,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:30:52,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:30:56,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:58,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:01,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1364606.6666666667, ans=0.07 2023-10-03 18:31:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:31:04,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:31:04,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:31:04,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:31:05,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:31:05,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:31:06,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:31:06,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 18:31:07,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:07,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:31:07,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:09,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 18:31:09,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1364673.3333333333, ans=0.1 2023-10-03 18:31:11,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:31:11,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:31:12,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:31:12,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 18:31:14,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1364673.3333333333, ans=0.125 2023-10-03 18:31:18,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:31:18,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:31:20,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:31:21,329 INFO [train.py:1046] (1/4) Epoch 39, batch 2850, loss[loss=0.1465, simple_loss=0.2326, pruned_loss=0.03021, over 24480.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2357, pruned_loss=0.03881, over 4702078.13 frames. ], batch size: 66, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:31:21,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:31:25,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:31:25,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:31:26,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:31:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:31:31,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:32,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:31:34,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 18:31:41,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 18:31:41,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:31:43,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 18:31:43,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:46,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 18:31:46,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 18:31:47,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:31:47,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1364806.6666666667, ans=0.0 2023-10-03 18:31:52,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1364873.3333333333, ans=0.125 2023-10-03 18:31:53,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1364873.3333333333, ans=0.0 2023-10-03 18:31:55,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1364873.3333333333, ans=0.125 2023-10-03 18:31:59,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:00,732 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.914e+02 2.254e+02 2.782e+02 3.876e+02, threshold=4.507e+02, percent-clipped=0.0 2023-10-03 18:32:00,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:32:00,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:32:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:32:02,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:32:03,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 18:32:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:32:05,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 18:32:06,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:32:06,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:32:08,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:08,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:09,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:10,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:11,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:12,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:32:15,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:32:15,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:32:16,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:18,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:32:19,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1365006.6666666667, ans=0.0 2023-10-03 18:32:24,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:32:25,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 18:32:27,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 18:32:28,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:32:28,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:32:28,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 18:32:29,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:32:29,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:32:29,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:32:31,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:32:31,349 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 18:32:31,389 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 18:32:31,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:32:33,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:34,615 INFO [train.py:1046] (1/4) Epoch 39, batch 2900, loss[loss=0.1574, simple_loss=0.237, pruned_loss=0.03892, over 23677.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2365, pruned_loss=0.03858, over 4714317.59 frames. ], batch size: 232, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:32:39,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:32:39,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:32:40,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:32:40,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 18:32:44,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:32:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 18:32:44,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 18:32:47,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:32:47,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:32:48,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1365140.0, ans=0.125 2023-10-03 18:32:49,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:51,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:54,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:32:55,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:58,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:32:58,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-03 18:32:58,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1365140.0, ans=0.2 2023-10-03 18:32:59,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:33:01,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:04,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 18:33:04,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 18:33:08,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:33:08,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 18:33:08,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:33:11,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:33:11,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:33:13,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:33:13,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1365206.6666666667, ans=0.125 2023-10-03 18:33:14,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:16,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:33:20,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:22,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 18:33:22,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 18:33:22,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:33:25,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:33:28,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 18:33:28,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:33:30,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1365273.3333333333, ans=0.0 2023-10-03 18:33:34,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:33:41,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:33:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:33:43,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 18:33:43,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1365340.0, ans=0.2 2023-10-03 18:33:46,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:46,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 18:33:47,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:33:47,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:33:49,294 INFO [train.py:1046] (1/4) Epoch 39, batch 2950, loss[loss=0.1612, simple_loss=0.2452, pruned_loss=0.03857, over 23198.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2375, pruned_loss=0.03922, over 4709373.69 frames. ], batch size: 105, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:33:52,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:33:53,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. 
Duration: 0.71 2023-10-03 18:33:53,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1365406.6666666667, ans=0.125 2023-10-03 18:33:55,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:33:55,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:55,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:33:56,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:33:59,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 18:33:59,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 18:34:00,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:34:00,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:34:05,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:34:07,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:34:10,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:34:10,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:34:14,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:34:14,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:34:15,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:34:16,591 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=22.5 2023-10-03 18:34:17,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:34:17,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:34:18,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 18:34:24,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 18:34:24,179 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 18:34:25,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:34:27,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-10-03 18:34:28,556 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.923e+02 2.142e+02 2.477e+02 3.548e+02, threshold=4.284e+02, percent-clipped=0.0 2023-10-03 18:34:28,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 18:34:28,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:34:28,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:34:28,709 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 18:34:28,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:34:31,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 18:34:32,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:34:32,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:34:35,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:34:36,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:34:37,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:34:37,561 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 18:34:37,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1365606.6666666667, ans=0.125 2023-10-03 18:34:38,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:34:38,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 18:34:43,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:45,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:34:46,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 18:34:46,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:34:47,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=15.0 2023-10-03 18:34:47,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 18:34:51,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:34:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:34:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:34:54,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:54,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:34:55,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:34:56,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1365673.3333333333, ans=0.125 2023-10-03 18:34:57,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:34:57,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:34:57,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:34:58,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:34:59,648 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.74 vs. limit=15.0 2023-10-03 18:35:00,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 18:35:01,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:35:02,760 INFO [train.py:1046] (1/4) Epoch 39, batch 3000, loss[loss=0.1535, simple_loss=0.2497, pruned_loss=0.02861, over 24421.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2388, pruned_loss=0.03955, over 4710580.27 frames. ], batch size: 69, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:35:02,760 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 18:35:14,732 INFO [train.py:1078] (1/4) Epoch 39, validation: loss=0.3532, simple_loss=0.2838, pruned_loss=0.2113, over 1125622.00 frames. 2023-10-03 18:35:14,733 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 18:35:14,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 18:35:15,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1365740.0, ans=0.125 2023-10-03 18:35:16,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:35:19,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:35:19,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:35:23,599 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 18:35:23,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. 
Duration: 0.98 2023-10-03 18:35:25,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:35:27,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:35:27,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 18:35:27,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:35:30,575 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-03 18:35:32,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:35:44,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:35:50,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 18:35:52,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:35:54,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:35:54,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:35:54,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:35:58,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:35:58,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 18:35:59,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 18:36:00,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:36:00,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:36:02,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:36:02,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:36:03,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:36:06,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:36:06,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:36:06,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 18:36:08,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:36:11,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 18:36:12,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:36:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:12,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:36:17,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:17,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:20,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 18:36:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 18:36:20,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:36:20,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 18:36:21,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 18:36:23,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 18:36:24,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:36:26,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:36:26,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 18:36:28,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 18:36:28,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:36:29,436 INFO [train.py:1046] (1/4) Epoch 39, batch 3050, loss[loss=0.1677, simple_loss=0.2508, pruned_loss=0.0423, over 23655.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2393, pruned_loss=0.03962, over 4718541.38 frames. ], batch size: 85, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:36:29,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:36:30,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:30,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:36:30,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:36:32,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:36:32,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1366073.3333333333, ans=0.125 2023-10-03 18:36:33,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 18:36:35,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:36:38,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:36:38,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:36:39,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1366073.3333333333, ans=0.0 2023-10-03 18:36:41,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:43,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 18:36:48,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 18:36:48,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 18:36:50,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:36:52,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:36:55,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:57,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:36:57,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:00,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:37:01,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:37:01,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:01,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:37:01,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:03,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:37:04,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:07,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:37:07,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 18:37:08,979 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.946e+02 2.164e+02 2.475e+02 3.663e+02, threshold=4.327e+02, percent-clipped=0.0 2023-10-03 18:37:09,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:37:09,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:37:09,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1366206.6666666667, ans=0.2 2023-10-03 18:37:11,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:37:11,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:37:13,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:37:13,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:20,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:20,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 18:37:24,347 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.66 vs. limit=15.0 2023-10-03 18:37:26,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:37:28,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:29,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:37:30,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.13 vs. limit=15.0 2023-10-03 18:37:31,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:37:31,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:37:31,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 18:37:33,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:37:33,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:37:34,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1366340.0, ans=0.2 2023-10-03 18:37:35,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 18:37:36,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:42,850 INFO [train.py:1046] (1/4) Epoch 39, batch 3100, loss[loss=0.1546, simple_loss=0.2406, pruned_loss=0.03426, over 24568.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2388, pruned_loss=0.03959, over 4727925.41 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:37:42,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:44,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:37:44,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1366406.6666666667, ans=0.0 2023-10-03 18:37:45,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:37:47,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 18:37:49,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 18:37:51,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. 
Duration: 0.83 2023-10-03 18:37:53,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:37:57,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:37:57,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:59,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 18:38:03,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:05,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1366473.3333333333, ans=0.2 2023-10-03 18:38:06,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1366473.3333333333, ans=0.2 2023-10-03 18:38:08,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 18:38:08,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1366473.3333333333, ans=0.125 2023-10-03 18:38:12,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:38:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:38:14,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:38:14,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:38:15,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 18:38:16,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:38:18,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 18:38:18,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:38:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:21,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 18:38:23,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:38:25,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:38:25,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 18:38:28,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-10-03 18:38:28,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:30,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:31,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:38:31,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:31,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:38:33,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:38:33,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:38:36,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:38:37,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:38:37,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:37,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-03 18:38:37,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1366606.6666666667, ans=0.125 2023-10-03 18:38:40,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:38:42,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 18:38:43,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:38:44,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 18:38:45,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:38:45,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:46,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 18:38:55,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 18:38:56,921 INFO [train.py:1046] (1/4) Epoch 39, batch 3150, loss[loss=0.153, simple_loss=0.2131, pruned_loss=0.04644, over 22639.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2376, pruned_loss=0.0392, over 4728931.16 frames. ], batch size: 322, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:38:58,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:38:58,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:59,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:38:59,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:39:01,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 18:39:01,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:01,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 18:39:04,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 18:39:05,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:08,633 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 18:39:10,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 18:39:12,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:39:13,376 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. 
Duration: 0.992 2023-10-03 18:39:13,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 18:39:14,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 18:39:16,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 18:39:16,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 18:39:16,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:16,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:39:19,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:19,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 18:39:22,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:22,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:24,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:39:24,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 18:39:29,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 18:39:29,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:39:30,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:39:30,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:39:31,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 18:39:33,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1366873.3333333333, ans=0.0 2023-10-03 18:39:34,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 18:39:34,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:39:35,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 18:39:37,011 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.857e+02 2.083e+02 2.385e+02 4.094e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 18:39:37,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:39:37,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:39:37,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:39:38,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:39:38,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:39:38,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 18:39:38,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1366873.3333333333, ans=0.1 2023-10-03 18:39:39,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:39:39,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:41,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:39:41,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:39:43,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 18:39:43,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:39:45,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. 
Duration: 0.9 2023-10-03 18:39:46,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 18:39:47,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 18:39:49,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:39:49,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:39:50,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 18:39:52,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 18:39:53,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:39:56,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:39:58,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:58,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:40:02,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 18:40:03,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:06,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 18:40:10,902 INFO [train.py:1046] (1/4) Epoch 39, batch 3200, loss[loss=0.1603, simple_loss=0.2481, pruned_loss=0.03624, over 24658.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2371, pruned_loss=0.03852, over 4739234.98 frames. ], batch size: 73, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:40:10,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:40:10,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:40:14,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:15,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:40:15,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 18:40:18,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:40:25,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:40:25,650 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.28 vs. 
limit=15.0 2023-10-03 18:40:26,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:35,579 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.31 vs. limit=15.0 2023-10-03 18:40:36,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:40:45,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 18:40:46,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:40:50,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 18:40:51,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:40:54,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:40:54,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:40:56,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:40:58,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 18:41:00,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. 
Duration: 0.5 2023-10-03 18:41:01,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 18:41:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 18:41:05,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:41:13,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:13,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:41:14,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:16,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 18:41:16,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:41:18,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:41:22,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 18:41:22,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-10-03 18:41:24,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1367406.6666666667, ans=0.125 2023-10-03 18:41:25,939 INFO [train.py:1046] (1/4) Epoch 39, batch 3250, loss[loss=0.1672, simple_loss=0.242, pruned_loss=0.04623, over 23788.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.237, pruned_loss=0.03877, over 4736716.94 frames. ], batch size: 179, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:41:26,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 18:41:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 18:41:28,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:41:30,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:41:31,606 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 18:41:31,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:41:31,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:32,981 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 18:41:37,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 18:41:39,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:41:47,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:41:48,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 18:41:48,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:41:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:49,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:41:51,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:41:51,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:41:54,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:41:54,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:41:55,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:41:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:55,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:41:58,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:41:59,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:42:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:42:02,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:42:03,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:42:03,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:42:04,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:42:05,208 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.905e+02 2.104e+02 2.395e+02 3.285e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 18:42:05,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1367540.0, ans=0.125 2023-10-03 18:42:09,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. 
Duration: 0.56 2023-10-03 18:42:09,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:42:09,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:42:10,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:12,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:42:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:42:22,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1367673.3333333333, ans=0.125 2023-10-03 18:42:23,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:42:25,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:25,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 18:42:25,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:42:25,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:42:25,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:42:29,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 18:42:29,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 18:42:29,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:42:30,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:31,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:42:31,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 18:42:33,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:42:36,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:42:36,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:42:37,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 18:42:37,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:42:38,906 INFO [train.py:1046] (1/4) Epoch 39, batch 3300, loss[loss=0.1544, simple_loss=0.2252, pruned_loss=0.04181, over 19922.00 frames. 
], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03866, over 4742610.72 frames. ], batch size: 43, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:42:40,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:42:40,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 18:42:41,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:42:41,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 18:42:43,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 18:42:45,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 18:42:46,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:46,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1367740.0, ans=0.0 2023-10-03 18:42:49,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:42:49,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1367740.0, ans=0.125 2023-10-03 18:42:49,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1367740.0, ans=0.0 2023-10-03 18:42:50,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:42:50,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:53,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:42:53,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:42:59,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:42:59,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:43:02,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 18:43:03,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:03,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:03,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:43:04,964 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 18:43:05,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:43:06,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:43:06,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:43:06,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:07,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 18:43:11,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:43:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:43:13,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:13,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 18:43:14,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 18:43:14,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:43:16,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:43:19,231 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 18:43:20,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 18:43:20,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:43:22,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 18:43:23,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:43:27,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:43:28,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:43:28,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1367940.0, ans=0.125 2023-10-03 18:43:31,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:31,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:31,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 18:43:31,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1367940.0, ans=0.125 2023-10-03 18:43:32,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:43:34,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:43:34,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:35,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:43:37,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=1368006.6666666667, ans=0.1 2023-10-03 18:43:38,548 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 18:43:38,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 18:43:40,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:43:40,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:43:40,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:41,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 18:43:41,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:42,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:43:44,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:44,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:43:45,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:46,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:43:47,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.95 vs. limit=22.5 2023-10-03 18:43:48,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 18:43:50,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:43:51,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:52,790 INFO [train.py:1046] (1/4) Epoch 39, batch 3350, loss[loss=0.1642, simple_loss=0.2368, pruned_loss=0.04582, over 23612.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2384, pruned_loss=0.03938, over 4728257.58 frames. 
], batch size: 256, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:43:52,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:43:52,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:43:55,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:56,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:56,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:01,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:44:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:44:06,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:06,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 18:44:06,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1368140.0, ans=0.125 2023-10-03 18:44:08,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:44:09,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:44:09,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 18:44:09,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1368140.0, ans=0.2 2023-10-03 18:44:12,125 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 18:44:12,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:44:16,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 18:44:16,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 18:44:17,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:44:17,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:44:18,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:44:18,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 18:44:18,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:44:21,599 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=15.0 2023-10-03 18:44:22,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:24,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:24,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:26,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:44:31,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:32,929 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.940e+02 2.156e+02 2.529e+02 3.650e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 18:44:33,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 18:44:34,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:37,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:44:37,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:40,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:40,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:41,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:43,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 18:44:43,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:44:44,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 18:44:44,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:44:45,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 18:44:46,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:44:48,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:54,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:54,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1368340.0, ans=0.125 2023-10-03 18:44:55,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 18:44:55,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:44:56,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:44:58,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:45:04,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:45:06,156 INFO [train.py:1046] (1/4) Epoch 39, batch 3400, loss[loss=0.1548, simple_loss=0.2391, pruned_loss=0.03521, over 24286.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.239, pruned_loss=0.03938, over 4731723.74 frames. ], batch size: 61, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:45:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 18:45:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 18:45:07,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:45:09,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:09,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 18:45:10,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:45:10,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 18:45:11,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:45:13,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:45:14,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:45:15,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:45:15,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 18:45:17,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=12.0 2023-10-03 18:45:18,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. 
Duration: 0.6 2023-10-03 18:45:18,885 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 18:45:20,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:21,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:45:21,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:45:22,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:45:23,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:45:28,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:45:29,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 18:45:35,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:45:37,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:45:38,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:38,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 18:45:42,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:45:46,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 18:45:50,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:52,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:52,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 18:45:53,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:45:53,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:54,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:45:54,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:45:59,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:46:03,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:46:03,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 18:46:05,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:46:07,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 18:46:13,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:46:16,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 18:46:18,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1368740.0, ans=0.125 2023-10-03 18:46:19,273 INFO [train.py:1046] (1/4) Epoch 39, batch 3450, loss[loss=0.1593, simple_loss=0.2451, pruned_loss=0.03674, over 24545.00 frames. ], tot_loss[loss=0.16, simple_loss=0.24, pruned_loss=0.04002, over 4723004.71 frames. ], batch size: 71, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:46:20,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 18:46:22,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:46:25,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.30 vs. limit=22.5 2023-10-03 18:46:25,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:46:25,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-10-03 18:46:27,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:46:31,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:46:36,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:46:36,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:46:37,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.26 vs. limit=15.0 2023-10-03 18:46:38,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:46:38,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:46:39,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:46:41,693 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.70 vs. limit=15.0 2023-10-03 18:46:45,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1368806.6666666667, ans=0.035 2023-10-03 18:46:46,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. 
Duration: 0.7 2023-10-03 18:46:47,295 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=12.0 2023-10-03 18:46:47,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.29 vs. limit=15.0 2023-10-03 18:46:50,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 18:46:50,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:46:50,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:46:53,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:46:59,748 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.732e+02 1.946e+02 2.153e+02 2.434e+02 3.371e+02, threshold=4.307e+02, percent-clipped=0.0 2023-10-03 18:46:59,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 18:46:59,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:47:03,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:47:03,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:47:04,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:47:05,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:47:07,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 18:47:07,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:47:08,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:47:10,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.55 vs. limit=15.0 2023-10-03 18:47:11,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:47:14,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 18:47:17,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:47:22,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:47:23,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:47:25,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:30,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:30,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:47:31,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:47:32,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1369073.3333333333, ans=0.025 2023-10-03 18:47:33,227 INFO [train.py:1046] (1/4) Epoch 39, batch 3500, loss[loss=0.1398, simple_loss=0.1917, pruned_loss=0.04398, over 19585.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.03981, over 4720911.94 frames. ], batch size: 390, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:47:33,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:47:37,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:37,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1369073.3333333333, ans=0.2 2023-10-03 18:47:41,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:47:43,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. 
Duration: 0.68 2023-10-03 18:47:44,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:47:45,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 18:47:48,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:48,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 18:47:51,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:47:54,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:47:54,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:47:54,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:47:54,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:47:55,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:55,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:47:55,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. 
Duration: 0.86 2023-10-03 18:47:59,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:59,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:48:01,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:48:05,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:05,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 18:48:05,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:48:09,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:48:09,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:48:12,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:13,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:48:13,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:48:15,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. 
Duration: 0.75 2023-10-03 18:48:16,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 18:48:16,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 18:48:16,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:48:17,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:19,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:48:19,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:48:20,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1369273.3333333333, ans=0.125 2023-10-03 18:48:23,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:48:24,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:48:24,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1369273.3333333333, ans=0.125 2023-10-03 18:48:31,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:48:33,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 18:48:33,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 18:48:33,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:48:34,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:48:36,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:48:36,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:39,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 18:48:39,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:48:40,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:48:43,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 18:48:44,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 18:48:45,732 INFO [train.py:1046] (1/4) Epoch 39, batch 3550, loss[loss=0.1525, simple_loss=0.2353, pruned_loss=0.03482, over 23701.00 frames. 
], tot_loss[loss=0.1578, simple_loss=0.237, pruned_loss=0.03937, over 4717619.02 frames. ], batch size: 85, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:48:45,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:47,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:48:47,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:48:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:48:48,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=15.0 2023-10-03 18:48:50,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:48:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:48:59,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 18:49:01,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:49:03,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:49:04,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:49:06,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:49:06,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:49:09,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:49:09,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:49:09,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:49:09,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:49:11,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:49:14,587 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-03 18:49:16,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:49:16,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:49:18,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 18:49:18,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:49:19,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:49:19,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 18:49:19,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:19,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:21,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:49:25,660 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.892e+02 2.071e+02 2.363e+02 4.365e+02, threshold=4.143e+02, percent-clipped=1.0 2023-10-03 18:49:27,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:49:27,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:49:28,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:49:31,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 18:49:31,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 18:49:33,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 18:49:33,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:49:34,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:49:34,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:49:38,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 18:49:38,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:49:46,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:49:46,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 18:49:47,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:49:51,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:52,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 18:49:59,632 INFO [train.py:1046] (1/4) Epoch 39, batch 3600, loss[loss=0.1719, simple_loss=0.2556, pruned_loss=0.04411, over 23331.00 frames. 
], tot_loss[loss=0.1571, simple_loss=0.2358, pruned_loss=0.03922, over 4711441.54 frames. ], batch size: 93, lr: 2.60e-03, grad_scale: 32.0 2023-10-03 18:49:59,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 18:49:59,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:50:01,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:50:02,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:50:02,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:50:04,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:50:08,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:50:09,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:09,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:50:11,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:50:11,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:50:11,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 18:50:15,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:50:16,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:19,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:50:19,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1369806.6666666667, ans=0.125 2023-10-03 18:50:22,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:50:23,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:50:24,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:50:24,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 18:50:25,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:50:28,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 18:50:31,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:50:32,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:50:34,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:50:36,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 18:50:42,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:50:44,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:50:44,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 18:50:48,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1369940.0, ans=0.125 2023-10-03 18:50:49,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:50:50,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.49 vs. limit=15.0 2023-10-03 18:50:53,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:50:56,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:51:02,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:51:02,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:51:02,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 18:51:03,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 18:51:04,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 18:51:08,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:51:08,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:51:08,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 18:51:09,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:51:09,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:51:09,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 18:51:10,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 18:51:10,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 18:51:12,184 INFO [train.py:1046] (1/4) Epoch 39, batch 3650, loss[loss=0.1566, simple_loss=0.239, pruned_loss=0.03703, over 23561.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2365, pruned_loss=0.03919, over 4733175.49 frames. ], batch size: 106, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:51:14,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:51:15,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1370073.3333333333, ans=0.125 2023-10-03 18:51:16,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 18:51:17,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1370073.3333333333, ans=0.125 2023-10-03 18:51:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 18:51:22,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:51:26,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 18:51:27,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. 
Duration: 0.97 2023-10-03 18:51:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:51:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:51:31,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:51:34,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:51:34,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:51:36,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 18:51:36,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:51:36,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:51:37,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 18:51:37,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:51:37,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:51:38,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 18:51:40,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:51:43,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 18:51:43,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 18:51:43,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:51:46,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 18:51:47,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:51:47,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:51:53,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:51:54,269 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.929e+02 2.156e+02 2.406e+02 3.410e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 18:51:55,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:51:55,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:51:57,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 18:51:59,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:52:00,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:52:03,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:52:05,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:05,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:52:07,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:52:08,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:52:09,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:52:15,625 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 18:52:18,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:52:19,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:52:21,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 18:52:21,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:22,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:52:23,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:25,223 INFO [train.py:1046] (1/4) Epoch 39, batch 3700, loss[loss=0.1742, simple_loss=0.2412, pruned_loss=0.05356, over 23845.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2375, pruned_loss=0.03937, over 4734177.09 frames. ], batch size: 179, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:52:25,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 18:52:25,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:28,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:52:29,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:52:30,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:52:33,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:33,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. 
Duration: 0.94 2023-10-03 18:52:33,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:35,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 18:52:35,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:52:39,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:52:39,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1370473.3333333333, ans=0.0 2023-10-03 18:52:42,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:52:42,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:52:45,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:52:45,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:46,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:52:47,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:52:49,316 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-10-03 18:52:54,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1370540.0, ans=0.1 2023-10-03 18:52:55,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:52:55,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:52:56,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:52:56,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 18:52:56,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:53:00,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:01,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 18:53:01,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:04,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:53:06,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1370540.0, ans=0.0 2023-10-03 18:53:08,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 18:53:09,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:53:11,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 18:53:13,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:53:15,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 18:53:15,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:53:15,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 18:53:18,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:53:19,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:53:22,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:53:24,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 18:53:24,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:53:24,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 18:53:24,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:53:25,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:53:28,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:53:30,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 18:53:31,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 18:53:33,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:53:33,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:35,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:53:35,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:53:39,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:40,452 INFO [train.py:1046] (1/4) Epoch 39, batch 3750, loss[loss=0.1549, simple_loss=0.2297, pruned_loss=0.04003, over 23570.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2386, pruned_loss=0.03965, over 4727226.81 frames. 
], batch size: 256, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:53:40,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:53:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:53:43,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 18:53:45,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 18:53:46,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.88 vs. limit=15.0 2023-10-03 18:53:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:53:48,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 18:53:49,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:53:50,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:52,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:53,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 18:53:56,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1370806.6666666667, ans=0.125 2023-10-03 18:53:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:54:02,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:54:02,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:54:04,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:54:07,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:54:07,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1370806.6666666667, ans=0.125 2023-10-03 18:54:08,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 18:54:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:54:11,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:54:11,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:54:14,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 18:54:17,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 18:54:19,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:54:20,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:54:22,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:54:23,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.003e+02 2.174e+02 2.569e+02 4.236e+02, threshold=4.347e+02, percent-clipped=0.0 2023-10-03 18:54:27,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:54:29,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:54:31,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 18:54:33,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:54:37,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:54:38,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 18:54:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:54:43,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.32 vs. limit=22.5 2023-10-03 18:54:44,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:54:47,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:54:49,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:54:49,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:54:50,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:54:54,799 INFO [train.py:1046] (1/4) Epoch 39, batch 3800, loss[loss=0.1401, simple_loss=0.2179, pruned_loss=0.0312, over 24231.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2383, pruned_loss=0.03951, over 4730946.93 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:54:58,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:55:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:03,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-03 18:55:03,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 18:55:05,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:55:07,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:08,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:55:11,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 18:55:11,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:11,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:55:12,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:55:12,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:55:14,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:14,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 18:55:19,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. 
Duration: 0.4 2023-10-03 18:55:19,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:55:20,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:23,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:55:23,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:55:25,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:55:25,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:27,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:30,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:33,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:55:33,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 18:55:36,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 18:55:38,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1371273.3333333333, ans=0.125 2023-10-03 18:55:42,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:55:43,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1371273.3333333333, ans=0.125 2023-10-03 18:55:45,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1371273.3333333333, ans=0.07 2023-10-03 18:55:49,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:55:51,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 18:55:53,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 18:55:53,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:54,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1371340.0, ans=0.1 2023-10-03 18:55:55,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:55:56,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:55:58,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 18:56:01,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 18:56:02,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 18:56:02,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:04,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:56:09,369 INFO [train.py:1046] (1/4) Epoch 39, batch 3850, loss[loss=0.147, simple_loss=0.2267, pruned_loss=0.03368, over 23626.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03917, over 4723576.30 frames. ], batch size: 149, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:56:09,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:56:10,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:56:16,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:56:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 18:56:17,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 18:56:19,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:22,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:56:23,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:56:26,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:56:27,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 18:56:30,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:34,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:35,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:56:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:56:39,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:40,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:56:40,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 18:56:40,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:56:42,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:56:43,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:56:44,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:44,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:56:45,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1371540.0, ans=0.1 2023-10-03 18:56:46,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 18:56:46,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 18:56:46,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:56:47,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1371540.0, ans=22.5 2023-10-03 18:56:48,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:49,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:56:50,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:50,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 18:56:52,338 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.999e+02 2.176e+02 2.496e+02 3.894e+02, threshold=4.352e+02, percent-clipped=0.0 2023-10-03 18:56:52,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 18:56:55,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:56:56,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 18:56:56,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:56:56,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1371606.6666666667, ans=0.125 2023-10-03 18:57:00,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:02,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:57:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:07,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. 
Duration: 0.92 2023-10-03 18:57:09,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1371673.3333333333, ans=0.025 2023-10-03 18:57:10,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 18:57:12,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:12,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:15,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:57:15,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:57:15,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:17,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:17,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:57:17,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 18:57:19,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 18:57:19,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1371673.3333333333, ans=0.125 2023-10-03 18:57:20,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 18:57:20,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:22,169 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:57:23,171 INFO [train.py:1046] (1/4) Epoch 39, batch 3900, loss[loss=0.1665, simple_loss=0.2535, pruned_loss=0.03977, over 24376.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2359, pruned_loss=0.03863, over 4718104.95 frames. ], batch size: 77, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:57:23,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:57:24,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:26,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:57:26,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:26,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 18:57:27,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:57:27,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 18:57:27,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:31,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:57:31,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:57:33,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:57:35,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:57:37,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:57:37,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:40,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:57:40,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 18:57:40,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 18:57:43,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 18:57:43,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:44,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 18:57:45,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 18:57:47,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1371806.6666666667, ans=0.0 2023-10-03 18:57:49,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:57:50,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1371806.6666666667, ans=0.0 2023-10-03 18:57:51,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:57:51,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:57:53,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:57:57,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:58:00,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 18:58:02,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:58:02,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:58:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:58:04,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1371873.3333333333, ans=0.1 2023-10-03 18:58:07,896 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:58:10,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:58:11,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:58:12,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1371940.0, ans=0.0 2023-10-03 18:58:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:58:19,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:58:27,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:58:30,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 18:58:31,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 18:58:31,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 18:58:31,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:58:34,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 18:58:34,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:58:35,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 18:58:37,060 INFO [train.py:1046] (1/4) Epoch 39, batch 3950, loss[loss=0.157, simple_loss=0.2356, pruned_loss=0.03915, over 21245.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2357, pruned_loss=0.03847, over 4707112.50 frames. ], batch size: 46, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:58:40,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:58:42,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 18:58:42,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:58:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 18:58:46,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:58:51,158 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 18:58:51,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:58:52,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 18:58:52,586 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 18:58:52,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:58:54,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:58:54,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:58:54,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:58:57,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 18:59:00,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:59:01,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 18:59:01,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:59:02,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:59:02,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:59:08,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.80 vs. limit=15.0 2023-10-03 18:59:12,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:59:12,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:59:18,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 18:59:21,280 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.928e+02 2.107e+02 2.479e+02 4.773e+02, threshold=4.215e+02, percent-clipped=2.0 2023-10-03 18:59:22,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 18:59:22,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 18:59:22,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 18:59:24,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:59:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:59:31,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:59:31,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:59:31,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:59:31,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 18:59:37,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:59:38,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:59:42,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1372340.0, ans=0.125 2023-10-03 18:59:43,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. 
Duration: 0.96 2023-10-03 18:59:50,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1372340.0, ans=0.125 2023-10-03 18:59:50,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1372340.0, ans=0.2 2023-10-03 18:59:52,977 INFO [train.py:1046] (1/4) Epoch 39, batch 4000, loss[loss=0.1672, simple_loss=0.2557, pruned_loss=0.03937, over 24553.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2371, pruned_loss=0.03861, over 4717735.40 frames. ], batch size: 71, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:59:53,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:01,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:05,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:06,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:00:06,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1372473.3333333333, ans=0.0 2023-10-03 19:00:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:08,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 19:00:09,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 19:00:10,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 19:00:10,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:00:10,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 19:00:10,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1372473.3333333333, ans=0.1 2023-10-03 19:00:12,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:15,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:00:16,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:00:16,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:00:16,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:00:16,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:00:19,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:00:20,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. 
Duration: 0.896 2023-10-03 19:00:22,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:00:22,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:24,924 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 19:00:25,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:00:25,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:00:32,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 19:00:32,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:00:32,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=12.0 2023-10-03 19:00:33,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:00:35,226 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 19:00:36,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:00:36,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-10-03 19:00:36,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:00:38,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:38,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.09 vs. limit=15.0 2023-10-03 19:00:39,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:00:41,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:00:41,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1372606.6666666667, ans=0.0 2023-10-03 19:00:42,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:00:42,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:00:43,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 19:00:44,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:46,113 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 19:00:50,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 19:00:53,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 19:00:54,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:00:55,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:57,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:00:57,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:02,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:01:03,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:01:04,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 19:01:06,068 INFO [train.py:1046] (1/4) Epoch 39, batch 4050, loss[loss=0.1561, simple_loss=0.2299, pruned_loss=0.04112, over 23476.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2375, pruned_loss=0.03891, over 4716234.85 frames. ], batch size: 93, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:01:06,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:01:06,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 19:01:08,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:01:09,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:01:09,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:01:14,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:01:18,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:01:18,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 19:01:21,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:01:21,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:01:25,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:27,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:01:29,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 19:01:32,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. 
Duration: 0.73 2023-10-03 19:01:32,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 19:01:33,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:01:41,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 19:01:43,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:01:46,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:49,653 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.853e+02 2.035e+02 2.339e+02 3.700e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 19:01:49,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:49,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:01:49,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:53,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:01:57,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 19:01:57,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 19:01:59,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:02:01,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 19:02:07,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:02:13,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 19:02:13,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1373006.6666666667, ans=0.0 2023-10-03 19:02:15,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:02:15,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:02:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 19:02:15,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 19:02:15,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:18,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:02:19,747 INFO [train.py:1046] (1/4) Epoch 39, batch 4100, loss[loss=0.1418, simple_loss=0.2196, pruned_loss=0.03203, over 24265.00 frames. 
], tot_loss[loss=0.158, simple_loss=0.2381, pruned_loss=0.0389, over 4726491.49 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:02:19,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:21,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:02:27,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 19:02:28,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 19:02:30,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 19:02:31,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 19:02:31,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:31,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:32,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:32,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:02:34,359 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. 
Duration: 0.989 2023-10-03 19:02:37,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:02:37,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:02:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:02:42,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:02:42,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1373140.0, ans=0.125 2023-10-03 19:02:43,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:02:43,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:02:45,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 19:02:45,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:45,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:02:46,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 19:02:46,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:02:46,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 19:02:47,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1373140.0, ans=0.125 2023-10-03 19:02:48,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1373206.6666666667, ans=10.0 2023-10-03 19:02:49,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.23 vs. limit=10.0 2023-10-03 19:02:50,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:02:51,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 19:02:53,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:02:54,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:02:54,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 19:02:57,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 19:02:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:02:57,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:03:00,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 19:03:01,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1373206.6666666667, ans=0.0 2023-10-03 19:03:02,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:03:02,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:03:03,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1373273.3333333333, ans=10.0 2023-10-03 19:03:05,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 19:03:05,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:03:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:03:07,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:03:13,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:03:16,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:03:16,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:03:18,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1373340.0, ans=0.1 2023-10-03 19:03:21,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1373340.0, ans=0.125 2023-10-03 19:03:21,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1373340.0, ans=0.0 2023-10-03 19:03:26,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:03:26,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:03:29,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:03:32,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:03:33,860 INFO [train.py:1046] (1/4) Epoch 39, batch 4150, loss[loss=0.1481, simple_loss=0.2341, pruned_loss=0.03106, over 24477.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2391, pruned_loss=0.03966, over 4708308.06 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:03:35,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 19:03:36,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:03:38,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:03:38,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:03:39,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 19:03:40,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:42,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 19:03:42,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 19:03:42,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 19:03:44,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:48,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:03:48,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:03:56,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:03:56,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:03:57,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:03:59,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:03:59,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:04:00,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:04:03,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:04:07,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:04:07,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 19:04:08,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 19:04:10,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:04:11,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 19:04:11,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 19:04:12,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:04:14,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:16,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:04:17,650 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.923e+02 2.108e+02 2.478e+02 3.681e+02, threshold=4.216e+02, percent-clipped=0.0 2023-10-03 19:04:19,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 19:04:20,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=22.5 2023-10-03 19:04:23,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:04:24,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:04:25,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 19:04:27,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:04:27,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 19:04:28,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 19:04:29,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:04:29,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:30,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1373606.6666666667, ans=0.0 2023-10-03 19:04:31,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 19:04:31,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:04:31,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:04:32,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:04:34,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1373673.3333333333, ans=0.125 2023-10-03 19:04:35,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 19:04:35,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:35,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:04:36,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 19:04:36,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 19:04:36,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:04:38,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 19:04:38,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:04:39,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:40,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 19:04:40,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:04:46,930 INFO [train.py:1046] (1/4) Epoch 39, batch 4200, loss[loss=0.163, simple_loss=0.2473, pruned_loss=0.03931, over 23889.00 frames. ], tot_loss[loss=0.158, simple_loss=0.238, pruned_loss=0.03902, over 4715566.06 frames. ], batch size: 80, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:04:47,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:04:49,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 19:04:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 19:04:52,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:04:52,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1373740.0, ans=0.1 2023-10-03 19:04:55,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:04:55,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:04:55,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:04:58,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 19:05:01,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 19:05:01,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:02,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:05:06,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:05:09,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:05:11,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 19:05:11,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:11,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 19:05:11,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:05:13,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:14,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:05:14,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:05:16,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:05:18,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 19:05:18,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:22,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.45 vs. limit=15.0 2023-10-03 19:05:23,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. 
Duration: 0.82225 2023-10-03 19:05:24,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1373873.3333333333, ans=0.04949747468305833 2023-10-03 19:05:25,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:05:26,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:05:28,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:05:29,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:05:31,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 19:05:31,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:05:32,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:05:36,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:05:39,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:05:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:05:46,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-03 19:05:48,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:05:53,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:05:53,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:05:55,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 19:05:59,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:06:01,970 INFO [train.py:1046] (1/4) Epoch 39, batch 4250, loss[loss=0.1542, simple_loss=0.2304, pruned_loss=0.03902, over 23432.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2367, pruned_loss=0.03891, over 4711365.09 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:06:03,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:06:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:06:06,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:11,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:06:11,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. 
Duration: 0.96 2023-10-03 19:06:13,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:06:15,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:06:20,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1374140.0, ans=0.125 2023-10-03 19:06:26,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:26,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:27,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:06:27,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:06:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:30,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:31,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:06:32,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.27 vs. limit=12.0 2023-10-03 19:06:33,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:06:33,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:06:36,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 19:06:38,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 19:06:38,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:39,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:06:40,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:40,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1374206.6666666667, ans=0.125 2023-10-03 19:06:41,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:06:41,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:41,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:06:44,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:06:45,936 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.890e+02 2.070e+02 2.259e+02 3.425e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-03 19:06:46,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:06:49,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1374273.3333333333, ans=0.2 2023-10-03 19:06:52,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:06:53,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:06:53,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1374273.3333333333, ans=0.09899494936611666 2023-10-03 19:06:53,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1374273.3333333333, ans=0.125 2023-10-03 19:06:54,335 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.35 vs. limit=22.5 2023-10-03 19:06:54,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 19:06:54,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 19:06:55,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1374273.3333333333, ans=15.0 2023-10-03 19:06:56,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 19:06:56,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:06:58,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:07:01,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:07:01,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:07:02,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 19:07:04,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:07:04,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:07:08,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:07:10,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:07:12,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 19:07:13,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:07:13,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:07:15,108 INFO [train.py:1046] (1/4) Epoch 39, batch 4300, loss[loss=0.166, simple_loss=0.2374, pruned_loss=0.04729, over 23800.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.03884, over 4715301.73 frames. ], batch size: 164, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:07:15,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:07:15,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1374406.6666666667, ans=0.125 2023-10-03 19:07:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:07:16,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 19:07:16,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:07:16,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1374406.6666666667, ans=0.95 2023-10-03 19:07:21,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1374406.6666666667, ans=0.125 2023-10-03 19:07:22,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:07:22,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:07:26,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:07:33,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:07:33,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 19:07:36,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:07:37,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:07:38,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:07:38,871 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 19:07:41,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:07:44,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:07:45,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 19:07:45,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 19:07:46,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 19:07:48,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:07:50,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:07:53,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:07:53,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:07:53,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:07:54,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.10 vs. limit=15.0 2023-10-03 19:07:57,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:07:57,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:07:57,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 19:07:58,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 19:07:59,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.61 vs. 
limit=10.0 2023-10-03 19:08:01,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:08:04,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:04,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:08:04,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:04,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:08:04,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 19:08:04,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 19:08:05,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 19:08:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:08:05,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 19:08:05,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 19:08:09,183 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.63 vs. 
limit=22.5 2023-10-03 19:08:09,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:08:11,007 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 19:08:12,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:08:12,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1374673.3333333333, ans=0.0 2023-10-03 19:08:15,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:15,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:08:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 19:08:18,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:08:18,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:20,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:08:20,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:08:21,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 19:08:23,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:08:23,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:23,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1374673.3333333333, ans=0.125 2023-10-03 19:08:25,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:26,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:08:29,489 INFO [train.py:1046] (1/4) Epoch 39, batch 4350, loss[loss=0.1627, simple_loss=0.2416, pruned_loss=0.04189, over 23757.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2375, pruned_loss=0.03917, over 4728510.47 frames. ], batch size: 232, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:08:31,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1374740.0, ans=0.125 2023-10-03 19:08:32,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 19:08:32,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:08:38,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:08:41,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:08:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:08:42,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:08:45,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:08:47,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1374806.6666666667, ans=0.1 2023-10-03 19:08:48,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:51,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:08:51,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:08:54,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:08:56,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:08:57,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:09:05,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 19:09:05,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:09:05,421 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:09:06,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:10,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:12,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 19:09:13,505 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.954e+02 2.148e+02 2.453e+02 3.516e+02, threshold=4.296e+02, percent-clipped=0.0 2023-10-03 19:09:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:17,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:09:21,020 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 19:09:22,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:09:23,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:09:24,040 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:09:25,034 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 19:09:25,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-10-03 19:09:25,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:09:25,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:27,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:09:28,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:09:28,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:09:28,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:09:31,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 19:09:33,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:33,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:33,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:33,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 19:09:34,663 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-10-03 19:09:34,667 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 19:09:34,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 19:09:35,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=15.0 2023-10-03 19:09:37,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:09:37,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:09:37,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:09:38,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:09:40,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 19:09:41,657 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 19:09:41,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:42,946 INFO [train.py:1046] (1/4) Epoch 39, batch 4400, loss[loss=0.1504, simple_loss=0.2298, pruned_loss=0.03555, over 23335.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2386, pruned_loss=0.03941, over 4730305.39 frames. 
], batch size: 105, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:09:45,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:09:45,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:47,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:49,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 19:09:49,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 19:09:49,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 19:09:50,366 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 19:09:51,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:09:51,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:09:53,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 19:09:56,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:56,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:09:58,335 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 19:09:59,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:09:59,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 19:10:00,696 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.90 vs. limit=15.0 2023-10-03 19:10:01,199 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 19:10:05,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 19:10:06,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 19:10:07,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 19:10:08,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:09,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:10:09,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:10:11,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 19:10:12,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 19:10:12,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 19:10:13,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:10:16,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:10:16,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:10:18,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:18,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:10:18,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 19:10:19,428 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 19:10:24,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:30,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:10:32,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. 
Duration: 0.78 2023-10-03 19:10:32,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1375273.3333333333, ans=0.125 2023-10-03 19:10:36,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:10:37,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:10:41,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:10:42,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 19:10:42,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:10:42,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:10:42,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:10:43,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:10:47,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 19:10:50,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 19:10:52,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. 
Duration: 0.88 2023-10-03 19:10:52,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:10:52,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 19:10:52,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1375340.0, ans=0.0 2023-10-03 19:10:53,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:10:56,643 INFO [train.py:1046] (1/4) Epoch 39, batch 4450, loss[loss=0.1493, simple_loss=0.2342, pruned_loss=0.03217, over 24491.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2394, pruned_loss=0.03947, over 4740656.83 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:10:56,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:10:58,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 19:11:01,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:11:03,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:04,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 19:11:09,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1375406.6666666667, ans=0.125 2023-10-03 19:11:10,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:10,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:11:15,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:17,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:11:19,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:11:19,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:11:21,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 19:11:21,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:11:22,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:11:23,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:11:26,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:11:31,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:31,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:32,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:11:32,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:11:34,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:11:37,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 19:11:38,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 19:11:38,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 19:11:38,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 19:11:41,518 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.929e+02 2.104e+02 2.393e+02 3.740e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 19:11:41,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:41,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 19:11:41,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1375606.6666666667, ans=0.2 2023-10-03 19:11:45,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:11:49,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:50,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 19:11:50,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:50,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:11:51,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:11:51,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:53,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:11:53,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1375606.6666666667, ans=0.125 2023-10-03 19:11:56,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:11:56,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 19:11:57,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1375673.3333333333, ans=0.07 2023-10-03 19:11:59,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:12:00,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:12:00,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:12:03,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:12:04,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:12:06,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 19:12:07,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1375673.3333333333, ans=0.125 2023-10-03 19:12:07,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1375673.3333333333, ans=0.0 2023-10-03 19:12:09,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 19:12:10,911 INFO [train.py:1046] (1/4) Epoch 39, batch 4500, loss[loss=0.1509, simple_loss=0.2174, pruned_loss=0.04214, over 23380.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2395, pruned_loss=0.03945, over 4726900.41 frames. ], batch size: 285, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:12:12,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:12:15,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:12:16,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 19:12:16,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 19:12:18,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:12:24,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:12:24,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 19:12:24,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:12:26,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:12:26,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:12:27,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:12:33,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1375806.6666666667, ans=0.0 2023-10-03 19:12:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:12:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:12:42,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:12:43,239 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-10-03 19:12:43,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:12:43,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 19:12:44,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1375873.3333333333, ans=0.07 2023-10-03 19:12:49,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:12:53,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1375873.3333333333, ans=10.0 2023-10-03 19:12:54,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:12:55,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.11 vs. limit=12.0 2023-10-03 19:12:57,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:13:00,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:13:01,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 19:13:03,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:03,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 19:13:04,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1375940.0, ans=0.125 2023-10-03 19:13:06,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:06,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:13:06,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:13:07,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 19:13:07,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:13:07,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:13,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:13:14,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:13:15,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1376006.6666666667, ans=0.0 2023-10-03 19:13:16,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:19,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 19:13:19,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:13:19,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 19:13:21,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 19:13:23,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 19:13:25,972 INFO [train.py:1046] (1/4) Epoch 39, batch 4550, loss[loss=0.1531, simple_loss=0.2401, pruned_loss=0.03308, over 24644.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2384, pruned_loss=0.03918, over 4734312.52 frames. ], batch size: 68, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:13:26,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 19:13:27,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 19:13:28,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:13:29,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1376073.3333333333, ans=0.125 2023-10-03 19:13:33,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:13:33,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 19:13:34,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1376073.3333333333, ans=0.0 2023-10-03 19:13:36,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:13:39,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:13:40,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:43,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:13:43,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:13:43,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:45,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:13:46,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:13:47,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:13:51,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 19:13:52,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-10-03 19:13:52,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:13:54,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 19:14:00,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 19:14:00,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:14:01,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 19:14:03,757 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.61 vs. limit=6.0 2023-10-03 19:14:04,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:14:06,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1376206.6666666667, ans=0.125 2023-10-03 19:14:08,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:08,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:14:09,909 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.921e+02 2.129e+02 2.564e+02 3.877e+02, threshold=4.258e+02, percent-clipped=0.0 2023-10-03 19:14:09,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:14:12,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 19:14:13,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:14:16,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:16,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:14:17,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:14:19,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 19:14:20,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 19:14:20,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:14:22,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 19:14:23,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. 
Duration: 0.54 2023-10-03 19:14:23,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:14:25,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:14:25,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:14:25,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:25,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:14:25,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1376340.0, ans=0.0 2023-10-03 19:14:27,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:14:28,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 19:14:31,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:14:31,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 19:14:31,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 19:14:31,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 19:14:31,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 19:14:35,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:14:35,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:14:37,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:14:37,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:38,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:14:40,457 INFO [train.py:1046] (1/4) Epoch 39, batch 4600, loss[loss=0.1469, simple_loss=0.2321, pruned_loss=0.03086, over 24503.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2367, pruned_loss=0.03882, over 4716826.15 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:14:40,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:14:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:14:44,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:14:46,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:14:47,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:14:47,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:14:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:14:49,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 19:14:49,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:14:49,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1376406.6666666667, ans=0.0 2023-10-03 19:14:53,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:14:54,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:14:57,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:04,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 19:15:04,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 19:15:04,811 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:15:07,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:11,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:15:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:15:17,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 19:15:17,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:15:17,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:15:20,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1376540.0, ans=0.0 2023-10-03 19:15:23,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:23,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:15:24,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:15:28,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-10-03 19:15:28,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:15:29,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1376606.6666666667, ans=0.2 2023-10-03 19:15:36,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:36,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:15:38,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:38,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 19:15:38,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:38,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 19:15:39,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:40,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:15:40,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1376673.3333333333, ans=10.0 2023-10-03 19:15:41,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1376673.3333333333, ans=0.0 2023-10-03 19:15:42,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:42,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:15:44,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:45,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 19:15:45,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 19:15:45,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 19:15:45,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:15:46,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:15:46,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:15:48,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:15:53,991 INFO [train.py:1046] (1/4) Epoch 39, batch 4650, loss[loss=0.1507, simple_loss=0.2213, pruned_loss=0.04004, over 22654.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2358, pruned_loss=0.03892, over 4711850.14 frames. ], batch size: 322, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:15:56,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:15:57,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1376740.0, ans=0.0 2023-10-03 19:16:00,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:16:00,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:16:01,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:16:01,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:16:01,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:01,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:16:05,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 19:16:11,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 19:16:12,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 19:16:13,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:16:14,835 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.74 vs. limit=8.0 2023-10-03 19:16:15,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 19:16:15,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:16:15,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 19:16:16,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 19:16:16,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:16,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:16:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:16:20,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:20,736 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-10-03 19:16:24,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:25,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 19:16:26,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:26,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:16:28,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 19:16:29,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:16:33,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:16:34,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn1.whiten.whitening_limit, batch_count=1376873.3333333333, ans=22.5 2023-10-03 19:16:36,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:38,876 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.924e+02 2.085e+02 2.377e+02 3.719e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 19:16:41,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:42,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:16:43,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:45,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:16:46,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 19:16:47,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 19:16:48,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 19:16:48,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 19:16:50,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:16:56,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:16:56,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:16:58,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 19:16:58,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:59,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 19:16:59,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:17:01,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:17:04,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:17:04,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:17:05,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:17:08,723 INFO [train.py:1046] (1/4) Epoch 39, batch 4700, loss[loss=0.1515, simple_loss=0.2362, pruned_loss=0.03339, over 24504.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2367, pruned_loss=0.03885, over 4713271.11 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:17:08,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:17:10,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:17:10,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:17:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 19:17:13,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 19:17:14,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 19:17:23,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:23,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:17:23,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:17:24,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:17:25,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:17:29,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 19:17:29,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 19:17:31,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:32,722 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.05 vs. limit=15.0 2023-10-03 19:17:33,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:17:33,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 19:17:35,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:41,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.67 vs. limit=22.5 2023-10-03 19:17:42,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:17:44,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:17:46,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:17:50,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 19:17:52,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:17:55,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:01,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 19:18:02,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:05,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:18:05,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. 
Duration: 0.76 2023-10-03 19:18:07,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:07,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:11,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:18:11,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:18:11,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 19:18:11,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1377340.0, ans=0.125 2023-10-03 19:18:14,461 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 19:18:15,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:17,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:17,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:17,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 19:18:19,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 19:18:21,978 INFO [train.py:1046] (1/4) Epoch 39, batch 4750, loss[loss=0.2118, simple_loss=0.2753, pruned_loss=0.07414, over 19485.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2371, pruned_loss=0.03893, over 4716201.81 frames. ], batch size: 388, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:18:22,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 19:18:26,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:18:27,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:30,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:30,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:18:33,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 19:18:33,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:18:36,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 19:18:36,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:18:38,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:18:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:18:41,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1377473.3333333333, ans=0.125 2023-10-03 19:18:44,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 19:18:48,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:18:50,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 19:18:51,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:18:53,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1377540.0, ans=0.0 2023-10-03 19:18:54,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1377540.0, ans=0.0 2023-10-03 19:18:55,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:55,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:55,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:57,298 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-10-03 19:18:57,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 19:19:02,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 19:19:02,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1377540.0, ans=0.1 2023-10-03 19:19:04,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:05,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:06,317 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.868e+02 2.072e+02 2.356e+02 4.427e+02, threshold=4.144e+02, percent-clipped=1.0 2023-10-03 19:19:08,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:19:08,320 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 19:19:08,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:19:09,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:19:13,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:19:14,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-10-03 19:19:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 19:19:15,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:19:15,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1377606.6666666667, ans=0.125 2023-10-03 19:19:15,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1377606.6666666667, ans=0.0 2023-10-03 19:19:16,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:19:16,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:18,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:19:18,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 19:19:20,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 19:19:23,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:19:26,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:19:26,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-10-03 19:19:27,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:19:27,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:19:31,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:19:32,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:32,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:19:36,597 INFO [train.py:1046] (1/4) Epoch 39, batch 4800, loss[loss=0.1396, simple_loss=0.2152, pruned_loss=0.03198, over 23343.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2384, pruned_loss=0.03935, over 4719615.47 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 32.0 2023-10-03 19:19:36,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:19:36,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 19:19:38,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 19:19:39,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 19:19:42,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 19:19:42,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:19:42,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 19:19:48,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:48,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:19:54,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:19:55,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:56,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:56,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 19:19:57,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1377806.6666666667, ans=0.0 2023-10-03 19:19:58,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:19:58,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 19:19:58,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1377806.6666666667, ans=0.1 2023-10-03 19:20:01,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:20:04,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:05,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:05,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:20:07,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:07,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 19:20:07,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:08,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1377873.3333333333, ans=0.125 2023-10-03 19:20:10,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:11,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 19:20:12,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:16,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:16,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:20:16,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:20:17,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:19,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 19:20:19,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 19:20:21,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:21,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:20:21,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:20:21,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:20:21,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 19:20:23,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:20:23,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:20:26,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1377940.0, ans=0.125 2023-10-03 19:20:27,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:20:30,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:31,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:20:34,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1378006.6666666667, ans=0.125 2023-10-03 19:20:37,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 19:20:37,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:37,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:37,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 19:20:38,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:40,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1378006.6666666667, ans=0.0 2023-10-03 19:20:42,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:20:44,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:20:44,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:44,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:20:46,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:20:46,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:20:49,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1378073.3333333333, ans=0.125 2023-10-03 19:20:51,026 INFO [train.py:1046] (1/4) Epoch 39, batch 4850, loss[loss=0.152, simple_loss=0.2385, pruned_loss=0.03271, over 24500.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2378, pruned_loss=0.03887, over 4723816.27 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:20:51,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 19:20:51,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:51,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:52,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 19:20:55,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 19:20:55,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:55,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:56,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:20:56,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:59,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:21:04,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 19:21:06,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.73 vs. limit=22.5 2023-10-03 19:21:07,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 19:21:11,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:21:11,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:21:11,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:21:17,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:21:19,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:21:20,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:21:20,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 19:21:22,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:21:25,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:21:25,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:21:25,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:21:25,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-10-03 19:21:28,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:21:28,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:28,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1378206.6666666667, ans=0.125 2023-10-03 19:21:33,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 19:21:34,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 19:21:34,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:21:35,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1378273.3333333333, ans=0.1 2023-10-03 19:21:36,943 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.898e+02 2.062e+02 2.421e+02 3.265e+02, threshold=4.123e+02, percent-clipped=0.0 2023-10-03 19:21:40,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:21:41,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 19:21:41,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 19:21:41,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:21:44,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:21:44,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1378273.3333333333, ans=0.2 2023-10-03 19:21:46,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 19:21:46,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:47,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 19:21:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:21:49,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:21:49,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 19:21:53,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1378340.0, ans=0.125 2023-10-03 19:21:57,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:04,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 19:22:04,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:05,461 INFO [train.py:1046] (1/4) Epoch 39, batch 4900, loss[loss=0.1662, simple_loss=0.2594, pruned_loss=0.03647, over 24375.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2376, pruned_loss=0.03864, over 4736723.10 frames. ], batch size: 77, lr: 2.59e-03, grad_scale: 16.0 2023-10-03 19:22:06,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 19:22:06,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:22:12,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:14,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:22:14,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:22:16,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1378406.6666666667, ans=0.125 2023-10-03 19:22:18,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 19:22:18,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1378473.3333333333, ans=0.0 2023-10-03 19:22:24,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-03 19:22:28,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 19:22:28,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 19:22:28,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:22:29,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:22:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:22:31,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:31,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:22:31,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 19:22:34,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 19:22:35,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:22:35,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:22:36,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 19:22:38,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:22:38,963 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-10-03 19:22:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:41,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:41,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 19:22:42,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:22:42,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:42,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 19:22:42,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 19:22:46,130 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:22:47,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 19:22:48,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.51 vs. 
limit=12.0 2023-10-03 19:22:49,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:22:50,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:22:51,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:22:52,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:52,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 19:22:52,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:22:53,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 19:22:56,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:57,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:22:59,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:23:02,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 19:23:02,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 19:23:04,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 19:23:04,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1378673.3333333333, ans=0.0 2023-10-03 19:23:05,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 19:23:13,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:23:13,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:23:15,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 19:23:15,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:23:15,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:23:17,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:23:20,015 INFO [train.py:1046] (1/4) Epoch 39, batch 4950, loss[loss=0.155, simple_loss=0.2193, pruned_loss=0.04537, over 22865.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03834, over 4740598.01 frames. ], batch size: 322, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:23:20,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:23:20,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:23:20,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:23:20,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 19:23:20,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:23:23,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:23:23,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:23:27,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 19:23:27,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 19:23:28,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:23:29,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 19:23:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:29,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 19:23:30,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:23:30,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:32,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:23:33,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:23:36,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:23:36,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:23:36,744 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:23:39,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:39,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:23:42,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:23:46,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:47,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 19:23:49,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:50,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:52,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:23:53,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 19:23:55,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 19:23:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:59,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1378873.3333333333, ans=0.0 2023-10-03 19:24:01,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:24:01,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:24:03,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:24:03,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:24:04,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 19:24:05,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:24:07,183 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.841e+02 1.990e+02 2.204e+02 4.323e+02, threshold=3.980e+02, percent-clipped=1.0 2023-10-03 19:24:07,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:24:08,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:24:09,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.44 vs. limit=15.0 2023-10-03 19:24:10,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:10,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:10,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1378940.0, ans=0.125 2023-10-03 19:24:11,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 19:24:12,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:24:15,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 19:24:18,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1379006.6666666667, ans=0.125 2023-10-03 19:24:19,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:24:20,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:24:20,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:24:20,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:21,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:24:21,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:24:24,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:24:24,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:24:24,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:24:26,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-10-03 19:24:27,031 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:24:29,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:24:34,093 INFO [train.py:1046] (1/4) Epoch 39, batch 5000, loss[loss=0.1483, simple_loss=0.2407, pruned_loss=0.02793, over 24345.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2358, pruned_loss=0.03807, over 4736253.44 frames. ], batch size: 74, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:24:34,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 19:24:34,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 19:24:41,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:41,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:24:41,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 19:24:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 19:24:44,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:24:47,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-03 19:24:47,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:24:47,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:24:47,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1379140.0, ans=0.0 2023-10-03 19:24:48,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 19:24:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:49,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:24:49,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 19:24:49,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:24:51,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:24:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 19:24:53,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 19:24:54,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 19:24:54,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 19:24:54,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:24:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:24:55,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:24:55,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 19:24:55,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 19:24:57,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 19:24:58,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:58,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:00,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 19:25:00,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:25:02,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:25:03,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:25:05,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 19:25:06,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 19:25:07,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:25:09,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:25:12,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 19:25:12,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1379206.6666666667, ans=0.1 2023-10-03 19:25:13,647 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:25:15,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:25:16,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:16,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:20,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-10-03 19:25:20,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:25:20,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:25:22,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:25:23,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 19:25:23,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:25:25,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1379273.3333333333, ans=0.0 2023-10-03 19:25:28,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:25:28,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:25:34,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 19:25:39,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:46,987 INFO [train.py:1046] (1/4) Epoch 39, batch 5050, loss[loss=0.1475, simple_loss=0.2315, pruned_loss=0.03172, over 24602.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2359, pruned_loss=0.03816, over 4737967.79 frames. 
], batch size: 60, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:25:50,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:25:51,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:51,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:25:51,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:25:51,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:25:51,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:25:53,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:55,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:55,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 19:25:57,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:25:59,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.83 vs. 
limit=12.0 2023-10-03 19:26:00,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:26:02,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:26:02,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 19:26:03,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:26:05,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:26:06,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:26:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:26:08,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:26:18,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 19:26:18,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:26:19,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:26:19,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-03 19:26:21,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:26:23,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:23,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:26:24,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:26:24,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 19:26:25,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-10-03 19:26:25,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 19:26:25,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:27,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:26:31,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:31,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 19:26:32,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 19:26:34,003 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.073e+02 2.452e+02 3.577e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-03 19:26:37,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 19:26:38,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:26:38,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:26:40,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:26:40,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:26:42,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:26:43,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:26:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:43,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:26:43,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:26:45,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-03 19:26:46,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:26:46,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:26:50,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:26:50,608 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 19:26:50,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:26:52,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:26:53,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:53,931 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 19:26:55,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:26:55,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 19:26:55,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:26:58,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1379673.3333333333, ans=0.125 2023-10-03 19:26:59,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1379740.0, ans=0.125 2023-10-03 19:27:00,719 INFO [train.py:1046] (1/4) Epoch 39, batch 5100, loss[loss=0.165, simple_loss=0.2384, pruned_loss=0.0458, over 22828.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03843, over 4737472.07 frames. ], batch size: 322, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:27:00,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:27:00,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:00,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 19:27:02,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 19:27:06,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:06,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:06,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:27:08,879 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. 
Duration: 0.768 2023-10-03 19:27:10,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:27:13,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 19:27:15,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 19:27:16,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:17,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:27:19,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:27:19,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 19:27:20,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 19:27:23,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:27:25,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:27:29,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:33,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-10-03 19:27:33,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:34,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.94 vs. limit=6.0 2023-10-03 19:27:34,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:36,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:27:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:39,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 19:27:41,004 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 19:27:41,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:41,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1379873.3333333333, ans=0.125 2023-10-03 19:27:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 19:27:42,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. 
Duration: 0.97 2023-10-03 19:27:45,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:52,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:27:54,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1379940.0, ans=0.125 2023-10-03 19:27:55,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 19:27:57,133 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 19:27:57,149 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 19:27:57,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1379940.0, ans=0.125 2023-10-03 19:28:00,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 19:28:00,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:28:02,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 19:28:06,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 19:28:07,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 19:28:10,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:28:12,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 19:28:14,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:28:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 19:28:15,615 INFO [train.py:1046] (1/4) Epoch 39, batch 5150, loss[loss=0.1663, simple_loss=0.2424, pruned_loss=0.04508, over 23639.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2373, pruned_loss=0.03862, over 4740716.87 frames. ], batch size: 149, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:28:18,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:28:18,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:28:18,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:28:20,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:28:20,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:28:21,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:28:21,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 19:28:21,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 19:28:21,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 19:28:22,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:28:22,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 19:28:23,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:28:23,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 19:28:24,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1380073.3333333333, ans=0.1 2023-10-03 19:28:27,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:28:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:28:32,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:28:32,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. 
Duration: 0.8 2023-10-03 19:28:34,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:28:34,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:28:36,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:28:36,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:28:36,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:28:37,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:28:37,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:28:37,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 19:28:40,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:28:40,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:28:43,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:28:45,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. 
Duration: 0.79 2023-10-03 19:28:45,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:28:49,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:28:52,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 19:28:55,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:28:58,032 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=15.0 2023-10-03 19:29:01,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:29:02,721 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.915e+02 2.051e+02 2.368e+02 4.802e+02, threshold=4.101e+02, percent-clipped=1.0 2023-10-03 19:29:02,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:29:05,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:05,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 19:29:05,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1380273.3333333333, ans=0.1 2023-10-03 19:29:07,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 19:29:10,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:29:12,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:29:12,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:29:16,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:18,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:29:18,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 19:29:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:29:24,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:29:26,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:29:27,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 19:29:27,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:29:27,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:29:27,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:29:27,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1380340.0, ans=0.125 2023-10-03 19:29:28,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:29:29,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1380406.6666666667, ans=0.0 2023-10-03 19:29:29,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1380406.6666666667, ans=0.125 2023-10-03 19:29:30,177 INFO [train.py:1046] (1/4) Epoch 39, batch 5200, loss[loss=0.1455, simple_loss=0.2219, pruned_loss=0.03452, over 24468.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2375, pruned_loss=0.03871, over 4721089.20 frames. ], batch size: 58, lr: 2.59e-03, grad_scale: 16.0 2023-10-03 19:29:31,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:29:33,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:29:36,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:29:39,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 19:29:41,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:29:41,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:29:41,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.39 vs. limit=22.5 2023-10-03 19:29:43,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:29:45,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:29:45,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:29:46,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 19:29:49,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:29:51,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:52,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 19:29:55,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 19:29:56,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1380473.3333333333, ans=0.125 2023-10-03 19:29:57,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:29:58,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 19:29:58,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 19:30:01,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 19:30:02,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:30:02,526 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 19:30:02,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:30:03,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:05,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:30:07,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 19:30:07,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 19:30:08,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:30:12,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 19:30:12,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 19:30:12,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 19:30:17,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 19:30:17,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:30:17,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1380606.6666666667, ans=0.2 2023-10-03 19:30:22,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:30:22,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:30:24,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 19:30:24,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:30:24,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 19:30:24,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:25,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:30:29,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:30:30,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:30:33,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:30:35,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:30:35,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:40,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:30:42,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 19:30:43,599 INFO [train.py:1046] (1/4) Epoch 39, batch 5250, loss[loss=0.1386, simple_loss=0.22, pruned_loss=0.02863, over 24518.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2372, pruned_loss=0.03874, over 4720554.78 frames. ], batch size: 63, lr: 2.59e-03, grad_scale: 4.0 2023-10-03 19:30:43,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 19:30:43,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:30:43,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:45,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:30:45,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:30:47,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:30:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:30:52,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:30:52,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:30:59,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:31:00,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:31:02,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 19:31:04,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:31:04,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 19:31:06,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:31:06,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:31:08,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1380806.6666666667, ans=0.125 2023-10-03 19:31:18,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1380873.3333333333, ans=0.0 2023-10-03 19:31:31,833 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.831e+02 2.019e+02 2.223e+02 3.571e+02, threshold=4.037e+02, percent-clipped=0.0 2023-10-03 19:31:40,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1381006.6666666667, ans=0.125 2023-10-03 19:31:53,166 INFO [train.py:1046] (1/4) Epoch 39, batch 5300, loss[loss=0.149, simple_loss=0.2392, pruned_loss=0.02937, over 24300.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2355, pruned_loss=0.03848, over 4708017.59 frames. ], batch size: 74, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:32:00,347 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.90 vs. 
limit=22.5 2023-10-03 19:32:07,687 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:32:08,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:32:08,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 19:32:08,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 19:32:08,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:08,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:08,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:08,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:09,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:09,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:09,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:09,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 19:32:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:32:09,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 19:32:09,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 19:32:09,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 19:32:10,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:32:10,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 19:32:10,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 19:32:10,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:10,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:10,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:32:10,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:32:10,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:32:11,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:32:11,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:11,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:11,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:32:11,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:11,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:32:11,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:11,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:32:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 19:32:12,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:32:12,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:12,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. 
Duration: 0.86 2023-10-03 19:32:12,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 19:32:12,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:32:12,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:12,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 19:32:12,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 19:32:12,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:32:13,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:32:13,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:32:13,556 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 19:32:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 19:32:13,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:32:13,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:32:13,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 19:32:13,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 19:32:13,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 19:32:14,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:32:21,179 INFO [train.py:1046] (1/4) Epoch 40, batch 0, loss[loss=0.1622, simple_loss=0.2364, pruned_loss=0.04398, over 23739.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2364, pruned_loss=0.04398, over 23739.00 frames. ], batch size: 164, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:32:21,179 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 19:32:32,920 INFO [train.py:1078] (1/4) Epoch 40, validation: loss=0.3547, simple_loss=0.2733, pruned_loss=0.2181, over 1125622.00 frames. 2023-10-03 19:32:32,921 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 19:32:34,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 19:32:34,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:32:37,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:32:41,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:32:41,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:32:41,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:42,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 19:32:44,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 19:32:48,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:48,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:53,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:53,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:55,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:32:55,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:32:57,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 19:32:58,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 19:33:07,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:33:07,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:33:07,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=1381293.3333333333, ans=0.02 2023-10-03 19:33:09,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 19:33:13,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:33:13,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:33:14,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:33:16,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:33:20,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:33:25,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 19:33:29,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 19:33:29,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:33:29,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:30,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:33:30,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:33:30,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1381360.0, ans=0.1 2023-10-03 19:33:33,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 19:33:36,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:36,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:39,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:33:42,233 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 19:33:44,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:33:46,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:33:47,709 INFO [train.py:1046] (1/4) Epoch 40, batch 50, loss[loss=0.1572, simple_loss=0.2475, pruned_loss=0.03342, over 24672.00 frames. 
], tot_loss[loss=0.1577, simple_loss=0.2376, pruned_loss=0.03884, over 1055905.50 frames. ], batch size: 68, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:33:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:33:47,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 19:33:49,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:33:49,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:33:52,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:33:53,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:33:53,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1381493.3333333333, ans=0.125 2023-10-03 19:33:54,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1381493.3333333333, ans=0.0 2023-10-03 19:33:55,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:33:59,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 19:33:59,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:34:03,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1381560.0, ans=0.125 2023-10-03 19:34:05,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:34:06,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 19:34:07,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 19:34:09,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:34:10,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:34:10,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:11,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:34:13,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:34:13,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:34:13,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:34:19,955 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.937e+02 2.112e+02 2.333e+02 3.745e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 19:34:21,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:34:22,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:34:23,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:34:24,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 19:34:26,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:34:26,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:34:26,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 19:34:26,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:34:28,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1381626.6666666667, ans=0.5 2023-10-03 19:34:29,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 19:34:38,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 19:34:38,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:34:39,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:34:41,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:34:41,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:34:42,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 19:34:43,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 19:34:45,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:34:45,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:34:46,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:34:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:34:48,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 19:34:48,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-10-03 19:34:49,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 19:34:50,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:34:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:34:51,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 19:34:52,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 19:34:52,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:34:54,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:34:55,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:34:55,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:34:57,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:35:01,773 INFO [train.py:1046] (1/4) Epoch 40, batch 100, loss[loss=0.1641, simple_loss=0.251, pruned_loss=0.03866, over 24595.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2394, pruned_loss=0.03895, over 1863091.14 frames. 
], batch size: 71, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:35:01,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:35:02,362 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.81 vs. limit=22.5 2023-10-03 19:35:03,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:35:06,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 19:35:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:35:10,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1381826.6666666667, ans=0.0 2023-10-03 19:35:10,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1381826.6666666667, ans=0.07 2023-10-03 19:35:12,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:35:12,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:35:12,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:35:12,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 19:35:12,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:35:13,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 19:35:14,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:35:15,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:16,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:35:16,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:35:19,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1381893.3333333333, ans=0.1 2023-10-03 19:35:20,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 19:35:20,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:21,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:35:22,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1381893.3333333333, ans=0.2 2023-10-03 19:35:23,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 19:35:24,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.00 vs. limit=15.0 2023-10-03 19:35:25,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:35:27,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.96 vs. limit=15.0 2023-10-03 19:35:29,153 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 19:35:29,176 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 19:35:32,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:35:32,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:35:37,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:35:38,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:40,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:44,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:46,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-10-03 19:35:47,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 19:35:50,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:35:51,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:35:54,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:57,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:35:59,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:36:00,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:36:03,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:05,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:05,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:05,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:36:07,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 19:36:08,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 19:36:08,658 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 19:36:08,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:08,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1382093.3333333333, ans=0.05 2023-10-03 19:36:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:36:11,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:11,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:11,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 19:36:11,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:36:11,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:36:11,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:36:11,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1382093.3333333333, ans=0.0 2023-10-03 19:36:13,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:13,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:14,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:36:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:36:16,170 INFO [train.py:1046] (1/4) Epoch 40, batch 150, loss[loss=0.1422, simple_loss=0.2204, pruned_loss=0.03202, over 20297.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2393, pruned_loss=0.03915, over 2489878.87 frames. ], batch size: 44, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:36:17,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:36:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:20,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:24,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:36:24,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:26,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:36:28,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:32,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 19:36:32,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 19:36:34,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 19:36:35,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:36:35,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:36:37,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:36:38,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:38,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:38,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:36:40,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:41,774 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 19:36:45,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:49,007 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.890e+02 2.037e+02 2.278e+02 3.667e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-03 19:36:49,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:53,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:36:53,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 19:36:56,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:36:56,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:56,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:36:59,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:36:59,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:37:01,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:37:01,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:02,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 19:37:04,587 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.81 vs. limit=15.0 2023-10-03 19:37:07,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:08,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:08,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:37:08,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:37:09,135 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-10-03 19:37:11,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:37:11,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1382360.0, ans=0.0 2023-10-03 19:37:13,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 19:37:15,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:37:16,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:37:17,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:37:20,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:37:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 19:37:20,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:37:20,327 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 19:37:23,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:37:26,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:37:26,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 19:37:30,422 INFO [train.py:1046] (1/4) Epoch 40, batch 200, loss[loss=0.1697, simple_loss=0.2392, pruned_loss=0.05004, over 23641.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2403, pruned_loss=0.04006, over 2978211.40 frames. ], batch size: 256, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:37:30,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 19:37:30,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:37:30,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:32,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1382493.3333333333, ans=0.0 2023-10-03 19:37:33,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 19:37:34,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:37:36,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:37,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:37:43,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 19:37:43,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:02,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:38:02,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:38:03,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:38:05,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:38:05,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 19:38:05,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:38:05,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1382626.6666666667, ans=0.125 2023-10-03 19:38:08,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:08,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:38:09,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:38:09,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:38:11,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 19:38:12,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:38:13,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:17,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:38:23,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:38:28,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:29,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:38:33,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1382760.0, ans=0.0 2023-10-03 19:38:36,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:36,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1382760.0, ans=0.2 2023-10-03 19:38:37,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. 
Duration: 0.59 2023-10-03 19:38:38,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1382760.0, ans=0.125 2023-10-03 19:38:39,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:39,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:38:39,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:38:41,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:38:41,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 19:38:43,791 INFO [train.py:1046] (1/4) Epoch 40, batch 250, loss[loss=0.1597, simple_loss=0.2406, pruned_loss=0.03941, over 24481.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2397, pruned_loss=0.03911, over 3378120.82 frames. ], batch size: 66, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:38:43,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:38:43,857 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 19:38:45,761 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:38:47,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:38:47,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1382826.6666666667, ans=0.125 2023-10-03 19:38:48,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:38:48,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:48,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:51,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:38:53,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:54,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:38:57,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:39:04,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1382893.3333333333, ans=0.025 2023-10-03 19:39:05,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1382893.3333333333, ans=0.125 2023-10-03 19:39:07,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:39:10,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:39:12,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:39:16,359 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.944e+02 2.127e+02 2.472e+02 3.844e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 19:39:16,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:39:16,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:39:18,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:39:19,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:39:21,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:39:21,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:39:21,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:39:23,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.51 vs. 
limit=22.5 2023-10-03 19:39:24,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:39:24,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1382960.0, ans=0.025 2023-10-03 19:39:26,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1382960.0, ans=0.125 2023-10-03 19:39:27,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 19:39:27,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:39:28,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:39:28,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:39:28,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:39:29,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:39:31,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:39:31,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 19:39:32,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:39:34,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:39:34,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:39:34,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1383026.6666666667, ans=0.125 2023-10-03 19:39:40,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:39:42,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.12 vs. limit=6.0 2023-10-03 19:39:43,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:39:46,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:39:50,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:39:52,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:39:55,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. 
Duration: 0.91 2023-10-03 19:39:56,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:39:58,081 INFO [train.py:1046] (1/4) Epoch 40, batch 300, loss[loss=0.1323, simple_loss=0.1877, pruned_loss=0.03847, over 19244.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2375, pruned_loss=0.03836, over 3658027.61 frames. ], batch size: 388, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:39:58,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:39:59,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 19:39:59,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:40:00,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:40:00,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 19:40:03,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:40:05,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:40:06,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:40:08,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. 
Duration: 0.99 2023-10-03 19:40:09,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:40:11,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:40:11,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 19:40:11,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:40:16,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:40:19,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:40:19,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 19:40:21,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1383226.6666666667, ans=0.125 2023-10-03 19:40:25,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 19:40:25,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:26,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:40:28,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:40:28,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 19:40:28,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:40:28,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1383293.3333333333, ans=0.2 2023-10-03 19:40:31,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:40:32,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:40:33,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:40:36,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 19:40:36,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 19:40:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:40:41,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:44,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 19:40:45,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:40:46,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.78 vs. limit=15.0 2023-10-03 19:40:49,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:40:53,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:40:53,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 19:40:57,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:57,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:41:00,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:41:01,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:41:01,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 19:41:01,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:41:01,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:04,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-10-03 19:41:05,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:41:05,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:06,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1383426.6666666667, ans=0.0 2023-10-03 19:41:06,646 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.59 vs. limit=22.5 2023-10-03 19:41:07,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:41:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:08,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:11,610 INFO [train.py:1046] (1/4) Epoch 40, batch 350, loss[loss=0.1614, simple_loss=0.2345, pruned_loss=0.04421, over 23694.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2349, pruned_loss=0.03776, over 3892662.02 frames. ], batch size: 164, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:41:13,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:41:13,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 19:41:16,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:41:20,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:41:22,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:22,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:27,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 19:41:27,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1383560.0, ans=0.1 2023-10-03 19:41:28,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:41:29,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 19:41:32,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 19:41:32,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:34,716 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.31 vs. limit=15.0 2023-10-03 19:41:35,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-10-03 19:41:38,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:41:39,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:41,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:41:41,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1383626.6666666667, ans=0.125 2023-10-03 19:41:42,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:41:42,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:41:44,059 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.933e+02 2.130e+02 2.384e+02 3.625e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 19:41:44,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:41:44,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:45,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:41:45,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 19:41:47,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:49,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.21 vs. limit=10.0 2023-10-03 19:41:51,034 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.46 vs. limit=15.0 2023-10-03 19:41:52,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1383626.6666666667, ans=0.125 2023-10-03 19:41:53,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:41:54,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:41:54,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:41:54,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:00,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 19:42:00,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:42:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:42:03,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:04,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:42:06,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 19:42:08,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:10,811 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 19:42:12,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 19:42:12,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:14,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:42:14,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 19:42:16,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:19,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 19:42:19,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1383760.0, ans=0.0 2023-10-03 19:42:21,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:21,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1383760.0, ans=0.125 2023-10-03 19:42:22,936 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.58 vs. limit=10.0 2023-10-03 19:42:23,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:23,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:24,935 INFO [train.py:1046] (1/4) Epoch 40, batch 400, loss[loss=0.1541, simple_loss=0.2268, pruned_loss=0.04074, over 23559.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2346, pruned_loss=0.03825, over 4076062.43 frames. ], batch size: 256, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:42:25,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:25,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1383826.6666666667, ans=0.1 2023-10-03 19:42:28,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:42:29,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 19:42:29,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 19:42:29,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:31,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:32,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:42:34,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:35,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:36,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:39,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 19:42:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 19:42:40,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:42:40,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1383893.3333333333, ans=0.125 2023-10-03 19:42:40,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1383893.3333333333, ans=0.2 2023-10-03 19:42:41,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 19:42:42,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:45,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:42:45,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:45,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 19:42:47,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:42:47,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:47,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:47,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:50,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. 
Duration: 0.896 2023-10-03 19:42:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 19:42:54,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.04 vs. limit=15.0 2023-10-03 19:42:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:57,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:57,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 19:42:57,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 19:43:00,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:43:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:11,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 19:43:14,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:43:15,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 19:43:17,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:43:17,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:43:17,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 19:43:17,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1384026.6666666667, ans=0.125 2023-10-03 19:43:20,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:43:24,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:43:25,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:43:26,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:26,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 19:43:28,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:43:29,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 19:43:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:43:34,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 19:43:35,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 19:43:38,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:43:38,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:43:39,822 INFO [train.py:1046] (1/4) Epoch 40, batch 450, loss[loss=0.1399, simple_loss=0.2242, pruned_loss=0.0278, over 24460.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2356, pruned_loss=0.03853, over 4217343.74 frames. ], batch size: 63, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:43:40,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:43:40,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 19:43:41,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:43:41,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:43:43,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:43:43,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 19:43:43,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 19:43:44,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:43:47,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:43:56,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:57,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:43:59,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 19:44:00,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 19:44:02,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:44:05,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:44:06,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:08,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:44:10,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:44:11,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1384293.3333333333, ans=0.05 2023-10-03 19:44:12,666 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.927e+02 2.083e+02 2.312e+02 3.254e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-03 19:44:12,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 19:44:14,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 19:44:15,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 19:44:16,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.97 vs. limit=15.0 2023-10-03 19:44:16,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:44:16,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:18,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:44:18,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1384293.3333333333, ans=0.125 2023-10-03 19:44:19,084 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.38 vs. 
limit=15.0 2023-10-03 19:44:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 19:44:19,861 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 19:44:19,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:44:23,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:44:24,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:44:26,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1384360.0, ans=0.125 2023-10-03 19:44:27,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:44:27,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:44:29,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 19:44:29,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 19:44:31,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:44:34,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 19:44:34,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:44:37,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 19:44:42,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:44:43,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 19:44:43,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 19:44:45,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:44:45,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1384426.6666666667, ans=0.2 2023-10-03 19:44:49,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:44:50,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:44:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:44:50,886 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 19:44:53,563 INFO [train.py:1046] (1/4) Epoch 40, batch 500, loss[loss=0.1802, simple_loss=0.2466, pruned_loss=0.05692, over 23715.00 frames. 
], tot_loss[loss=0.1572, simple_loss=0.2367, pruned_loss=0.03888, over 4328144.63 frames. ], batch size: 164, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:44:53,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:55,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:44:55,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:44:57,021 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 19:44:58,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 19:44:58,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:45:01,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:45:01,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1384493.3333333333, ans=0.125 2023-10-03 19:45:03,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.62 vs. limit=12.0 2023-10-03 19:45:03,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:45:05,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 19:45:08,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:45:08,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:45:09,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:15,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1384560.0, ans=0.125 2023-10-03 19:45:18,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:18,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:45:19,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:45:20,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:20,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 19:45:21,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:45:24,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:45:26,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 19:45:26,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:45:28,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:28,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 19:45:30,955 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 19:45:32,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1384626.6666666667, ans=0.2 2023-10-03 19:45:33,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:35,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:45:38,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 19:45:41,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 19:45:42,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:45:45,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:45:46,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:50,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.49 vs. limit=15.0 2023-10-03 19:45:52,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:56,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 19:45:56,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:45:56,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:46:00,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 19:46:00,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:46:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:46:06,877 INFO [train.py:1046] (1/4) Epoch 40, batch 550, loss[loss=0.1771, simple_loss=0.2581, pruned_loss=0.048, over 24050.00 frames. 
], tot_loss[loss=0.1588, simple_loss=0.2386, pruned_loss=0.0395, over 4412822.72 frames. ], batch size: 80, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:46:06,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 19:46:10,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 19:46:10,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:10,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 19:46:10,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:46:11,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:13,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:14,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:14,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:46:15,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:46:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:46:19,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 19:46:19,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:46:23,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1384893.3333333333, ans=0.0 2023-10-03 19:46:24,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:24,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:27,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:46:29,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:33,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 19:46:33,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 19:46:35,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:46:40,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.820e+02 1.987e+02 2.259e+02 2.913e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 19:46:42,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 19:46:42,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:46:44,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:46:46,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:46,863 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 19:46:46,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:48,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 19:46:50,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1385026.6666666667, ans=0.125 2023-10-03 19:46:51,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:46:52,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:46:52,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:46:53,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:54,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. 
Duration: 0.58 2023-10-03 19:46:54,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 19:46:55,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:46:55,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:46:56,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1385026.6666666667, ans=0.125 2023-10-03 19:46:57,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:46:57,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:47:00,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:47:00,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:47:03,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:47:05,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:06,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:47:06,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 19:47:06,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1385093.3333333333, ans=0.125 2023-10-03 19:47:07,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:47:07,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:47:09,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:10,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:47:10,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 19:47:17,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1385093.3333333333, ans=0.1 2023-10-03 19:47:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 19:47:19,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 19:47:21,120 INFO [train.py:1046] (1/4) Epoch 40, batch 600, loss[loss=0.1625, simple_loss=0.242, pruned_loss=0.04153, over 23205.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2382, pruned_loss=0.039, over 4492214.05 frames. ], batch size: 105, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:47:21,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 19:47:21,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:47:21,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:47:28,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1385160.0, ans=0.0 2023-10-03 19:47:29,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:47:31,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:47:32,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 19:47:32,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1385160.0, ans=0.0 2023-10-03 19:47:34,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:47:34,582 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:47:37,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:47:38,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:40,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. 
Duration: 0.87 2023-10-03 19:47:40,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:47:40,362 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:47:43,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1385226.6666666667, ans=0.1 2023-10-03 19:47:46,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 19:47:49,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:47:49,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:49,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:47:55,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:47:56,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:47:56,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:02,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:48:07,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 19:48:07,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:48:07,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:48:14,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 19:48:20,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:48:20,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:48:23,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 19:48:25,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:48:26,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 19:48:26,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:48:28,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:48:29,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1385426.6666666667, ans=0.125 2023-10-03 19:48:33,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.22 vs. 
limit=22.5 2023-10-03 19:48:33,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 19:48:35,765 INFO [train.py:1046] (1/4) Epoch 40, batch 650, loss[loss=0.1557, simple_loss=0.2406, pruned_loss=0.03541, over 24659.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.03922, over 4531386.77 frames. ], batch size: 68, lr: 2.55e-03, grad_scale: 4.0 2023-10-03 19:48:35,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:48:37,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:48:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:48:39,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:48:43,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 19:48:44,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:48,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1385493.3333333333, ans=0.125 2023-10-03 19:48:50,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:48:50,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 19:48:53,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:48:56,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 19:48:58,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:48:59,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:49:02,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:49:02,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 19:49:05,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:06,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:06,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:49:06,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:08,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:49:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 19:49:11,455 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.985e+02 2.192e+02 2.479e+02 3.880e+02, threshold=4.384e+02, percent-clipped=0.0 2023-10-03 19:49:11,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 19:49:11,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:11,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:49:14,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:14,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:49:16,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:16,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:49:17,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 19:49:19,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:49:20,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:49:20,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. 
Duration: 0.536375 2023-10-03 19:49:21,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:49:22,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1385693.3333333333, ans=0.2 2023-10-03 19:49:22,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.57 vs. limit=22.5 2023-10-03 19:49:23,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:49:24,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 19:49:25,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.87 vs. limit=15.0 2023-10-03 19:49:26,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 19:49:26,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:26,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:49:26,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:49:26,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:49:27,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1385693.3333333333, ans=0.125 2023-10-03 19:49:28,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:49:33,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:33,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:49:34,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:37,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:37,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 19:49:38,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:47,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:49:47,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:49:47,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 19:49:48,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:49:49,723 INFO [train.py:1046] (1/4) Epoch 40, batch 700, loss[loss=0.1612, simple_loss=0.2579, pruned_loss=0.03222, over 24324.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.236, pruned_loss=0.03877, over 4560036.20 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:49:52,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 19:49:52,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 19:49:55,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 19:49:55,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:57,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:49:58,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 19:50:02,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:50:04,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:50:07,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:50:08,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:50:08,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:50:10,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:50:13,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 19:50:13,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:50:16,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 19:50:19,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 19:50:21,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:50:21,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:50:23,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:50:28,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:50:29,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-10-03 19:50:32,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1386026.6666666667, ans=0.125 2023-10-03 19:50:33,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:50:35,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:50:35,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 19:50:39,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:50:39,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:50:42,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:50:49,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:50:50,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 19:50:53,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 19:50:53,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 19:50:57,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:50:59,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:50:59,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:51:00,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:51:00,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 19:51:03,469 INFO [train.py:1046] (1/4) Epoch 40, batch 750, loss[loss=0.1641, simple_loss=0.2502, pruned_loss=0.03905, over 24107.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2356, pruned_loss=0.03841, over 4601512.14 frames. ], batch size: 80, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:51:05,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 19:51:05,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 19:51:06,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 19:51:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 19:51:06,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 19:51:08,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:51:08,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 19:51:09,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:51:10,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-10-03 19:51:11,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:51:11,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1386160.0, ans=0.0 2023-10-03 19:51:12,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:13,540 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.56 vs. limit=15.0 2023-10-03 19:51:14,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:51:15,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:51:15,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:51:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 19:51:18,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:51:20,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:51:23,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:24,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:51:25,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 19:51:25,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1386226.6666666667, ans=0.0 2023-10-03 19:51:26,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:51:26,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:51:26,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1386226.6666666667, ans=0.2 2023-10-03 19:51:29,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:51:29,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:51:30,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. 
Duration: 0.65 2023-10-03 19:51:32,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:51:33,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 19:51:34,836 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 19:51:34,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 19:51:34,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:51:34,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:51:37,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:51:39,314 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.912e+02 2.088e+02 2.370e+02 3.919e+02, threshold=4.175e+02, percent-clipped=0.0 2023-10-03 19:51:42,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1386293.3333333333, ans=0.0 2023-10-03 19:51:42,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=1386293.3333333333, ans=15.0 2023-10-03 19:51:43,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:51:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 19:51:45,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:51:46,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:48,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:51:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 19:51:49,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:51:51,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 19:51:53,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:51:54,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:51:54,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 19:51:56,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:51:57,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1386360.0, ans=0.125 2023-10-03 19:52:01,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:52:04,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:52:04,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:06,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:52:06,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1386426.6666666667, ans=0.1 2023-10-03 19:52:08,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 19:52:10,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:52:10,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:14,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:16,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:18,029 INFO [train.py:1046] (1/4) Epoch 40, batch 800, loss[loss=0.1464, simple_loss=0.2294, pruned_loss=0.03172, over 24620.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.236, pruned_loss=0.03868, over 4629985.17 frames. ], batch size: 60, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:52:19,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 19:52:19,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:52:28,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:28,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:29,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:52:29,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:31,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:31,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:33,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:35,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:37,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:52:39,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 19:52:40,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:52:41,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:41,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:52:41,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:52:41,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 19:52:42,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:42,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 19:52:45,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:47,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:49,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:52:52,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:52,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 19:53:00,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:53:00,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:53:00,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 19:53:04,144 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 19:53:04,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 19:53:04,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:53:04,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:06,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:08,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:53:11,084 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 19:53:11,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 19:53:12,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 19:53:15,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:53:18,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:53:20,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:21,009 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.92 vs. limit=22.5 2023-10-03 19:53:22,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 19:53:22,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1386760.0, ans=0.125 2023-10-03 19:53:23,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:53:24,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 19:53:30,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:53:35,560 INFO [train.py:1046] (1/4) Epoch 40, batch 850, loss[loss=0.1445, simple_loss=0.2222, pruned_loss=0.03335, over 24418.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2369, pruned_loss=0.03891, over 4652309.18 frames. ], batch size: 58, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:53:35,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 19:53:37,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 19:53:37,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:53:38,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:38,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 19:53:38,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1386826.6666666667, ans=0.125 2023-10-03 19:53:39,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-03 19:53:39,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:41,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:53:42,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:53:42,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1386826.6666666667, ans=0.0 2023-10-03 19:53:43,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 19:53:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:53:46,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 19:53:46,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 19:53:46,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 19:53:48,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:53:50,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:53:51,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:53:51,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:51,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:53:56,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:56,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:57,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. 
Duration: 0.87 2023-10-03 19:54:00,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1386893.3333333333, ans=0.2 2023-10-03 19:54:01,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 19:54:05,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:54:07,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 19:54:09,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 19:54:11,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 19:54:11,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.48 vs. limit=6.0 2023-10-03 19:54:12,669 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.964e+02 2.128e+02 2.466e+02 3.367e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 19:54:12,845 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 19:54:14,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:54:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:54:14,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-03 19:54:15,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:16,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:18,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 19:54:20,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:54:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:54:21,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:54:23,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:54:23,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1387026.6666666667, ans=0.0 2023-10-03 19:54:25,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:54:26,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:54:26,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 19:54:29,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 19:54:29,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:54:30,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:54:30,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:54:32,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:54:33,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:36,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:54:36,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:54:36,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:54:38,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:54:40,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.32 vs. limit=10.0 2023-10-03 19:54:45,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 19:54:46,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:54:46,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 19:54:46,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:54:48,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:54:50,062 INFO [train.py:1046] (1/4) Epoch 40, batch 900, loss[loss=0.1497, simple_loss=0.2258, pruned_loss=0.03687, over 24364.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2376, pruned_loss=0.03912, over 4673048.55 frames. ], batch size: 56, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:54:50,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 19:54:55,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1387160.0, ans=0.125 2023-10-03 19:54:56,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1387160.0, ans=0.125 2023-10-03 19:54:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:54:59,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:55:00,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-10-03 19:55:03,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:55:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 19:55:03,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:55:05,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:55:05,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:05,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:55:06,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:55:07,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.03 vs. limit=10.0 2023-10-03 19:55:10,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1387226.6666666667, ans=0.125 2023-10-03 19:55:16,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:16,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 19:55:16,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:55:19,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:24,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 19:55:26,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:55:29,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:55:30,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:55:32,031 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 19:55:32,809 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.52 vs. limit=15.0 2023-10-03 19:55:33,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 19:55:37,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:55:37,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:55:39,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 19:55:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:44,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:55:45,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 19:55:45,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:48,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 19:55:49,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:55:49,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:55:51,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:55:56,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 19:55:56,229 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 19:55:57,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. 
Duration: 0.436375 2023-10-03 19:55:58,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 19:56:00,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:56:00,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1387426.6666666667, ans=0.125 2023-10-03 19:56:03,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 19:56:05,006 INFO [train.py:1046] (1/4) Epoch 40, batch 950, loss[loss=0.1634, simple_loss=0.2509, pruned_loss=0.03794, over 24337.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2379, pruned_loss=0.03906, over 4685163.40 frames. ], batch size: 77, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:56:07,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:11,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:12,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:12,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:56:13,886 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 19:56:16,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 19:56:18,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:56:19,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:19,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:56:19,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 19:56:21,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 19:56:22,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:24,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 19:56:26,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:56:30,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:30,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:56:30,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:56:31,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-10-03 19:56:34,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:56:34,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:56:35,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:56:41,687 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 1.970e+02 2.172e+02 2.506e+02 3.661e+02, threshold=4.343e+02, percent-clipped=0.0 2023-10-03 19:56:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:56:41,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:43,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 19:56:47,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 19:56:47,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:56:48,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:56:48,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:48,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 19:56:52,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 19:56:54,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:56:56,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:56:57,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:57,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 19:56:57,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:57,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:56:58,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 19:57:01,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:57:05,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:57:10,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:57:11,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. 
Duration: 0.91 2023-10-03 19:57:11,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 19:57:13,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:57:19,403 INFO [train.py:1046] (1/4) Epoch 40, batch 1000, loss[loss=0.1519, simple_loss=0.2411, pruned_loss=0.03139, over 24307.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2376, pruned_loss=0.03885, over 4705795.32 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:57:20,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 19:57:20,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:25,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:57:25,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 19:57:25,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 19:57:31,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:31,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:57:32,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:57:34,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 19:57:35,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1387893.3333333333, ans=0.125 2023-10-03 19:57:38,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 19:57:38,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1387893.3333333333, ans=0.125 2023-10-03 19:57:40,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 19:57:40,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:57:43,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 19:57:45,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 19:57:45,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 19:57:46,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:47,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:57,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 19:57:57,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:57:58,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:58,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:58,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 19:57:58,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:58:00,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:58:00,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:58:01,259 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 19:58:05,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 19:58:06,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 19:58:08,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 19:58:09,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 19:58:14,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:14,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:58:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:16,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:58:17,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 19:58:20,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:58:20,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 19:58:22,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 19:58:23,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:58:23,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:58:26,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:58:28,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 19:58:30,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:58:31,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1388093.3333333333, ans=0.0 2023-10-03 19:58:32,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:58:32,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:58:33,948 INFO [train.py:1046] (1/4) Epoch 40, batch 1050, loss[loss=0.1552, simple_loss=0.2249, pruned_loss=0.04275, over 23561.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03841, over 4705199.58 frames. ], batch size: 256, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:58:35,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:58:36,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:39,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:58:39,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1388160.0, ans=0.09899494936611666 2023-10-03 19:58:42,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 19:58:42,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1388160.0, ans=0.125 2023-10-03 19:58:44,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:58:46,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:58:48,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:58:48,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:58:48,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:58:50,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 19:58:52,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:58:52,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 19:58:54,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:58:55,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1388226.6666666667, ans=0.1 2023-10-03 19:58:56,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-10-03 19:58:56,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 19:58:57,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1388226.6666666667, ans=0.1 2023-10-03 19:58:57,911 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:59:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:59:01,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:59:01,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:59:04,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 19:59:04,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 19:59:04,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:59:09,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 19:59:10,269 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.893e+02 2.073e+02 2.289e+02 3.386e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 19:59:10,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. 
Duration: 0.73 2023-10-03 19:59:11,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:15,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 19:59:18,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 19:59:18,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:59:19,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:59:22,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:59:27,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 19:59:28,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 19:59:28,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 19:59:28,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:59:30,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:59:30,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.40 vs. 
limit=15.0 2023-10-03 19:59:31,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 19:59:35,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:59:36,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:59:36,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:59:37,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:59:37,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:40,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:40,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 19:59:43,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:59:43,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 19:59:43,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 19:59:44,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 19:59:47,896 INFO [train.py:1046] (1/4) Epoch 40, batch 1100, loss[loss=0.1537, simple_loss=0.2295, pruned_loss=0.03896, over 23785.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2362, pruned_loss=0.03853, over 4688560.93 frames. ], batch size: 212, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:59:49,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:59:51,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1388493.3333333333, ans=0.125 2023-10-03 19:59:54,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:59:57,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:00:00,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:00:00,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:01,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.14 vs. limit=22.5 2023-10-03 20:00:01,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 20:00:01,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:04,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 20:00:06,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:00:07,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:00:09,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 20:00:10,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:00:10,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:10,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:00:14,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:00:16,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:00:20,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:00:24,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 20:00:25,906 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 20:00:27,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:00:27,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1388626.6666666667, ans=0.0 2023-10-03 20:00:28,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:00:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:00:32,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1388693.3333333333, ans=0.125 2023-10-03 20:00:32,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1388693.3333333333, ans=0.125 2023-10-03 20:00:34,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 20:00:35,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:00:35,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:00:35,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:00:35,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:35,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. 
Duration: 0.77 2023-10-03 20:00:39,821 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:00:41,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:00:41,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 20:00:44,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:00:46,274 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.82 vs. limit=10.0 2023-10-03 20:00:46,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:00:50,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 20:00:50,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:00:51,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:53,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:55,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. 
Duration: 0.86 2023-10-03 20:00:55,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:00:55,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:58,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 20:00:58,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:00:58,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 20:00:59,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:00:59,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:01:01,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:01:02,414 INFO [train.py:1046] (1/4) Epoch 40, batch 1150, loss[loss=0.1581, simple_loss=0.234, pruned_loss=0.04114, over 22758.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2366, pruned_loss=0.0382, over 4696212.71 frames. ], batch size: 322, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:01:06,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:09,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 20:01:10,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-03 20:01:11,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:01:11,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:01:11,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 20:01:12,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:01:14,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 20:01:15,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:15,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:01:21,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 20:01:25,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:01:29,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:30,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 20:01:30,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 20:01:30,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:01:31,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:01:35,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 20:01:36,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:01:36,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:01:39,029 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.015e+02 2.225e+02 2.485e+02 5.014e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-03 20:01:45,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:45,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1389026.6666666667, ans=0.2 2023-10-03 20:01:51,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:51,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 20:01:52,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:01:52,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:01:57,332 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 20:01:57,935 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=15.0 2023-10-03 20:01:59,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:01:59,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1389026.6666666667, ans=0.1 2023-10-03 20:02:04,955 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 20:02:09,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:09,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:02:11,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:02:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:02:13,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:02:16,509 INFO [train.py:1046] (1/4) Epoch 40, batch 1200, loss[loss=0.1647, simple_loss=0.2517, pruned_loss=0.03887, over 23168.00 frames. 
], tot_loss[loss=0.1573, simple_loss=0.2373, pruned_loss=0.03859, over 4708558.33 frames. ], batch size: 105, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:02:18,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:02:18,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:02:21,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:02:21,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:21,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1389160.0, ans=0.125 2023-10-03 20:02:22,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:02:25,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:02:27,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:02:28,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:02:28,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:02:28,720 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:02:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 20:02:34,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 20:02:36,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:02:37,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:02:39,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1389226.6666666667, ans=0.125 2023-10-03 20:02:40,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:02:41,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:02:41,793 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 20:02:43,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:49,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:02:49,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:02:49,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 20:02:50,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:02:52,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1389293.3333333333, ans=0.0 2023-10-03 20:02:54,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 20:02:56,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1389293.3333333333, ans=0.125 2023-10-03 20:03:00,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 20:03:00,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:03:02,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:03:03,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:03:04,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:03:06,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:03:06,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 20:03:06,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:03:07,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 20:03:07,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:03:09,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:03:09,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:03:11,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:03:11,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:03:16,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:03:17,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:03:18,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-03 20:03:20,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. 
Duration: 0.97 2023-10-03 20:03:21,076 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:03:21,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.54 vs. limit=15.0 2023-10-03 20:03:24,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.61 vs. limit=22.5 2023-10-03 20:03:25,510 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 20:03:26,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:03:28,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:03:30,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:03:31,679 INFO [train.py:1046] (1/4) Epoch 40, batch 1250, loss[loss=0.1586, simple_loss=0.2537, pruned_loss=0.0317, over 24568.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2377, pruned_loss=0.0387, over 4712849.31 frames. ], batch size: 71, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:03:31,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:03:35,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 20:03:37,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 20:03:39,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:03:40,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 20:03:42,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:03:42,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:03:43,796 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:03:48,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:03:48,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:03:49,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:03:49,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:03:52,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:03:54,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1389560.0, ans=0.125 2023-10-03 20:03:55,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 20:03:55,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:03:55,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:03:55,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1389560.0, ans=0.125 2023-10-03 20:03:55,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1389560.0, ans=0.125 2023-10-03 20:03:58,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:03:58,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:01,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:03,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:04:09,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 20:04:09,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 20:04:10,318 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.900e+02 2.073e+02 2.356e+02 3.253e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-03 20:04:13,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:04:13,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 20:04:14,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:04:14,637 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 20:04:14,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:14,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:15,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.43 vs. limit=12.0 2023-10-03 20:04:19,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:22,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:23,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:04:23,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-10-03 20:04:23,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 20:04:24,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 20:04:25,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1389693.3333333333, ans=0.1 2023-10-03 20:04:27,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:04:29,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 20:04:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:31,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 20:04:32,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:04:34,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 20:04:34,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:04:34,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:04:34,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 20:04:35,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:04:38,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 20:04:40,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:04:41,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:04:41,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1389760.0, ans=0.125 2023-10-03 20:04:43,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:04:43,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1389760.0, ans=0.1 2023-10-03 20:04:44,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1389826.6666666667, ans=0.125 2023-10-03 20:04:44,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1389826.6666666667, ans=0.125 2023-10-03 20:04:45,735 INFO [train.py:1046] (1/4) Epoch 40, batch 1300, loss[loss=0.1479, simple_loss=0.2381, pruned_loss=0.02878, over 24654.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2383, pruned_loss=0.03896, over 4711005.12 frames. ], batch size: 68, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:04:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-10-03 20:04:47,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:04:49,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 20:04:50,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1389826.6666666667, ans=0.05 2023-10-03 20:04:54,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:04:55,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:04:57,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:04:58,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:05:00,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:05:00,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 20:05:06,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:05:06,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:05:09,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. 
Duration: 0.88 2023-10-03 20:05:11,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.17 vs. limit=22.5 2023-10-03 20:05:12,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:05:16,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:16,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:05:18,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:05:19,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:05:20,492 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:05:21,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:05:22,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 20:05:27,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:05:27,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 20:05:29,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 20:05:29,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:05:31,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:05:34,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:05:34,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 20:05:35,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:05:35,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 20:05:37,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:05:37,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1390026.6666666667, ans=0.125 2023-10-03 20:05:41,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:05:41,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 20:05:42,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.81 vs. limit=15.0 2023-10-03 20:05:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 20:05:44,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 20:05:44,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 20:05:50,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:05:53,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 20:05:54,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:55,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1390093.3333333333, ans=0.0 2023-10-03 20:05:59,087 INFO [train.py:1046] (1/4) Epoch 40, batch 1350, loss[loss=0.1622, simple_loss=0.2476, pruned_loss=0.03842, over 23875.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.238, pruned_loss=0.03863, over 4729526.56 frames. ], batch size: 86, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:06:00,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 20:06:04,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 20:06:05,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:07,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:06:08,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.21 vs. limit=15.0 2023-10-03 20:06:08,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:10,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:06:11,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:06:16,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:06:17,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 20:06:17,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:06:17,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:06:20,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 20:06:22,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:06:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:06:23,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 20:06:26,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 20:06:26,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 20:06:27,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:27,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 20:06:31,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1390293.3333333333, ans=0.0 2023-10-03 20:06:38,029 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.449e+02 1.913e+02 2.157e+02 2.390e+02 3.072e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-03 20:06:39,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:49,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:49,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:06:51,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. 
Duration: 0.85 2023-10-03 20:06:54,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:06:54,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 20:06:54,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:06:55,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:58,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:07:01,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 20:07:04,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:07:07,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 20:07:10,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 20:07:13,511 INFO [train.py:1046] (1/4) Epoch 40, batch 1400, loss[loss=0.1649, simple_loss=0.237, pruned_loss=0.04644, over 23409.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2366, pruned_loss=0.03843, over 4719900.37 frames. ], batch size: 134, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:07:16,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. 
Duration: 0.87 2023-10-03 20:07:18,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:07:21,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:07:22,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:07:26,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 20:07:28,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 20:07:36,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:07:39,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:07:40,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:07:40,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:07:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:07:46,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-10-03 20:07:49,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1390626.6666666667, ans=0.035 2023-10-03 20:07:54,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:07:55,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:07:58,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1390693.3333333333, ans=0.1 2023-10-03 20:07:59,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 20:08:00,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:08:01,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:08:02,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:08:02,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:08:04,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:08:04,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:08:04,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 20:08:06,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 20:08:06,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:08:06,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1390693.3333333333, ans=0.1 2023-10-03 20:08:10,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:13,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:08:22,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 20:08:22,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 20:08:23,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:08:25,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1390760.0, ans=10.0 2023-10-03 20:08:26,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 20:08:26,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:08:27,593 INFO [train.py:1046] (1/4) Epoch 40, batch 1450, loss[loss=0.1681, simple_loss=0.2548, pruned_loss=0.04066, over 24319.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.235, pruned_loss=0.03823, over 4699290.56 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:08:29,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:08:31,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1390826.6666666667, ans=0.125 2023-10-03 20:08:33,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:08:35,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:08:35,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:36,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 20:08:41,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:08:41,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:08:42,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:08:42,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-10-03 20:08:43,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:08:44,321 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=12.0 2023-10-03 20:08:44,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.77 vs. limit=22.5 2023-10-03 20:08:45,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 20:08:45,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:46,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:46,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 20:08:47,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:08:47,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:08:49,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 20:08:49,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 20:08:49,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1390893.3333333333, ans=0.125 2023-10-03 20:08:50,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:08:53,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:55,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:55,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1390960.0, ans=0.0 2023-10-03 20:08:59,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:08:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:09:01,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:09:01,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:09:03,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1390960.0, ans=0.1 2023-10-03 20:09:04,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 20:09:04,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:09:04,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:09:05,923 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.904e+02 2.065e+02 2.334e+02 4.319e+02, threshold=4.131e+02, percent-clipped=1.0 2023-10-03 20:09:05,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:09,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 20:09:10,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:09:12,177 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 20:09:14,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:09:16,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:09:17,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:18,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 20:09:22,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:09:23,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 20:09:23,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1391026.6666666667, ans=0.0 2023-10-03 20:09:24,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 20:09:26,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:28,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:09:28,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:09:29,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 20:09:32,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 20:09:34,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 20:09:34,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1391093.3333333333, ans=0.125 2023-10-03 20:09:36,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:09:36,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1391093.3333333333, ans=0.125 2023-10-03 20:09:37,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:09:40,855 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.74 vs. limit=15.0 2023-10-03 20:09:41,641 INFO [train.py:1046] (1/4) Epoch 40, batch 1500, loss[loss=0.1849, simple_loss=0.2489, pruned_loss=0.06043, over 22723.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.0383, over 4705023.03 frames. ], batch size: 322, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:09:47,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 20:09:47,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:09:47,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:09:48,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:48,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:09:49,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:09:51,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. 
Duration: 0.65 2023-10-03 20:09:51,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:09:53,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:09:53,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:09:53,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:09:56,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:09:57,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:02,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.80 vs. limit=15.0 2023-10-03 20:10:03,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:03,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 20:10:05,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:10:05,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 20:10:07,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:10:09,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 20:10:14,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 20:10:15,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:10:16,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 20:10:18,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:10:20,588 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.19 vs. limit=22.5 2023-10-03 20:10:21,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:10:22,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:10:22,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:10:24,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 20:10:24,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 20:10:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:10:24,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1391360.0, ans=0.125 2023-10-03 20:10:25,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 20:10:25,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:10:31,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:10:31,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 20:10:35,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:10:36,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:10:40,939 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 20:10:41,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:41,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 20:10:42,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:10:43,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:10:43,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 20:10:45,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:10:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 20:10:49,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:49,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1391426.6666666667, ans=0.0 2023-10-03 20:10:51,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:51,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:52,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:53,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:10:54,972 INFO [train.py:1046] (1/4) Epoch 40, batch 1550, loss[loss=0.211, simple_loss=0.2831, pruned_loss=0.0695, over 19261.00 frames. 
], tot_loss[loss=0.1567, simple_loss=0.2368, pruned_loss=0.03828, over 4692394.73 frames. ], batch size: 388, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:10:55,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 20:10:56,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 20:10:56,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:10:58,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 20:10:58,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 20:10:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:11:01,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:02,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:11:02,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:11:05,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:05,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 20:11:09,110 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 20:11:10,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:10,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:11:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:11:11,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:11:11,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 20:11:13,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1391560.0, ans=0.0 2023-10-03 20:11:14,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:11:14,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 20:11:17,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 20:11:17,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 20:11:17,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:11:18,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:22,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:11:24,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 20:11:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 20:11:26,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1391626.6666666667, ans=0.125 2023-10-03 20:11:29,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1391626.6666666667, ans=0.125 2023-10-03 20:11:33,604 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.912e+02 2.147e+02 2.431e+02 4.744e+02, threshold=4.295e+02, percent-clipped=1.0 2023-10-03 20:11:33,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:38,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1391693.3333333333, ans=0.1 2023-10-03 20:11:39,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:11:39,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 20:11:39,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:11:39,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 20:11:45,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:11:46,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:49,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:11:52,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:11:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:52,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 20:11:52,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:11:54,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:11:54,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:11:54,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1391760.0, ans=0.0 2023-10-03 20:11:55,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 20:11:55,606 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 20:11:57,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:03,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 20:12:09,090 INFO [train.py:1046] (1/4) Epoch 40, batch 1600, loss[loss=0.1353, simple_loss=0.2236, pruned_loss=0.02353, over 24674.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03802, over 4709518.53 frames. ], batch size: 65, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:12:09,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:12:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:09,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 20:12:11,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:12:12,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:12:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:12:12,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:12:13,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:12:17,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:17,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 20:12:17,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 20:12:19,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 20:12:20,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:12:22,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 20:12:22,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:12:25,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:12:25,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1391893.3333333333, ans=0.04949747468305833 2023-10-03 20:12:29,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:12:30,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1391893.3333333333, ans=0.2 2023-10-03 20:12:34,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 20:12:36,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:12:38,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 20:12:38,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:38,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.89 vs. limit=15.0 2023-10-03 20:12:40,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 20:12:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-03 20:12:45,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1391960.0, ans=0.09899494936611666 2023-10-03 20:12:51,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:51,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 20:12:52,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:52,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:12:52,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:12:55,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 20:12:59,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:13:02,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:13:02,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 20:13:03,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1392026.6666666667, ans=0.125 2023-10-03 20:13:04,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:13:06,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:13:08,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:13:10,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:13:16,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:16,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:13:18,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 20:13:18,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:13:20,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. 
Duration: 0.72 2023-10-03 20:13:23,586 INFO [train.py:1046] (1/4) Epoch 40, batch 1650, loss[loss=0.1518, simple_loss=0.2232, pruned_loss=0.04021, over 23654.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03829, over 4710759.40 frames. ], batch size: 232, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:13:23,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:13:26,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:13:26,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:13:27,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 20:13:27,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 20:13:27,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 20:13:27,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 20:13:33,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:33,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:13:33,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 20:13:33,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:13:36,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:13:38,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 20:13:41,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:13:41,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:13:41,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:13:41,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:13:41,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 20:13:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 20:13:48,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:13:48,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1392226.6666666667, ans=0.125 2023-10-03 20:13:50,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 20:13:58,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 20:13:58,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:13:59,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 20:14:01,154 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.943e+02 2.116e+02 2.383e+02 3.563e+02, threshold=4.232e+02, percent-clipped=0.0 2023-10-03 20:14:01,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:04,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:14:04,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:14:04,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1392293.3333333333, ans=0.2 2023-10-03 20:14:06,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:08,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:14:08,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:10,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:14:11,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:11,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:14:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:14:14,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:14:15,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:14:16,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1392360.0, ans=0.1 2023-10-03 20:14:17,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:14:17,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 20:14:18,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:14:18,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 20:14:20,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 20:14:21,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. 
Duration: 0.7 2023-10-03 20:14:21,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:14:21,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:14:21,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1392426.6666666667, ans=0.1 2023-10-03 20:14:22,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:22,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:22,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 20:14:24,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1392426.6666666667, ans=0.0 2023-10-03 20:14:25,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:28,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:14:28,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1392426.6666666667, ans=0.125 2023-10-03 20:14:29,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 20:14:30,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1392426.6666666667, ans=0.125 2023-10-03 20:14:31,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 20:14:35,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:35,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:14:35,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 20:14:37,330 INFO [train.py:1046] (1/4) Epoch 40, batch 1700, loss[loss=0.1624, simple_loss=0.2308, pruned_loss=0.047, over 23807.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2362, pruned_loss=0.03837, over 4714274.32 frames. ], batch size: 150, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:14:37,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:14:37,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:14:37,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:40,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:14:40,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 20:14:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 20:14:44,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:14:50,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:53,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:14:54,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1392560.0, ans=0.0 2023-10-03 20:15:00,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:15:00,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:15:02,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:15:02,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:15:03,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 20:15:05,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:15:06,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 20:15:06,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:15:09,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:15:10,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 20:15:12,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 20:15:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:14,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 20:15:15,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:15:22,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:23,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:23,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:15:26,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:15:26,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. 
Duration: 0.363625 2023-10-03 20:15:26,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:15:30,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:30,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 20:15:30,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:15:30,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:15:31,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.81 vs. limit=22.5 2023-10-03 20:15:32,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:32,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:15:32,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1392693.3333333333, ans=0.125 2023-10-03 20:15:34,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:15:34,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 20:15:35,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:35,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:15:35,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:37,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1392760.0, ans=0.0 2023-10-03 20:15:41,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:15:41,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 20:15:44,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:44,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1392760.0, ans=0.125 2023-10-03 20:15:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:15:48,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 20:15:50,206 INFO [train.py:1046] (1/4) Epoch 40, batch 1750, loss[loss=0.1477, simple_loss=0.2234, pruned_loss=0.03601, over 24418.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2354, pruned_loss=0.03802, over 4720234.80 frames. 
], batch size: 58, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:15:50,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1392826.6666666667, ans=0.2 2023-10-03 20:15:53,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:54,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1392826.6666666667, ans=0.125 2023-10-03 20:15:55,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:15:57,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:15:57,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 20:15:58,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:16:00,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:16:00,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:00,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1392826.6666666667, ans=0.125 2023-10-03 20:16:06,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. 
Duration: 0.43 2023-10-03 20:16:06,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1392893.3333333333, ans=0.025 2023-10-03 20:16:06,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1392893.3333333333, ans=0.2 2023-10-03 20:16:07,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:09,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 20:16:09,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:16:10,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:16:14,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:16:15,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 20:16:16,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:16:18,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 20:16:25,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:16:26,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 20:16:26,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:16:29,453 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 1.982e+02 2.207e+02 2.647e+02 3.651e+02, threshold=4.414e+02, percent-clipped=0.0 2023-10-03 20:16:29,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:29,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:16:31,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:16:34,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:37,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:16:38,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:16:39,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 20:16:42,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:16:43,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. 
Duration: 0.8 2023-10-03 20:16:45,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:16:46,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:46,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:16:50,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:16:50,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 20:16:50,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:52,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:16:53,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1393093.3333333333, ans=0.1 2023-10-03 20:16:53,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1393093.3333333333, ans=0.0 2023-10-03 20:16:54,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:59,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:16:59,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 20:16:59,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.70 vs. limit=15.0 2023-10-03 20:17:02,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 20:17:02,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:17:03,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.44 vs. limit=22.5 2023-10-03 20:17:03,618 INFO [train.py:1046] (1/4) Epoch 40, batch 1800, loss[loss=0.163, simple_loss=0.2541, pruned_loss=0.03599, over 24021.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2353, pruned_loss=0.03788, over 4722884.47 frames. ], batch size: 80, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:17:03,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:17:03,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:03,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:17:03,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:17:03,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:17:07,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 20:17:07,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1393160.0, ans=0.035 2023-10-03 20:17:08,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:17:11,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:17:13,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:17:16,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:17:17,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:17:20,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:17:21,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:23,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:23,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:17:25,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:17:25,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. 
Duration: 0.9 2023-10-03 20:17:27,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:29,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:34,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 20:17:36,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 20:17:37,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 20:17:37,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:17:39,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:39,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:17:39,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:17:40,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.89 vs. limit=15.0 2023-10-03 20:17:46,133 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 20:17:47,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 20:17:50,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:51,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 20:17:52,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 20:17:52,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:17:53,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:17:54,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:17:56,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1393360.0, ans=0.09899494936611666 2023-10-03 20:17:59,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 20:18:05,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:18:07,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 20:18:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:18:07,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:18:08,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:18:08,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 20:18:10,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:18:10,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:18:13,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 20:18:13,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:18:15,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:18:16,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:18:16,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:18:16,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:18:18,161 INFO [train.py:1046] (1/4) Epoch 40, batch 1850, loss[loss=0.171, simple_loss=0.237, pruned_loss=0.05246, over 19486.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2361, pruned_loss=0.03829, over 4725707.90 frames. 
], batch size: 388, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:18:18,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:18:19,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:18:19,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:18:22,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:18:22,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:18:29,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:18:30,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 20:18:33,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 20:18:33,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1393560.0, ans=0.0 2023-10-03 20:18:36,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 20:18:37,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1393560.0, ans=0.125 2023-10-03 20:18:39,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:18:41,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 20:18:41,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 20:18:52,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:18:54,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 20:18:57,064 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.932e+02 2.118e+02 2.522e+02 3.488e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 20:18:57,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:18:57,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:18:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 20:19:01,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:01,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:19:02,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:19:02,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 20:19:03,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1393693.3333333333, ans=0.125 2023-10-03 20:19:05,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:19:08,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1393693.3333333333, ans=0.1 2023-10-03 20:19:09,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:19:10,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:10,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:19:10,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:12,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1393693.3333333333, ans=0.125 2023-10-03 20:19:13,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:19:14,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:19:17,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-10-03 20:19:19,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:19:22,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:19:23,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:19:24,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 20:19:24,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 20:19:25,398 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 20:19:25,495 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 20:19:26,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:19:28,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:19:28,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:19:28,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:28,255 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. 
Duration: 0.896 2023-10-03 20:19:29,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:19:29,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:19:29,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1393826.6666666667, ans=0.1 2023-10-03 20:19:30,887 INFO [train.py:1046] (1/4) Epoch 40, batch 1900, loss[loss=0.1685, simple_loss=0.245, pruned_loss=0.04603, over 23697.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2366, pruned_loss=0.03835, over 4724061.95 frames. ], batch size: 212, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:19:30,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:19:31,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:19:31,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 20:19:33,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 20:19:33,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 20:19:33,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:35,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1393826.6666666667, ans=0.125 2023-10-03 20:19:39,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:42,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:19:42,161 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 20:19:43,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 20:19:44,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:19:44,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:19:44,931 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 20:19:45,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1393893.3333333333, ans=0.1 2023-10-03 20:19:46,900 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 20:19:50,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-10-03 20:19:50,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:19:54,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1393893.3333333333, ans=0.0 2023-10-03 20:19:55,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 20:19:55,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1393893.3333333333, ans=0.0 2023-10-03 20:19:58,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 20:19:58,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1393893.3333333333, ans=0.125 2023-10-03 20:20:02,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1393960.0, ans=0.0 2023-10-03 20:20:05,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1393960.0, ans=0.125 2023-10-03 20:20:10,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 20:20:11,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 20:20:13,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:13,520 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. 
Duration: 0.896 2023-10-03 20:20:13,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 20:20:13,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 20:20:14,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.10 vs. limit=22.5 2023-10-03 20:20:14,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 20:20:14,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:20:17,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 20:20:22,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:20:25,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:20:25,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 20:20:26,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:20:29,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 20:20:31,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 20:20:35,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1394093.3333333333, ans=0.1 2023-10-03 20:20:36,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:20:36,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:20:36,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:20:38,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:20:40,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:20:40,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:20:40,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:20:43,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:20:43,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:20:46,360 INFO [train.py:1046] (1/4) Epoch 40, batch 1950, loss[loss=0.1625, simple_loss=0.245, pruned_loss=0.04001, over 23090.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2375, pruned_loss=0.03851, over 4714756.20 frames. 
], batch size: 105, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:20:46,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:20:46,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:20:46,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:20:47,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:20:51,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:20:52,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:20:53,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:53,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:20:55,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 20:20:57,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 20:20:57,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:20:58,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:01,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:21:02,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:02,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:05,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:21:08,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:21:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:21:08,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:21:08,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:13,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:16,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:21:16,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:21:16,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:21:16,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 20:21:17,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:21:17,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:21:19,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:20,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:23,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:21:26,797 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.972e+02 2.252e+02 2.613e+02 4.035e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-03 20:21:26,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:21:27,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1394293.3333333333, ans=0.1 2023-10-03 20:21:29,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 20:21:31,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:21:31,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 20:21:31,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:21:35,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:21:35,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:21:36,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:21:45,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:48,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:49,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:52,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:54,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:21:54,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:21:55,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 20:21:55,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:21:57,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:59,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 20:22:00,503 INFO [train.py:1046] (1/4) Epoch 40, batch 2000, loss[loss=0.1786, simple_loss=0.2654, pruned_loss=0.04589, over 24045.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2378, pruned_loss=0.03871, over 4703432.45 frames. ], batch size: 80, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:22:00,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:22:04,173 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.90 vs. limit=15.0 2023-10-03 20:22:04,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:22:04,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:22:06,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:22:07,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 20:22:07,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1394493.3333333333, ans=0.2 2023-10-03 20:22:09,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:11,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 20:22:13,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:22:14,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:22:16,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 20:22:17,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:22:18,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:22:20,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:22:22,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1394560.0, ans=10.0 2023-10-03 20:22:23,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 20:22:24,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:22:24,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1394560.0, ans=0.0 2023-10-03 20:22:27,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 20:22:28,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:22:30,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 20:22:30,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:22:30,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1394626.6666666667, ans=0.2 2023-10-03 20:22:33,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:22:34,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:22:34,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:34,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:22:35,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:22:37,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 20:22:38,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 20:22:38,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:22:38,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:22:44,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:46,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:22:46,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:22:47,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:22:49,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:22:49,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:49,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 20:22:49,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:52,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:55,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:22:57,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 20:23:01,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:23:03,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:07,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:07,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:23:08,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:11,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:23:11,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:12,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 20:23:12,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:23:14,231 INFO [train.py:1046] (1/4) Epoch 40, batch 2050, loss[loss=0.1524, simple_loss=0.2144, pruned_loss=0.04515, over 23536.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2363, pruned_loss=0.0382, over 4707245.56 frames. ], batch size: 256, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:23:15,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:15,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:18,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:23:20,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:22,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1394826.6666666667, ans=0.125 2023-10-03 20:23:25,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:23:26,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:23:26,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:28,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 20:23:30,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 20:23:30,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:23:31,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:23:31,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:23:39,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:23:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:41,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 20:23:43,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1394960.0, ans=0.0 2023-10-03 20:23:45,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:47,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 20:23:47,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:23:47,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.15 vs. 
limit=12.0 2023-10-03 20:23:50,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:23:53,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:23:54,427 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.914e+02 2.146e+02 2.312e+02 3.091e+02, threshold=4.293e+02, percent-clipped=0.0 2023-10-03 20:23:54,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:23:54,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:23:55,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:23:56,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1394960.0, ans=0.125 2023-10-03 20:23:57,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:23:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:24:00,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:01,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 20:24:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:24:04,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:24:08,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:24:13,742 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.29 vs. limit=12.0 2023-10-03 20:24:14,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:24:15,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 20:24:20,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:24:21,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:24:22,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:24:24,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 20:24:28,259 INFO [train.py:1046] (1/4) Epoch 40, batch 2100, loss[loss=0.1465, simple_loss=0.2235, pruned_loss=0.03477, over 23850.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2347, pruned_loss=0.03807, over 4687804.26 frames. 
], batch size: 212, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:24:28,324 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 20:24:28,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:24:28,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:28,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:24:30,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:24:31,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 20:24:31,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 20:24:33,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:24:34,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:24:35,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:24:38,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:24:39,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:24:39,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 20:24:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:24:42,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 20:24:42,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 20:24:44,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:24:44,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:24:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 20:24:44,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 20:24:47,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1395226.6666666667, ans=0.0 2023-10-03 20:24:50,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 20:24:50,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-03 20:24:52,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1395226.6666666667, ans=0.0 2023-10-03 20:24:53,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:24:53,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:56,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:24:58,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 20:24:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:24:58,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 20:24:59,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 20:25:01,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:01,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 20:25:01,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 20:25:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. 
Duration: 0.94 2023-10-03 20:25:02,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1395293.3333333333, ans=0.1 2023-10-03 20:25:05,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:25:06,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:25:08,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:25:09,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:25:09,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:10,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1395293.3333333333, ans=0.2 2023-10-03 20:25:11,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:11,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 20:25:12,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:12,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:12,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 20:25:12,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 20:25:12,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1395360.0, ans=0.125 2023-10-03 20:25:14,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 20:25:15,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 20:25:20,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:25:21,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.19 vs. limit=15.0 2023-10-03 20:25:23,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:25:24,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 20:25:27,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:31,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:25:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:25:32,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 20:25:32,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 20:25:32,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:25:32,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1395426.6666666667, ans=0.0 2023-10-03 20:25:33,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:33,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:25:35,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:25:35,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:36,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 20:25:36,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1395426.6666666667, ans=0.125 2023-10-03 20:25:38,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 20:25:38,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:25:40,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:25:40,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:25:40,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:25:42,008 INFO [train.py:1046] (1/4) Epoch 40, batch 2150, loss[loss=0.1619, simple_loss=0.2526, pruned_loss=0.03562, over 24505.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2341, pruned_loss=0.0378, over 4688715.80 frames. ], batch size: 66, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:25:42,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:25:45,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 20:25:45,796 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:25:47,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:25:48,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:51,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:25:51,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:25:51,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 20:25:55,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:25:55,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:25:59,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:25:59,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 20:26:04,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:05,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:26:05,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:05,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:07,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:07,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:26:07,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:26:07,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:26:08,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:26:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 20:26:10,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:26:11,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:26:13,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:15,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:26:16,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:26:19,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:26:19,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 20:26:21,711 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.888e+02 2.102e+02 2.294e+02 3.502e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-03 20:26:21,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:21,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 20:26:21,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:26:24,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:24,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1395693.3333333333, ans=0.0 2023-10-03 20:26:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:27,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:27,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:26:29,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:26:29,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1395693.3333333333, ans=0.125 2023-10-03 20:26:30,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:30,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 20:26:32,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 20:26:32,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:26:32,163 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 20:26:32,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:32,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:26:34,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 20:26:34,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:26:34,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 20:26:34,209 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. 
Duration: 0.768 2023-10-03 20:26:34,209 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 20:26:35,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 20:26:36,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:36,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:26:38,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:26:39,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:39,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:26:41,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:41,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:50,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:26:51,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. 
Duration: 0.83 2023-10-03 20:26:51,974 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:26:54,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:26:56,444 INFO [train.py:1046] (1/4) Epoch 40, batch 2200, loss[loss=0.1852, simple_loss=0.249, pruned_loss=0.06072, over 19501.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2349, pruned_loss=0.03816, over 4686330.80 frames. ], batch size: 388, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:27:01,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:01,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:27:03,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:03,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:27:06,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:27:06,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:27:06,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 20:27:10,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-10-03 20:27:14,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:27:21,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 20:27:22,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:27:24,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:27:27,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:27:27,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 20:27:29,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1395960.0, ans=0.0 2023-10-03 20:27:29,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1395960.0, ans=0.125 2023-10-03 20:27:30,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:27:30,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=15.0 2023-10-03 20:27:31,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:27:31,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 20:27:34,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:27:36,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:27:36,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1395960.0, ans=0.0 2023-10-03 20:27:37,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1395960.0, ans=0.125 2023-10-03 20:27:38,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:27:39,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 20:27:43,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1396026.6666666667, ans=0.1 2023-10-03 20:27:44,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:45,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 20:27:48,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 20:27:48,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:27:50,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:51,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:27:52,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:27:52,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:52,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:54,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:27:55,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:27:57,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:27:58,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:28:00,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:28:01,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 20:28:03,092 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 20:28:05,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:28:05,853 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 20:28:07,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:28:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 20:28:08,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:09,907 INFO [train.py:1046] (1/4) Epoch 40, batch 2250, loss[loss=0.147, simple_loss=0.2336, pruned_loss=0.03025, over 24477.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2357, pruned_loss=0.03808, over 4699052.44 frames. ], batch size: 63, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:28:09,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:28:12,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 20:28:14,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:28:18,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 20:28:21,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:28:22,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:28:25,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:27,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:28:28,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:28:29,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 20:28:31,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:28:31,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:28:32,283 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.23 vs. limit=15.0 2023-10-03 20:28:34,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 20:28:35,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:28:35,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 20:28:37,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:28:37,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1396226.6666666667, ans=0.0 2023-10-03 20:28:43,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:28:44,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:28:44,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:28:46,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 20:28:46,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.whiten.whitening_limit, batch_count=1396293.3333333333, ans=12.0 2023-10-03 20:28:47,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:50,639 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.901e+02 2.050e+02 2.328e+02 3.368e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-03 20:28:50,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 20:28:53,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1396360.0, ans=0.1 2023-10-03 20:28:55,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:28:56,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:28:58,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:58,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:28:59,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:29:01,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:29:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:29:06,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1396360.0, ans=0.125 2023-10-03 20:29:07,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:29:10,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 20:29:10,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:29:11,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:29:16,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:29:19,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:29:19,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 20:29:19,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:19,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:29:23,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 20:29:25,044 INFO [train.py:1046] (1/4) Epoch 40, batch 2300, loss[loss=0.1985, simple_loss=0.2678, pruned_loss=0.06462, over 19327.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2367, pruned_loss=0.03817, over 4711927.68 frames. ], batch size: 388, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:29:25,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:29:25,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 20:29:29,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:30,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:29:33,989 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 20:29:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:29:40,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:29:40,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:29:42,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:29:42,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:29:42,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 20:29:42,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:29:45,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:29:46,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 20:29:49,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:29:52,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:29:53,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1396626.6666666667, ans=0.0 2023-10-03 20:29:55,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:00,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:30:00,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:30:04,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:30:06,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:30:06,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1396626.6666666667, ans=0.125 2023-10-03 20:30:09,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:30:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 20:30:12,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:30:12,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 20:30:12,492 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:30:15,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1396693.3333333333, ans=0.125 2023-10-03 20:30:18,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:30:18,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:18,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:30:18,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:30:18,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 20:30:19,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:30:19,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. 
Duration: 0.97 2023-10-03 20:30:19,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:30:19,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:21,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 20:30:21,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.51 vs. limit=22.5 2023-10-03 20:30:28,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:30:32,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:30:34,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:30:35,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:30:35,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:30:35,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1396760.0, ans=0.0 2023-10-03 20:30:36,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 20:30:38,907 INFO [train.py:1046] (1/4) Epoch 40, batch 2350, loss[loss=0.1604, simple_loss=0.2363, pruned_loss=0.04228, over 23410.00 frames. ], tot_loss[loss=0.158, simple_loss=0.238, pruned_loss=0.03898, over 4708460.17 frames. ], batch size: 119, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:30:38,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:30:39,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:30:39,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1396826.6666666667, ans=0.2 2023-10-03 20:30:40,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 20:30:45,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:30:45,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 20:30:49,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 20:30:52,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:56,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:56,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 20:30:56,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:30:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:30:58,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 20:30:59,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:31:07,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 20:31:08,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:31:10,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:31:10,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:31:12,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:31:14,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 20:31:14,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:31:17,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 20:31:17,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:31:17,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:31:19,436 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.056e+02 2.202e+02 2.513e+02 4.106e+02, threshold=4.405e+02, percent-clipped=1.0 2023-10-03 20:31:21,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:31:22,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1397026.6666666667, ans=0.1 2023-10-03 20:31:23,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 20:31:23,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1397026.6666666667, ans=0.125 2023-10-03 20:31:24,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:31:26,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:31:26,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:31:29,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 20:31:30,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 20:31:31,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1397026.6666666667, ans=0.125 2023-10-03 20:31:32,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 20:31:32,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:31:38,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 20:31:38,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1397093.3333333333, ans=0.1 2023-10-03 20:31:40,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 20:31:40,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1397093.3333333333, ans=0.0 2023-10-03 20:31:41,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:31:41,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 20:31:41,423 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 20:31:41,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 20:31:44,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-10-03 20:31:46,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:31:50,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:31:53,171 INFO [train.py:1046] (1/4) Epoch 40, batch 2400, loss[loss=0.1371, simple_loss=0.2139, pruned_loss=0.0301, over 24345.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2381, pruned_loss=0.03885, over 4714395.76 frames. ], batch size: 56, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:31:54,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:31:56,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1397160.0, ans=0.0 2023-10-03 20:31:57,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:31:59,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 20:31:59,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 20:32:02,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1397160.0, ans=0.2 2023-10-03 20:32:06,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:32:06,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:32:08,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 20:32:08,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:32:09,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:09,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 20:32:15,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1397226.6666666667, ans=0.2 2023-10-03 20:32:16,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:17,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 20:32:22,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:32:24,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1397293.3333333333, ans=0.125 2023-10-03 20:32:26,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 20:32:28,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:32:30,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 20:32:33,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:32:33,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 20:32:34,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:32:39,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1397360.0, ans=0.0 2023-10-03 20:32:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:32:44,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:32:44,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1397360.0, ans=0.09899494936611666 2023-10-03 20:32:47,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:32:48,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:32:48,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:32:48,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:32:48,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:32:50,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:32:50,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:32:54,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:32:55,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:32:55,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 20:32:56,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=12.0 2023-10-03 20:32:58,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 20:33:01,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:33:01,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:33:01,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 20:33:02,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. 
Duration: 0.91 2023-10-03 20:33:02,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 20:33:02,838 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 20:33:04,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 20:33:04,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:33:05,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:05,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:33:06,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1397493.3333333333, ans=15.0 2023-10-03 20:33:06,992 INFO [train.py:1046] (1/4) Epoch 40, batch 2450, loss[loss=0.1645, simple_loss=0.2434, pruned_loss=0.04283, over 23312.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.237, pruned_loss=0.03842, over 4708992.60 frames. ], batch size: 105, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:33:07,069 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 20:33:07,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:08,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 20:33:10,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1397493.3333333333, ans=0.0 2023-10-03 20:33:11,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:33:11,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:33:15,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:15,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:17,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 20:33:22,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:33:22,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:26,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:33:27,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:33:27,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:33:27,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. 
Duration: 0.77 2023-10-03 20:33:32,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:34,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:33:34,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:33:34,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1397560.0, ans=0.2 2023-10-03 20:33:35,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1397626.6666666667, ans=0.2 2023-10-03 20:33:36,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:33:36,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:38,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:38,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:40,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 20:33:41,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 20:33:46,905 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.910e+02 2.084e+02 2.402e+02 3.633e+02, threshold=4.168e+02, percent-clipped=0.0 2023-10-03 20:33:48,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1397626.6666666667, ans=0.0 2023-10-03 20:33:49,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:50,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:50,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:33:51,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:33:51,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:53,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:33:54,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 20:33:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:57,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 20:33:59,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:00,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.51 vs. limit=6.0 2023-10-03 20:34:01,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:34:04,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:34:04,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 20:34:05,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:34:06,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:34:08,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 20:34:08,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:34:08,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 20:34:10,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1397760.0, ans=0.125 2023-10-03 20:34:13,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1397760.0, ans=0.125 2023-10-03 20:34:14,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:34:15,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:34:15,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:34:20,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 20:34:21,408 INFO [train.py:1046] (1/4) Epoch 40, batch 2500, loss[loss=0.1709, simple_loss=0.2554, pruned_loss=0.04321, over 24301.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2359, pruned_loss=0.03802, over 4707602.28 frames. ], batch size: 77, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:34:21,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:34:29,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:34:32,715 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-10-03 20:34:36,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 20:34:36,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:34:38,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:34:38,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 20:34:45,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:34:45,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:34:45,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:34:45,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:34:47,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 20:34:48,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:34:48,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:49,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 20:34:49,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:34:50,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 20:34:50,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:34:52,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1397960.0, ans=0.125 2023-10-03 20:34:56,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:34:57,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:57,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1397960.0, ans=0.125 2023-10-03 20:34:59,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:35:00,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 20:35:00,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:35:04,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:35:04,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1397960.0, ans=0.125 2023-10-03 20:35:04,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1397960.0, ans=0.0 2023-10-03 20:35:06,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:09,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:11,713 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.99 vs. limit=22.5 2023-10-03 20:35:12,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:35:16,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:35:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 20:35:19,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:35:19,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. 
Duration: 0.42725 2023-10-03 20:35:19,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1398093.3333333333, ans=0.09899494936611666 2023-10-03 20:35:21,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:35:21,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:35:23,173 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 20:35:23,173 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 20:35:23,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 20:35:24,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:27,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 20:35:27,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 20:35:28,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:35:28,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 20:35:33,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. 
Duration: 0.98 2023-10-03 20:35:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:35:34,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:35:35,916 INFO [train.py:1046] (1/4) Epoch 40, batch 2550, loss[loss=0.1486, simple_loss=0.2277, pruned_loss=0.03479, over 24471.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03844, over 4703816.45 frames. ], batch size: 58, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:35:35,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:35:38,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:35:38,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 20:35:40,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:35:43,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 20:35:44,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:35:47,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:48,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:35:48,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 20:35:49,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:35:49,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1398226.6666666667, ans=0.125 2023-10-03 20:35:49,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1398226.6666666667, ans=0.125 2023-10-03 20:35:50,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:35:50,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:53,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:35:53,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 20:35:53,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:35:53,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:53,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-03 20:36:07,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:36:11,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:11,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:11,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:36:11,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1398293.3333333333, ans=0.125 2023-10-03 20:36:13,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:36:16,969 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.930e+02 2.153e+02 2.349e+02 3.303e+02, threshold=4.307e+02, percent-clipped=0.0 2023-10-03 20:36:21,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:36:22,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:36:22,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:36:22,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 20:36:23,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:36:23,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:36:24,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1398360.0, ans=0.125 2023-10-03 20:36:26,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1398360.0, ans=0.0 2023-10-03 20:36:27,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:27,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:32,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:36:32,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 20:36:32,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:36:33,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:33,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:36:34,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 20:36:36,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:36:37,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1398426.6666666667, ans=0.0 2023-10-03 20:36:42,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1398426.6666666667, ans=0.125 2023-10-03 20:36:43,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:36:43,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1398426.6666666667, ans=0.2 2023-10-03 20:36:46,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:36:48,923 INFO [train.py:1046] (1/4) Epoch 40, batch 2600, loss[loss=0.1494, simple_loss=0.2373, pruned_loss=0.03069, over 24506.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03839, over 4706215.25 frames. ], batch size: 66, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:36:48,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 20:36:50,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 20:36:50,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:36:51,843 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. 
Duration: 0.896 2023-10-03 20:36:51,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 20:36:51,934 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 20:36:54,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:54,710 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 20:36:56,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 20:36:58,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.65 vs. limit=10.0 2023-10-03 20:36:59,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 20:37:01,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:37:02,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 20:37:04,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 20:37:05,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:37:05,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-10-03 20:37:08,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.63 vs. limit=15.0 2023-10-03 20:37:08,625 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 20:37:08,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 20:37:16,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:17,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:17,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:37:17,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 20:37:18,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:37:24,275 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 20:37:29,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:31,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:32,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. 
Duration: 0.88 2023-10-03 20:37:32,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:37:32,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:37:33,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 20:37:37,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:37:37,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:37:38,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:37:41,589 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 20:37:41,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:37:42,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:37:47,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:37:47,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:37:47,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-10-03 20:37:49,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:50,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:37:51,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:37:54,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 20:37:56,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:58,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:38:02,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 20:38:02,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:03,473 INFO [train.py:1046] (1/4) Epoch 40, batch 2650, loss[loss=0.1789, simple_loss=0.2501, pruned_loss=0.05381, over 23745.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2376, pruned_loss=0.03874, over 4718182.02 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:38:03,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:38:04,891 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. 
Duration: 0.768 2023-10-03 20:38:04,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:07,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:09,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:38:11,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:38:14,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:38:15,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 20:38:15,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:38:16,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:38:19,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 20:38:21,157 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 20:38:23,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:38:25,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. 
Duration: 0.87 2023-10-03 20:38:25,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:26,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 20:38:31,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:31,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:38:32,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:32,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:38:35,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 20:38:35,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 20:38:37,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1398960.0, ans=0.2 2023-10-03 20:38:38,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:38:41,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 20:38:41,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:38:43,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:38:43,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:38:45,164 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.003e+02 2.143e+02 2.550e+02 3.121e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 20:38:45,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:45,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:38:46,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:48,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:38:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:52,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:38:53,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:38:56,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 20:38:56,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:38:57,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:59,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:39:00,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:39:04,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:39:04,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:39:04,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1399093.3333333333, ans=0.0 2023-10-03 20:39:05,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 20:39:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:39:10,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:39:11,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1399093.3333333333, ans=0.2 2023-10-03 20:39:12,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:14,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:14,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:39:15,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:17,265 INFO [train.py:1046] (1/4) Epoch 40, batch 2700, loss[loss=0.1675, simple_loss=0.253, pruned_loss=0.04106, over 23943.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2383, pruned_loss=0.03912, over 4718695.80 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:39:18,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:39:18,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 20:39:21,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:39:23,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 20:39:24,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:39:26,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:26,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:39:26,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:39:27,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:39:27,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:39:27,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 20:39:28,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:39:30,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:39:31,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:39:31,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:36,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 20:39:36,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 20:39:37,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1399226.6666666667, ans=0.09899494936611666 2023-10-03 20:39:38,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:39:42,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:39:42,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:39:48,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:39:48,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:39:48,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:39:48,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:39:52,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:39:55,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 20:39:55,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:39:55,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:39:57,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:57,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:40:06,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:40:06,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:40:09,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1399360.0, ans=0.125 2023-10-03 20:40:10,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:40:10,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:13,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:40:15,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:40:17,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:40:18,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:19,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:40:20,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:40:20,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1399426.6666666667, ans=0.1 2023-10-03 20:40:22,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:40:22,867 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:40:24,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:40:24,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 20:40:24,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1399426.6666666667, ans=0.1 2023-10-03 20:40:26,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1399426.6666666667, ans=0.05 2023-10-03 20:40:28,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 20:40:30,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:30,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1399493.3333333333, ans=0.125 2023-10-03 20:40:31,498 INFO [train.py:1046] (1/4) Epoch 40, batch 2750, loss[loss=0.1559, simple_loss=0.2315, pruned_loss=0.04013, over 23492.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2378, pruned_loss=0.03888, over 4723681.42 frames. ], batch size: 134, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:40:31,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:40:31,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 20:40:32,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 20:40:33,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:36,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:40:38,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:38,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1399493.3333333333, ans=0.1 2023-10-03 20:40:38,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.43 vs. limit=15.0 2023-10-03 20:40:39,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:39,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:40:40,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:43,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:40:43,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:40:45,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:40:45,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:45,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-10-03 20:40:45,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:40:45,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:46,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1399560.0, ans=0.125 2023-10-03 20:40:50,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 20:40:52,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:40:52,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:53,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:40:53,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:40:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:56,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:40:56,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:40:56,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:41:01,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:41:01,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:41:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:41:03,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:41:05,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:41:13,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:41:14,587 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.978e+02 2.171e+02 2.705e+02 3.940e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-03 20:41:14,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:41:14,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:41:15,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1399693.3333333333, ans=0.0 2023-10-03 20:41:16,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1399693.3333333333, ans=0.0 2023-10-03 20:41:18,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:41:18,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:41:19,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:41:25,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:41:25,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:41:25,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 20:41:29,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 20:41:36,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:41:38,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 20:41:40,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 20:41:40,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:41:42,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:41:42,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 20:41:42,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:41:46,240 INFO [train.py:1046] (1/4) Epoch 40, batch 2800, loss[loss=0.1755, simple_loss=0.2574, pruned_loss=0.04678, over 23928.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2362, pruned_loss=0.0382, over 4716619.54 frames. ], batch size: 86, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:41:46,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 20:41:46,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:41:47,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:41:47,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 20:41:47,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:41:49,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:51,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:41:51,063 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 20:41:51,063 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 20:41:53,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:54,623 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.53 vs. limit=12.0 2023-10-03 20:41:55,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:41:55,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:41:58,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:42:01,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 20:42:03,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 20:42:05,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-10-03 20:42:05,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:07,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:42:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:07,744 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.45 vs. limit=15.0 2023-10-03 20:42:11,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:42:11,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:11,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:42:13,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:42:21,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:42:23,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:42:25,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:42:25,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:42:26,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:31,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:42:31,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 20:42:32,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:42:32,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:42:32,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:42:37,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:42:37,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:40,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1400026.6666666667, ans=0.125 2023-10-03 20:42:42,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 20:42:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:42:43,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:43,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:42:43,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:42:44,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.21 vs. limit=10.0 2023-10-03 20:42:44,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:42:46,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:46,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 20:42:46,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:42:47,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:42:47,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 20:42:49,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 20:42:50,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:50,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:42:52,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:42:53,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 20:42:59,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:43:00,986 INFO [train.py:1046] (1/4) Epoch 40, batch 2850, loss[loss=0.1542, simple_loss=0.2426, pruned_loss=0.03284, over 24130.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2353, pruned_loss=0.03816, over 4713987.89 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:43:01,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:43:01,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:43:02,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:43:04,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1400160.0, ans=0.2 2023-10-03 20:43:05,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:43:06,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:06,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:43:09,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:11,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:43:12,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1400160.0, ans=0.0 2023-10-03 20:43:13,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:43:13,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 20:43:19,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 20:43:19,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:20,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. 
Duration: 0.95 2023-10-03 20:43:21,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:23,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 20:43:23,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1400226.6666666667, ans=0.1 2023-10-03 20:43:24,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 20:43:25,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:37,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:39,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:43:39,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:43:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:43:40,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:43:40,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:43:42,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 20:43:44,038 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 1.957e+02 2.192e+02 2.497e+02 4.123e+02, threshold=4.384e+02, percent-clipped=0.0 2023-10-03 20:43:44,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 20:43:45,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:43:45,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:43:46,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:46,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:50,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:50,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:50,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:51,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 20:43:51,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1400360.0, ans=0.1 2023-10-03 20:43:52,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:43:54,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:55,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:56,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:43:59,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:44:02,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 20:44:02,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 20:44:05,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:44:05,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.73 vs. limit=15.0 2023-10-03 20:44:06,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:44:06,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 20:44:08,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:44:08,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:08,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:08,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:44:08,235 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 20:44:10,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 20:44:10,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:44:10,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:10,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1400426.6666666667, ans=10.0 2023-10-03 20:44:14,245 INFO [train.py:1046] (1/4) Epoch 40, batch 2900, loss[loss=0.1474, simple_loss=0.2311, pruned_loss=0.03186, over 24670.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2352, pruned_loss=0.03793, over 4719489.38 frames. 
], batch size: 65, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:44:16,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:44:16,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:16,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:44:16,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 20:44:16,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1400493.3333333333, ans=0.125 2023-10-03 20:44:20,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:44:20,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 20:44:22,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 20:44:23,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:44:23,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:44:24,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 20:44:26,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:44:29,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:44:30,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:44:32,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:44:32,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 20:44:32,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:44:33,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.01 vs. limit=15.0 2023-10-03 20:44:35,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:37,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 20:44:39,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 20:44:43,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:43,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. 
Duration: 0.48 2023-10-03 20:44:43,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:44:44,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1400626.6666666667, ans=0.125 2023-10-03 20:44:46,608 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.71 vs. limit=10.0 2023-10-03 20:44:47,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:44:47,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:44:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:44:50,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:53,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:55,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:44:56,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 20:44:56,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. 
Duration: 0.92 2023-10-03 20:44:57,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:45:01,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:45:01,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1400693.3333333333, ans=0.125 2023-10-03 20:45:04,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 20:45:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:45:10,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.78 vs. limit=22.5 2023-10-03 20:45:12,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:45:21,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:45:21,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:45:22,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1400760.0, ans=0.0 2023-10-03 20:45:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 20:45:27,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:45:27,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 20:45:28,537 INFO [train.py:1046] (1/4) Epoch 40, batch 2950, loss[loss=0.1497, simple_loss=0.2372, pruned_loss=0.03113, over 24446.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03776, over 4735229.59 frames. ], batch size: 66, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:45:28,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:45:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:45:34,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:45:35,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 20:45:35,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:45:35,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:37,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:45:38,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:45:40,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. 
Duration: 0.94 2023-10-03 20:45:40,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 20:45:42,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:45:42,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:45:48,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:45:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:45:52,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:45:53,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:45:55,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:45:55,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:45:56,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:57,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:57,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 20:45:59,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1400960.0, ans=0.0 2023-10-03 20:46:00,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 20:46:05,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1400960.0, ans=0.2 2023-10-03 20:46:06,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 20:46:06,344 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 20:46:07,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:46:09,063 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 20:46:10,282 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.938e+02 2.092e+02 2.435e+02 3.514e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 20:46:11,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 20:46:11,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:46:11,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:46:11,835 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. 
Duration: 0.896 2023-10-03 20:46:11,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:46:15,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 20:46:15,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:46:16,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:46:19,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:46:19,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:46:19,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 20:46:20,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:46:20,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 20:46:24,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:25,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 20:46:26,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 20:46:26,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:46:28,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 20:46:29,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:46:30,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1401093.3333333333, ans=0.07 2023-10-03 20:46:31,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:46:32,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:46:33,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.11 vs. limit=15.0 2023-10-03 20:46:34,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=22.5 2023-10-03 20:46:35,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:35,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-03 20:46:37,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:46:37,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:37,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:46:39,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:46:39,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:46:40,689 INFO [train.py:1046] (1/4) Epoch 40, batch 3000, loss[loss=0.154, simple_loss=0.2432, pruned_loss=0.03239, over 24647.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03837, over 4733464.98 frames. ], batch size: 73, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:46:40,690 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 20:46:52,707 INFO [train.py:1078] (1/4) Epoch 40, validation: loss=0.3553, simple_loss=0.2798, pruned_loss=0.2154, over 1125622.00 frames. 2023-10-03 20:46:52,707 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 20:46:52,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:46:54,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:54,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. 
Duration: 0.9 2023-10-03 20:46:55,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:59,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:46:59,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:47:02,435 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 20:47:02,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 20:47:05,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:47:06,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:47:06,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 20:47:06,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:47:13,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:47:16,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-10-03 20:47:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 20:47:26,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1401293.3333333333, ans=10.0 2023-10-03 20:47:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 20:47:29,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:47:32,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:47:32,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:47:32,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:47:33,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:47:33,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 20:47:36,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 20:47:37,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:47:37,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:47:41,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 20:47:41,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:47:42,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:42,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:47:45,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:47:46,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:47:46,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:47:48,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:47:50,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 20:47:50,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1401426.6666666667, ans=0.125 2023-10-03 20:47:51,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:47:51,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:47:51,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 20:47:52,510 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.74 vs. limit=10.0 2023-10-03 20:47:56,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:56,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:57,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 20:47:59,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 20:47:59,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:47:59,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 20:48:00,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:48:01,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 20:48:05,799 INFO [train.py:1046] (1/4) Epoch 40, batch 3050, loss[loss=0.1495, simple_loss=0.2335, pruned_loss=0.03274, over 24502.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.238, pruned_loss=0.03886, over 4734784.81 frames. ], batch size: 63, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:48:05,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 20:48:05,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 20:48:07,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 20:48:07,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 20:48:07,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:48:08,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:48:10,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:48:10,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:48:10,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:11,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:48:13,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. 
Duration: 0.81 2023-10-03 20:48:13,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1401493.3333333333, ans=0.035 2023-10-03 20:48:13,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1401493.3333333333, ans=0.1 2023-10-03 20:48:14,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:48:17,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:18,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:48:20,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:24,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 20:48:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 20:48:29,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 20:48:31,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:48:34,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:48:37,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:48:37,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:38,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:48:39,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:48:41,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:48:41,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:48:41,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:41,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:48:44,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:45,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:48:48,606 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.898e+02 2.071e+02 2.314e+02 3.328e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 20:48:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 20:48:48,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 20:48:50,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:50,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:48:53,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:48:53,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:48:55,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:48:55,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:01,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:49:01,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:06,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:06,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:49:06,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 20:49:08,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:49:08,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:49:09,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:49:09,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 20:49:10,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:49:11,586 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-10-03 20:49:12,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:12,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 20:49:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:20,218 INFO [train.py:1046] (1/4) Epoch 40, batch 3100, loss[loss=0.1565, simple_loss=0.226, pruned_loss=0.04347, over 23755.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2373, pruned_loss=0.03871, over 4735007.36 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:49:20,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 20:49:21,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:49:23,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:49:25,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 20:49:27,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 20:49:30,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 20:49:31,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:49:32,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1401826.6666666667, ans=0.09899494936611666 2023-10-03 20:49:33,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:49:33,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:36,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:49:39,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:44,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. 
Duration: 0.7 2023-10-03 20:49:45,724 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:49:50,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 20:49:50,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:49:50,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:49:50,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:49:51,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 20:49:51,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1401960.0, ans=0.125 2023-10-03 20:49:52,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:49:53,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 20:49:53,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:49:55,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:56,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-10-03 20:49:57,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:49:57,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1401960.0, ans=0.125 2023-10-03 20:50:00,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:50:01,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 20:50:03,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 20:50:04,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:04,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:50:06,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1402026.6666666667, ans=0.2 2023-10-03 20:50:07,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:07,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:07,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 20:50:07,881 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:50:08,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:50:08,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:50:10,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:50:10,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:50:12,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:12,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 20:50:16,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:50:19,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 20:50:21,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:50:21,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 20:50:22,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 20:50:22,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 20:50:27,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.01 vs. limit=15.0 2023-10-03 20:50:32,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 20:50:33,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1402160.0, ans=0.2 2023-10-03 20:50:34,337 INFO [train.py:1046] (1/4) Epoch 40, batch 3150, loss[loss=0.168, simple_loss=0.2502, pruned_loss=0.0429, over 24390.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2365, pruned_loss=0.03863, over 4724547.07 frames. ], batch size: 77, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:50:34,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:34,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:37,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:50:37,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:50:37,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. 
Duration: 0.96 2023-10-03 20:50:38,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:38,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:50:40,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 20:50:41,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:44,754 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 20:50:48,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 20:50:49,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:50:49,550 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 20:50:50,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 20:50:54,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 20:50:54,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 20:50:54,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. 
Duration: 0.81 2023-10-03 20:50:54,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:54,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:50:54,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1402226.6666666667, ans=6.0 2023-10-03 20:50:55,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:56,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 20:50:58,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:58,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:58,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1402226.6666666667, ans=0.125 2023-10-03 20:51:00,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:51:02,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:51:05,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. 
Duration: 0.96 2023-10-03 20:51:07,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:51:08,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:51:08,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1402293.3333333333, ans=0.0 2023-10-03 20:51:09,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:51:10,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.71 vs. limit=15.0 2023-10-03 20:51:11,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 20:51:11,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1402293.3333333333, ans=0.04949747468305833 2023-10-03 20:51:14,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 20:51:14,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:51:15,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 20:51:15,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-03 20:51:16,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:51:16,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:51:17,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:51:17,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:51:18,690 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.948e+02 2.114e+02 2.510e+02 3.900e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-03 20:51:18,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 20:51:18,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:51:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:20,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:51:22,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:51:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 20:51:22,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:51:23,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 20:51:25,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:25,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 20:51:25,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 20:51:25,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1402360.0, ans=0.125 2023-10-03 20:51:28,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:51:28,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:51:30,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 20:51:30,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 20:51:30,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:51:34,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:51:34,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:51:35,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:51:36,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1402426.6666666667, ans=0.2 2023-10-03 20:51:40,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1402426.6666666667, ans=0.1 2023-10-03 20:51:41,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:51:41,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:44,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 20:51:48,683 INFO [train.py:1046] (1/4) Epoch 40, batch 3200, loss[loss=0.1625, simple_loss=0.245, pruned_loss=0.04001, over 24048.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.235, pruned_loss=0.0383, over 4705215.78 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:51:48,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:51:48,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:51:49,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.65 vs. limit=12.0 2023-10-03 20:51:53,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:51:54,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:51:54,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 20:52:00,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:52:02,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:52:03,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1402560.0, ans=0.125 2023-10-03 20:52:05,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:52:14,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:52:23,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 20:52:24,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:52:27,330 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.61 vs. limit=15.0 2023-10-03 20:52:28,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 20:52:29,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 20:52:32,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:52:32,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:52:34,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:52:38,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 20:52:39,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 20:52:40,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.74 vs. limit=15.0 2023-10-03 20:52:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 20:52:45,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 20:52:45,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:52:49,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=15.0 2023-10-03 20:52:51,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:52:51,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1402760.0, ans=0.125 2023-10-03 20:52:52,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:52:52,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:52:52,995 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 20:52:52,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 20:52:53,290 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:52:56,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1402760.0, ans=0.0 2023-10-03 20:52:58,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:52:59,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 20:53:00,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 20:53:00,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 20:53:02,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. 
Duration: 0.9 2023-10-03 20:53:04,232 INFO [train.py:1046] (1/4) Epoch 40, batch 3250, loss[loss=0.1751, simple_loss=0.2616, pruned_loss=0.04428, over 24000.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2347, pruned_loss=0.03796, over 4713347.33 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:53:04,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:53:06,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:53:06,969 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 20:53:06,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:53:06,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:09,569 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 20:53:12,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:53:14,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1402826.6666666667, ans=0.125 2023-10-03 20:53:15,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:53:21,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 20:53:22,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 20:53:23,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:53:23,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:53:23,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:53:27,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:53:27,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:53:29,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1402893.3333333333, ans=0.2 2023-10-03 20:53:30,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:53:30,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:30,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:53:30,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:53:33,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:53:34,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:53:38,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:38,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:39,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:40,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:53:40,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:53:45,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 20:53:45,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:53:45,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:53:47,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 20:53:48,328 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.926e+02 2.163e+02 2.567e+02 5.244e+02, threshold=4.326e+02, percent-clipped=4.0 2023-10-03 20:53:48,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:53:55,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:53:58,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1403026.6666666667, ans=0.0 2023-10-03 20:54:01,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:54:01,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:01,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 20:54:01,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:54:01,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:54:01,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:05,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 20:54:05,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. 
Duration: 0.94 2023-10-03 20:54:05,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:54:07,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:09,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:54:09,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:54:10,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:54:10,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1403093.3333333333, ans=0.1 2023-10-03 20:54:14,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:54:14,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:54:15,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 20:54:15,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:17,218 INFO [train.py:1046] (1/4) Epoch 40, batch 3300, loss[loss=0.1615, simple_loss=0.2429, pruned_loss=0.04004, over 23343.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03847, over 4721492.97 frames. 
], batch size: 93, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:54:18,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:54:18,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 20:54:21,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:54:21,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 20:54:22,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 20:54:24,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 20:54:24,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:27,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:54:29,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:54:29,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:31,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:54:31,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 20:54:32,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1403226.6666666667, ans=0.125 2023-10-03 20:54:33,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:34,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:54:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 20:54:40,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:54:40,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:41,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:43,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 20:54:43,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:54:43,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1403226.6666666667, ans=0.0 2023-10-03 20:54:44,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:54:44,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 20:54:44,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:54:46,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 20:54:50,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:54:52,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:52,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 20:54:53,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 20:54:53,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:54,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:54:56,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 20:54:57,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 20:54:59,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 20:55:02,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 20:55:03,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:55:05,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1403360.0, ans=0.125 2023-10-03 20:55:05,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1403360.0, ans=0.0 2023-10-03 20:55:06,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:55:07,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:55:11,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:12,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:55:12,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:55:12,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:55:13,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:55:13,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:55:15,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:55:16,670 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 20:55:18,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 20:55:19,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:55:20,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:55:20,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:22,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:55:22,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:22,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:55:24,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:24,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:55:25,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 20:55:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:55:29,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 20:55:29,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:31,102 INFO [train.py:1046] (1/4) Epoch 40, batch 3350, loss[loss=0.1781, simple_loss=0.2549, pruned_loss=0.05068, over 23901.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03865, over 4728849.89 frames. ], batch size: 195, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:55:31,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:32,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:55:33,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:55:36,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:37,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:37,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:40,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 20:55:43,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:43,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:55:46,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:46,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1403560.0, ans=0.05 2023-10-03 20:55:47,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:55:49,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:50,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:55:51,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 20:55:53,724 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 20:55:53,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:56,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 20:55:56,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. 
Duration: 0.73 2023-10-03 20:55:56,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:55:56,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:55:56,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1403560.0, ans=0.1 2023-10-03 20:55:58,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:55:58,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 20:55:59,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:59,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:56:00,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:02,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:02,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:04,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:56:09,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 20:56:10,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:10,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:14,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:56:16,073 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.915e+02 2.091e+02 2.361e+02 5.355e+02, threshold=4.181e+02, percent-clipped=1.0 2023-10-03 20:56:16,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:17,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:19,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:20,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 20:56:21,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:56:21,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 20:56:21,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 20:56:21,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 20:56:23,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:25,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:32,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:33,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 20:56:33,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:56:35,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:56:37,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:56:37,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1403760.0, ans=0.09899494936611666 2023-10-03 20:56:41,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:56:44,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 20:56:46,171 INFO [train.py:1046] (1/4) Epoch 40, batch 3400, loss[loss=0.162, simple_loss=0.2363, pruned_loss=0.04386, over 23946.00 frames. 
], tot_loss[loss=0.1576, simple_loss=0.2376, pruned_loss=0.03875, over 4728927.24 frames. ], batch size: 180, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:56:46,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:56:46,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:56:47,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:47,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 20:56:49,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:49,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 20:56:49,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1403826.6666666667, ans=0.125 2023-10-03 20:56:50,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:56:50,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:56:52,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:56:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 20:56:53,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 20:56:57,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 20:56:57,970 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 20:56:57,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:02,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:57:02,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:57:02,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:03,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:57:09,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:57:11,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 20:57:14,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:57:17,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 20:57:17,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:57:17,469 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:57:18,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:57:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:57:26,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 20:57:31,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:33,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:33,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 20:57:34,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:57:34,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:57:36,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:57:36,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 20:57:39,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:39,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1404026.6666666667, ans=0.125 2023-10-03 20:57:43,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:57:43,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:57:47,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1404093.3333333333, ans=0.0 2023-10-03 20:57:48,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:57:49,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 20:57:55,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:57:57,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1404093.3333333333, ans=0.2 2023-10-03 20:57:59,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.66 vs. limit=12.0 2023-10-03 20:57:59,855 INFO [train.py:1046] (1/4) Epoch 40, batch 3450, loss[loss=0.1393, simple_loss=0.2046, pruned_loss=0.03702, over 22705.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2384, pruned_loss=0.0392, over 4713494.31 frames. 
], batch size: 322, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:57:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 20:58:04,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 20:58:05,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:58:07,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:58:07,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 20:58:07,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:58:12,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:58:15,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.07 vs. limit=22.5 2023-10-03 20:58:16,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:58:18,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:58:18,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 20:58:18,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:20,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:27,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 20:58:29,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 20:58:31,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:58:31,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:58:32,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:58:38,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 20:58:38,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:58:42,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:58:42,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:58:43,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 20:58:44,697 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.925e+02 2.071e+02 2.347e+02 3.387e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-03 20:58:46,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:58:47,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 20:58:47,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:58:47,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:50,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:58:53,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 20:58:58,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:58:58,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-10-03 20:59:03,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:59:04,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 20:59:08,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:13,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:13,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:59:13,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:59:14,339 INFO [train.py:1046] (1/4) Epoch 40, batch 3500, loss[loss=0.1699, simple_loss=0.2599, pruned_loss=0.03992, over 23968.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2371, pruned_loss=0.03846, over 4721506.61 frames. ], batch size: 86, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:59:14,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:59:18,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:20,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:59:21,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 20:59:23,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 20:59:24,953 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:59:26,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 20:59:28,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:29,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 20:59:30,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1404560.0, ans=0.09899494936611666 2023-10-03 20:59:32,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:59:33,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:59:33,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:59:33,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:59:33,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:59:34,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:34,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 20:59:35,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 20:59:38,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:40,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:59:41,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:59:44,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:44,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 20:59:45,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:59:50,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:59:50,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1404626.6666666667, ans=0.125 2023-10-03 20:59:51,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:59:53,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:54,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 20:59:56,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:59:57,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 20:59:58,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 20:59:58,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 20:59:58,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1404693.3333333333, ans=0.125 2023-10-03 20:59:59,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:00:00,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.37 vs. limit=15.0 2023-10-03 21:00:00,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:00:02,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 21:00:03,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1404693.3333333333, ans=0.125 2023-10-03 21:00:04,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:00:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:00:08,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:00:09,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1404693.3333333333, ans=0.1 2023-10-03 21:00:10,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.84 vs. limit=22.5 2023-10-03 21:00:11,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 21:00:11,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 21:00:11,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:00:13,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:00:15,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 21:00:16,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:19,224 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.95 vs. limit=22.5 2023-10-03 21:00:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 21:00:19,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:00:21,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:00:23,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 21:00:24,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 21:00:27,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:29,206 INFO [train.py:1046] (1/4) Epoch 40, batch 3550, loss[loss=0.161, simple_loss=0.2318, pruned_loss=0.04505, over 23812.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2354, pruned_loss=0.03822, over 4714285.85 frames. ], batch size: 212, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 21:00:29,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:00:29,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:00:30,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:33,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:00:38,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1404826.6666666667, ans=0.125 2023-10-03 21:00:42,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:42,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 21:00:42,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1404893.3333333333, ans=0.125 2023-10-03 21:00:45,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:00:46,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:00:46,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1404893.3333333333, ans=0.0 2023-10-03 21:00:47,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:00:49,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 21:00:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:00:53,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:00:53,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:00:53,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:53,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:00:55,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:00:59,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:00:59,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:01:01,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:01:01,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:01:01,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:01:01,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. 
Duration: 0.38 2023-10-03 21:01:01,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:03,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:04,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 21:01:09,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:10,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:01:10,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:13,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 21:01:13,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:01:14,494 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.948e+02 2.132e+02 2.372e+02 3.710e+02, threshold=4.263e+02, percent-clipped=0.0 2023-10-03 21:01:14,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 21:01:15,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.50 vs. 
limit=15.0 2023-10-03 21:01:15,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:01:17,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:01:18,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:01:22,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 21:01:22,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:01:27,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:01:29,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 21:01:29,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:30,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:32,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 21:01:38,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 21:01:38,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 21:01:39,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:01:41,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:42,644 INFO [train.py:1046] (1/4) Epoch 40, batch 3600, loss[loss=0.1651, simple_loss=0.2441, pruned_loss=0.04299, over 23795.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2349, pruned_loss=0.03811, over 4701985.29 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:01:42,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:44,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:01:46,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:01:48,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:48,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:01:49,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:01:49,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:49,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. 
Duration: 0.89 2023-10-03 21:01:49,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1405160.0, ans=0.125 2023-10-03 21:01:54,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:01:54,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:57,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:01:59,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:01:59,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:02:00,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-10-03 21:02:01,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:02:01,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 21:02:02,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:02:05,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:02:05,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 21:02:06,656 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.90 vs. limit=10.0 2023-10-03 21:02:07,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:10,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:02:10,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:02:13,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 21:02:20,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:02:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:02:23,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 21:02:26,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1405360.0, ans=0.2 2023-10-03 21:02:27,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:02:31,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 21:02:32,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1405360.0, ans=0.0 2023-10-03 21:02:34,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:42,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:02:42,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:02:42,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 21:02:43,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 21:02:44,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 21:02:46,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:02:47,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:02:47,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 21:02:49,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:02:49,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 21:02:49,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:02:50,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 21:02:51,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 21:02:56,512 INFO [train.py:1046] (1/4) Epoch 40, batch 3650, loss[loss=0.1548, simple_loss=0.2472, pruned_loss=0.0312, over 24310.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2362, pruned_loss=0.03842, over 4702826.51 frames. ], batch size: 74, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:02:56,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 21:02:56,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1405493.3333333333, ans=0.0 2023-10-03 21:02:57,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=15.0 2023-10-03 21:03:01,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 21:03:01,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:03:05,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. 
Duration: 0.61 2023-10-03 21:03:07,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 21:03:13,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:03:13,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:03:13,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:03:13,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1405560.0, ans=0.125 2023-10-03 21:03:16,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:03:16,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:03:17,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 21:03:17,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:03:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:03:20,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 21:03:20,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 21:03:22,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:03:22,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:23,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:03:24,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 21:03:26,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 21:03:27,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:03:29,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 21:03:30,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:03:30,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:03:35,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:03:37,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:37,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 21:03:39,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:03:41,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:03:42,371 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.999e+02 2.151e+02 2.370e+02 3.014e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-03 21:03:43,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.35 vs. limit=8.0 2023-10-03 21:03:44,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:03:45,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:03:46,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1405693.3333333333, ans=0.125 2023-10-03 21:03:47,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:03:47,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:03:48,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:03:49,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:03:49,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:03:55,227 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 21:03:59,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:03:59,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:01,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:04:01,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:03,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:04:03,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:04,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 21:04:04,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:04:10,592 INFO [train.py:1046] (1/4) Epoch 40, batch 3700, loss[loss=0.1509, simple_loss=0.2356, pruned_loss=0.03304, over 23374.00 frames. 
], tot_loss[loss=0.1571, simple_loss=0.2369, pruned_loss=0.03867, over 4714238.79 frames. ], batch size: 93, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:04:10,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:04:10,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:04:12,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:12,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 21:04:12,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:13,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:04:15,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:04:16,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:04:18,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1405826.6666666667, ans=0.125 2023-10-03 21:04:19,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:04:20,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:04:21,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:04:22,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:22,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:04:25,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:26,982 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 21:04:34,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:04:35,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:04:35,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:04:37,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 21:04:37,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:04:39,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:40,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-10-03 21:04:40,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1405960.0, ans=0.1 2023-10-03 21:04:43,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:45,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:04:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:49,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:04:50,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:04:54,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.94 vs. limit=15.0 2023-10-03 21:04:54,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:04:54,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 21:04:56,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:56,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. 
Duration: 0.98 2023-10-03 21:05:02,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:05:02,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:05:05,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:06,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 21:05:08,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:05:08,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:05:08,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:05:08,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:10,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:05:12,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 21:05:13,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 21:05:14,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 21:05:14,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:16,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:05:17,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:05:20,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:05:23,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:05:23,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:05:24,767 INFO [train.py:1046] (1/4) Epoch 40, batch 3750, loss[loss=0.1494, simple_loss=0.2423, pruned_loss=0.02827, over 24352.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2379, pruned_loss=0.03902, over 4715197.85 frames. ], batch size: 77, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:05:24,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 21:05:27,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 21:05:28,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:05:29,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. 
Duration: 0.78 2023-10-03 21:05:31,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:05:32,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:32,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:33,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:05:38,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:05:41,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:05:41,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:05:44,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:46,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:05:46,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 21:05:48,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:05:49,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 21:05:50,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:05:53,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 21:05:56,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 21:05:59,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:05:59,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:06:01,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:06,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:07,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 21:06:11,032 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.980e+02 2.164e+02 2.553e+02 4.062e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 21:06:12,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 21:06:15,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 21:06:18,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:06:21,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:06:25,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:06:27,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:06:27,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1406426.6666666667, ans=0.2 2023-10-03 21:06:28,017 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.35 vs. limit=15.0 2023-10-03 21:06:29,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:06:31,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:06:32,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1406426.6666666667, ans=0.2 2023-10-03 21:06:34,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:06:38,475 INFO [train.py:1046] (1/4) Epoch 40, batch 3800, loss[loss=0.1551, simple_loss=0.2277, pruned_loss=0.04125, over 23646.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2379, pruned_loss=0.03876, over 4717374.57 frames. 
], batch size: 149, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:06:40,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:06:42,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1406493.3333333333, ans=0.95 2023-10-03 21:06:43,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1406493.3333333333, ans=0.1 2023-10-03 21:06:44,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:46,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:06:47,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 21:06:49,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:52,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:06:52,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:06:55,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 21:06:55,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:06:56,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:06:58,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:59,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:06:59,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:06:59,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 21:07:02,923 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.86 vs. limit=10.0 2023-10-03 21:07:04,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:07:05,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:07:06,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:07:09,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:07:09,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 21:07:10,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:07:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:07:11,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1406626.6666666667, ans=0.2 2023-10-03 21:07:11,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.16 vs. limit=15.0 2023-10-03 21:07:12,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:14,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:07:19,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 21:07:19,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 21:07:20,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:07:27,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:07:30,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1406693.3333333333, ans=0.2 2023-10-03 21:07:32,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:07:33,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 21:07:34,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 21:07:34,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:07:36,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:07:37,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:39,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1406760.0, ans=0.125 2023-10-03 21:07:40,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 21:07:43,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 21:07:45,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 21:07:45,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:07:45,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:07:51,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:07:51,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:07:52,909 INFO [train.py:1046] (1/4) Epoch 40, batch 3850, loss[loss=0.1539, simple_loss=0.2411, pruned_loss=0.03335, over 24691.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.03829, over 4722057.60 frames. ], batch size: 73, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:07:56,800 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.25 vs. limit=15.0 2023-10-03 21:07:57,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:07:58,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 21:07:58,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:08:00,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:08:03,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:08:05,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 21:08:07,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:08:08,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 21:08:16,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:17,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:08:19,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:08:19,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:08:24,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:24,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:08:24,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1406960.0, ans=0.0 2023-10-03 21:08:25,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:08:25,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:08:27,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:08:28,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:08:30,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:30,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:08:31,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 21:08:31,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 21:08:31,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:08:32,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:34,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:36,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:36,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 21:08:39,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. 
Duration: 0.67 2023-10-03 21:08:40,421 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.946e+02 2.190e+02 2.397e+02 4.110e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-03 21:08:40,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:41,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 21:08:43,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:08:49,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:50,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=15.0 2023-10-03 21:08:54,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:54,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 21:08:57,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 21:09:00,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 21:09:01,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:04,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:09:04,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:09:05,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:05,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:05,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:09:05,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 21:09:05,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:09:07,609 INFO [train.py:1046] (1/4) Epoch 40, batch 3900, loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03787, over 24324.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.03832, over 4711093.53 frames. ], batch size: 61, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 21:09:09,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 21:09:09,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:09:09,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:09:11,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:13,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:09:14,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:14,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:09:14,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:09:14,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 21:09:14,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:19,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:09:21,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:09:21,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 21:09:22,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:09:23,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:09:23,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:24,885 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.71 vs. limit=15.0 2023-10-03 21:09:25,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:09:27,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 21:09:27,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:09:28,058 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.22 vs. limit=15.0 2023-10-03 21:09:29,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 21:09:30,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:31,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 21:09:31,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. 
Duration: 0.78 2023-10-03 21:09:35,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:09:37,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:09:38,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:09:38,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:09:41,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:09:43,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:09:45,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:09:45,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:09:47,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:09:53,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:09:53,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:09:59,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 21:10:01,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:10:12,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:10:14,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:10:14,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 21:10:15,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 21:10:15,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:10:15,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1407426.6666666667, ans=0.125 2023-10-03 21:10:17,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 21:10:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:10:20,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 21:10:21,450 INFO [train.py:1046] (1/4) Epoch 40, batch 3950, loss[loss=0.1621, simple_loss=0.2169, pruned_loss=0.05371, over 19815.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.235, pruned_loss=0.03832, over 4719248.61 frames. 
], batch size: 388, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:10:25,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1407493.3333333333, ans=0.0 2023-10-03 21:10:26,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:10:26,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 21:10:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:10:29,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:10:32,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:10:36,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 21:10:37,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:10:37,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 21:10:38,054 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 21:10:39,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:10:41,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:10:42,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:10:42,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:10:45,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 21:10:48,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:10:48,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:10:48,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:10:50,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:10:50,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1407626.6666666667, ans=0.0 2023-10-03 21:10:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:11:03,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:11:03,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 21:11:09,378 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.960e+02 2.144e+02 2.428e+02 3.730e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-03 21:11:09,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 21:11:14,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 21:11:14,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 21:11:14,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:11:15,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:11:23,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:11:23,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:11:23,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1407760.0, ans=0.0 2023-10-03 21:11:24,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:11:24,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:11:24,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-10-03 21:11:29,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:11:30,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:11:35,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 21:11:36,502 INFO [train.py:1046] (1/4) Epoch 40, batch 4000, loss[loss=0.1295, simple_loss=0.2051, pruned_loss=0.02697, over 24473.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2356, pruned_loss=0.0383, over 4721613.78 frames. ], batch size: 58, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:11:42,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:48,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:51,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:11:51,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:11:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:53,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 21:11:54,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 21:11:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 21:11:54,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:11:54,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 21:11:57,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:02,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:12:02,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:12:02,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:12:02,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:12:02,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:12:03,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:12:06,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 21:12:07,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 21:12:07,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:09,274 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 21:12:11,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:12:11,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:12:15,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1407960.0, ans=0.125 2023-10-03 21:12:17,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 21:12:19,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:12:19,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1408026.6666666667, ans=0.1 2023-10-03 21:12:20,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:12:22,000 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 21:12:23,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:12:25,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-10-03 21:12:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:12:25,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:26,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:12:28,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:12:28,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:12:28,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:12:29,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 21:12:29,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:31,609 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 21:12:37,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:12:40,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 21:12:44,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 21:12:44,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:46,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:12:47,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:12:50,430 INFO [train.py:1046] (1/4) Epoch 40, batch 4050, loss[loss=0.1559, simple_loss=0.2356, pruned_loss=0.03817, over 24629.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2355, pruned_loss=0.03817, over 4729553.07 frames. ], batch size: 65, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:12:50,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:53,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:12:54,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 21:12:57,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:12:57,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:12:59,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:13:00,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 21:13:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:13:05,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:13:06,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:13:08,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 21:13:09,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:13:09,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:13:11,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1408226.6666666667, ans=0.2 2023-10-03 21:13:14,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:13:15,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:13:17,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 21:13:20,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 21:13:21,688 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. 
Duration: 0.64 2023-10-03 21:13:23,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:13:30,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 21:13:30,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:13:33,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:13:36,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:13:36,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:13:36,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:13:37,894 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.894e+02 2.100e+02 2.406e+02 4.274e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-03 21:13:41,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:13:41,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1408360.0, ans=0.125 2023-10-03 21:13:43,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 21:13:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 21:13:45,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:13:46,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 21:13:50,591 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=15.0 2023-10-03 21:13:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:13:58,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 21:13:58,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:13:58,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:14:01,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 21:14:01,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 21:14:01,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:14:03,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1408493.3333333333, ans=0.05 2023-10-03 21:14:04,141 INFO [train.py:1046] (1/4) Epoch 40, batch 4100, loss[loss=0.1509, simple_loss=0.2329, pruned_loss=0.03445, over 24349.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2369, pruned_loss=0.03847, over 4729093.64 frames. ], batch size: 61, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:14:04,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:14:05,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:05,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:14:05,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1408493.3333333333, ans=0.2 2023-10-03 21:14:12,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1408493.3333333333, ans=0.125 2023-10-03 21:14:13,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 21:14:14,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 21:14:14,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 21:14:16,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-10-03 21:14:16,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:18,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:18,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:18,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:14:19,654 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 21:14:22,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:14:23,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:14:23,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:25,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:14:25,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.42 vs. limit=15.0 2023-10-03 21:14:27,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:14:29,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 21:14:30,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:14:31,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 21:14:32,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:32,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:14:32,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:14:32,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:14:32,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 21:14:33,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1408626.6666666667, ans=0.0 2023-10-03 21:14:35,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:14:37,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 21:14:40,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:14:41,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 21:14:41,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 21:14:44,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:14:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:14:44,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:14:47,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 21:14:49,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:14:49,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:14:52,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 21:14:52,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:53,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:14:55,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1408693.3333333333, ans=0.0 2023-10-03 21:14:55,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.17 vs. 
limit=15.0 2023-10-03 21:14:56,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:14:59,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:02,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:15:02,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:15:10,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:10,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:15:10,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1408760.0, ans=0.125 2023-10-03 21:15:13,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:15:16,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:15:19,456 INFO [train.py:1046] (1/4) Epoch 40, batch 4150, loss[loss=0.1729, simple_loss=0.2396, pruned_loss=0.05311, over 23748.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03914, over 4716782.41 frames. ], batch size: 179, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:15:19,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 21:15:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:15:20,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:15:20,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:15:21,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1408826.6666666667, ans=0.125 2023-10-03 21:15:22,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 21:15:23,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:23,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 21:15:25,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 21:15:25,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 21:15:26,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:29,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.98 vs. limit=15.0 2023-10-03 21:15:32,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 21:15:32,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:35,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:15:35,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:15:36,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:15:38,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:15:38,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:15:40,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:15:45,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:50,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:15:51,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 21:15:53,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 21:15:53,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 21:15:54,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 21:15:54,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:15:54,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:15:54,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1408960.0, ans=0.125 2023-10-03 21:15:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:15:59,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:16:01,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 21:16:02,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1409026.6666666667, ans=0.125 2023-10-03 21:16:04,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:16:06,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:07,373 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.015e+02 2.186e+02 2.538e+02 3.737e+02, threshold=4.372e+02, percent-clipped=0.0 2023-10-03 21:16:07,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. 
Duration: 0.76 2023-10-03 21:16:07,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:16:09,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 21:16:10,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:16:12,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:16:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:16,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 21:16:16,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:16,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:16:16,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:16:20,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 21:16:20,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:20,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 21:16:20,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:16:20,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1409093.3333333333, ans=0.0 2023-10-03 21:16:21,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 21:16:21,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:16:21,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:16:23,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:16:24,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 21:16:24,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:16:26,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1409093.3333333333, ans=0.0 2023-10-03 21:16:30,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:16:31,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. 
Duration: 0.94 2023-10-03 21:16:33,344 INFO [train.py:1046] (1/4) Epoch 40, batch 4200, loss[loss=0.1716, simple_loss=0.2536, pruned_loss=0.04479, over 24347.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03879, over 4715614.92 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:16:33,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:16:36,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:16:37,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:16:38,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:16:38,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:16:40,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 21:16:42,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.43 vs. limit=6.0 2023-10-03 21:16:44,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 21:16:44,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:45,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 21:16:48,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:16:48,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1409226.6666666667, ans=0.125 2023-10-03 21:16:51,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:16:51,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff3.min_abs, batch_count=1409226.6666666667, ans=0.2 2023-10-03 21:16:54,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:16:54,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:55,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 21:16:55,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:56,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:56,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:16:56,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 21:16:58,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:17:01,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 21:17:02,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:17:03,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1409293.3333333333, ans=0.125 2023-10-03 21:17:05,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:17:07,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:17:09,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:17:11,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:17:13,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:17:13,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 21:17:13,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:17:15,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 21:17:20,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:17:22,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:17:26,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:17:29,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 21:17:31,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:17:35,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:17:37,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:17:38,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 21:17:44,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:17:48,467 INFO [train.py:1046] (1/4) Epoch 40, batch 4250, loss[loss=0.1372, simple_loss=0.1876, pruned_loss=0.04336, over 19368.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2355, pruned_loss=0.03833, over 4713951.13 frames. ], batch size: 388, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:17:50,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 21:17:50,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:17:51,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:17:52,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1409493.3333333333, ans=0.125 2023-10-03 21:17:56,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:17:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 21:17:56,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:17:59,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:02,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:18:07,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:07,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:08,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:18:08,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:18:11,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:11,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:13,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:15,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:18:15,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:16,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 21:18:17,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1409626.6666666667, ans=0.125 2023-10-03 21:18:20,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1409626.6666666667, ans=0.1 2023-10-03 21:18:21,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 21:18:21,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:22,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 21:18:22,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:24,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:18:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:24,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:25,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1409626.6666666667, ans=0.125 2023-10-03 21:18:28,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1409626.6666666667, ans=0.0 2023-10-03 21:18:29,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:18:29,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:18:34,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:18:35,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:18:35,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1409693.3333333333, ans=0.2 2023-10-03 21:18:36,953 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.902e+02 2.076e+02 2.432e+02 4.125e+02, threshold=4.152e+02, percent-clipped=0.0 2023-10-03 21:18:37,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 21:18:37,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:18:38,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 21:18:39,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:18:41,390 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:18:41,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1409693.3333333333, ans=0.0 2023-10-03 21:18:41,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.78 vs. limit=10.0 2023-10-03 21:18:42,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:18:43,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:43,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:18:47,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 21:18:48,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:18:49,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:18:53,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:55,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:56,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:18:57,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:18:57,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:19:01,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:19:01,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:19:01,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 21:19:02,326 INFO [train.py:1046] (1/4) Epoch 40, batch 4300, loss[loss=0.1626, simple_loss=0.2366, pruned_loss=0.04435, over 23748.00 frames. 
], tot_loss[loss=0.1553, simple_loss=0.2346, pruned_loss=0.03803, over 4705615.21 frames. ], batch size: 212, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:19:02,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:19:05,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1409826.6666666667, ans=0.125 2023-10-03 21:19:06,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:19:06,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:19:09,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:19:18,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:19:18,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 21:19:20,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:19:21,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:19:21,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:19:23,417 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-10-03 21:19:25,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:19:26,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:19:30,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 21:19:32,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:19:32,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 21:19:33,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:19:35,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:19:37,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:19:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:19:40,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:19:40,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 21:19:40,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1409960.0, ans=0.0 2023-10-03 21:19:41,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:19:41,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 21:19:43,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 21:19:46,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:19:49,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:19:49,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:19:49,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:19:49,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:19:49,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 21:19:49,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 21:19:49,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-10-03 21:19:51,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:19:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 21:19:52,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 21:19:54,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1410026.6666666667, ans=0.0 2023-10-03 21:19:56,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:19:58,537 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 21:20:00,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:20:01,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:01,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:20:03,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 21:20:03,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:20:03,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:20:04,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:20:04,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:20:05,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:20:07,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:20:08,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:10,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:10,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:20:15,528 INFO [train.py:1046] (1/4) Epoch 40, batch 4350, loss[loss=0.1795, simple_loss=0.2501, pruned_loss=0.05452, over 23765.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2353, pruned_loss=0.03864, over 4699576.34 frames. ], batch size: 212, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:20:17,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 21:20:17,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:20:21,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:20:23,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:27,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:20:27,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:20:33,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:20:36,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:38,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:20:38,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:20:41,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:20:43,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:20:45,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:20:50,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 21:20:50,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:20:52,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:56,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:58,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 21:21:02,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:02,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:21:04,098 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.055e+02 2.301e+02 2.803e+02 4.605e+02, threshold=4.602e+02, percent-clipped=1.0 2023-10-03 21:21:08,378 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 21:21:08,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:09,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:21:11,090 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 21:21:11,865 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.88 vs. limit=15.0 2023-10-03 21:21:13,778 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-10-03 21:21:13,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:21:13,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:15,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:21:15,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:16,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:21:16,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:21:19,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 21:21:20,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:20,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 21:21:22,483 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-10-03 21:21:22,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 21:21:22,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 21:21:25,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:21:25,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:21:25,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:26,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:21:27,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 21:21:28,572 INFO [train.py:1046] (1/4) Epoch 40, batch 4400, loss[loss=0.1575, simple_loss=0.2396, pruned_loss=0.03768, over 23270.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2363, pruned_loss=0.03868, over 4703837.94 frames. ], batch size: 105, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:21:28,739 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 21:21:30,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:33,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:21:33,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:35,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:38,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 21:21:38,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 21:21:38,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 21:21:38,112 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 21:21:39,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:21:39,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:21:40,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 21:21:43,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:45,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:45,032 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. 
Duration: 0.896 2023-10-03 21:21:47,174 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.48 vs. limit=10.0 2023-10-03 21:21:47,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:47,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 21:21:47,797 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 21:21:49,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 21:21:51,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 21:21:51,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 21:21:51,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:53,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:54,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:54,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:21:55,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. 
Duration: 0.71 2023-10-03 21:21:57,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 21:21:57,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:59,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:21:59,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:22:02,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:03,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:22:03,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 21:22:03,679 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 21:22:06,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:08,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1410626.6666666667, ans=0.125 2023-10-03 21:22:12,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:22:15,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. 
Duration: 0.78 2023-10-03 21:22:19,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:22:23,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:22:25,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:22:25,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 21:22:25,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:22:25,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:22:25,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:22:27,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:22:29,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.00 vs. limit=10.0 2023-10-03 21:22:30,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 21:22:31,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1410760.0, ans=0.0 2023-10-03 21:22:32,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-10-03 21:22:33,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1410760.0, ans=0.125 2023-10-03 21:22:35,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 21:22:35,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:22:35,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 21:22:36,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:22:37,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1410760.0, ans=0.95 2023-10-03 21:22:39,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:22:40,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 21:22:43,745 INFO [train.py:1046] (1/4) Epoch 40, batch 4450, loss[loss=0.1957, simple_loss=0.268, pruned_loss=0.06171, over 19783.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2375, pruned_loss=0.03889, over 4712561.72 frames. ], batch size: 388, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:22:43,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:22:46,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:22:46,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:22:49,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1410826.6666666667, ans=0.0 2023-10-03 21:22:52,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:22:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:22:57,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:58,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:23:00,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:23:00,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:23:03,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 21:23:03,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:23:04,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:04,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:23:04,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:23:05,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.41 vs. limit=22.5 2023-10-03 21:23:07,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:23:10,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:11,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:13,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:23:13,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:23:14,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:23:17,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 21:23:18,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 21:23:18,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. 
Duration: 0.86 2023-10-03 21:23:18,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:23:22,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:23:23,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 21:23:24,347 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.51 vs. limit=22.5 2023-10-03 21:23:26,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:23:27,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.62 vs. limit=10.0 2023-10-03 21:23:31,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:32,400 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.956e+02 2.159e+02 2.589e+02 3.505e+02, threshold=4.319e+02, percent-clipped=0.0 2023-10-03 21:23:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 21:23:32,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:32,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 21:23:32,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:23:32,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:23:33,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:38,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:23:39,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 21:23:41,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:23:42,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:23:43,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:23:46,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:47,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:23:50,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:23:52,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. 
Duration: 0.85 2023-10-03 21:23:53,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:23:57,170 INFO [train.py:1046] (1/4) Epoch 40, batch 4500, loss[loss=0.1474, simple_loss=0.2273, pruned_loss=0.03373, over 24343.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2385, pruned_loss=0.03925, over 4707391.07 frames. ], batch size: 56, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:23:59,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:24:01,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 21:24:01,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 21:24:02,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:24:03,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.11 vs. limit=6.0 2023-10-03 21:24:06,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:24:07,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:24:07,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:24:09,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 21:24:09,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:10,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:16,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1411226.6666666667, ans=0.1 2023-10-03 21:24:19,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1411226.6666666667, ans=0.125 2023-10-03 21:24:20,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:24:21,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:24:23,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:24:25,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:24:27,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:24:27,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1411293.3333333333, ans=0.07 2023-10-03 21:24:32,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-03 21:24:36,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:24:37,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1411293.3333333333, ans=0.125 2023-10-03 21:24:40,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:24:40,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1411360.0, ans=0.0 2023-10-03 21:24:43,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:24:43,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 21:24:43,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:43,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:24:46,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:24:47,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:24:50,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:50,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. 
Duration: 0.69 2023-10-03 21:24:50,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:24:50,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:54,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:24:54,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:24:58,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:02,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:25:02,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:25:03,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 21:25:05,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 21:25:05,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 21:25:08,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 21:25:11,776 INFO [train.py:1046] (1/4) Epoch 40, batch 4550, loss[loss=0.1644, simple_loss=0.2522, pruned_loss=0.03833, over 24022.00 frames. 
], tot_loss[loss=0.1576, simple_loss=0.2372, pruned_loss=0.03901, over 4712025.62 frames. ], batch size: 80, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:25:11,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 21:25:13,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:25:17,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:25:17,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:25:18,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1411493.3333333333, ans=0.0 2023-10-03 21:25:20,369 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:25:21,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:25:24,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:25:26,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:25:27,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:25:27,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 21:25:27,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:29,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:25:30,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:25:31,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1411560.0, ans=0.0 2023-10-03 21:25:33,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:25:36,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 21:25:36,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 21:25:38,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:25:39,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 21:25:44,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 21:25:44,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:25:46,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. 
Duration: 0.74 2023-10-03 21:25:48,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:25:52,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:52,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:53,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:25:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 21:25:57,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:25:59,968 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.936e+02 2.088e+02 2.339e+02 3.946e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 21:26:00,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:00,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:26:01,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:26:03,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 21:26:03,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-10-03 21:26:03,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:26:04,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 21:26:06,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 21:26:06,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:26:08,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:08,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:26:09,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:09,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:26:12,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:26:12,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 21:26:14,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:26:14,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-03 21:26:15,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 21:26:15,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:26:15,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 21:26:18,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:26:18,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:26:21,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:26:21,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1411760.0, ans=0.95 2023-10-03 21:26:22,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:22,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:26:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:26:25,682 INFO [train.py:1046] (1/4) Epoch 40, batch 4600, loss[loss=0.1634, simple_loss=0.2531, pruned_loss=0.03685, over 24570.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2358, pruned_loss=0.03869, over 4700535.70 frames. 
], batch size: 71, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:26:25,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:26:27,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1411826.6666666667, ans=0.0 2023-10-03 21:26:29,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:30,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:26:31,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:26:33,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:26:33,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:26:34,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 21:26:35,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1411826.6666666667, ans=0.04949747468305833 2023-10-03 21:26:36,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:26:40,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 21:26:40,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:26:40,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1411893.3333333333, ans=0.125 2023-10-03 21:26:42,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:45,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1411893.3333333333, ans=0.1 2023-10-03 21:26:49,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 21:26:50,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:54,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:56,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.68 vs. limit=10.0 2023-10-03 21:26:58,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:26:58,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:27:03,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. 
Duration: 0.64 2023-10-03 21:27:03,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:27:03,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:08,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:08,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:27:10,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:27:14,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 21:27:16,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:27:19,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:19,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1412026.6666666667, ans=0.0 2023-10-03 21:27:19,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1412026.6666666667, ans=0.125 2023-10-03 21:27:20,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:27:23,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:23,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 21:27:24,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:24,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 21:27:24,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:24,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:26,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:28,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:27:28,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:29,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 21:27:29,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 21:27:30,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. 
Duration: 0.82 2023-10-03 21:27:30,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:32,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:27:33,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:33,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:34,507 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.51 vs. limit=15.0 2023-10-03 21:27:39,903 INFO [train.py:1046] (1/4) Epoch 40, batch 4650, loss[loss=0.1652, simple_loss=0.246, pruned_loss=0.04223, over 23459.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2357, pruned_loss=0.03835, over 4709757.74 frames. ], batch size: 106, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:27:40,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1412160.0, ans=0.0 2023-10-03 21:27:41,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.84 vs. limit=15.0 2023-10-03 21:27:41,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:27:44,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:45,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 21:27:47,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:27:47,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:47,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:27:49,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 21:27:53,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:27:56,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 21:27:56,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:57,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 21:27:58,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:27:59,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 21:27:59,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. 
Duration: 0.98 2023-10-03 21:27:59,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:00,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:28:00,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1412226.6666666667, ans=0.2 2023-10-03 21:28:03,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:28:06,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:06,591 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 21:28:09,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:10,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1412293.3333333333, ans=0.125 2023-10-03 21:28:11,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 21:28:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:13,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:28:15,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. 
Duration: 0.77 2023-10-03 21:28:15,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:28:19,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:28:24,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:28:28,248 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.889e+02 2.069e+02 2.261e+02 3.657e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 21:28:28,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:31,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:32,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:28:34,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 21:28:34,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 21:28:35,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 21:28:35,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. 
Duration: 0.66 2023-10-03 21:28:35,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1412360.0, ans=0.0 2023-10-03 21:28:38,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=15.0 2023-10-03 21:28:38,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:28:46,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:28:46,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:28:46,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 21:28:47,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:28:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:28:50,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:28:50,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:28:51,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 21:28:52,974 INFO [train.py:1046] (1/4) Epoch 40, batch 4700, loss[loss=0.1712, simple_loss=0.2577, pruned_loss=0.04238, over 24010.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.235, pruned_loss=0.03792, over 4704862.60 frames. ], batch size: 80, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:28:53,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:28:53,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:55,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.71 vs. limit=15.0 2023-10-03 21:28:56,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:28:56,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:28:56,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:28:57,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 21:28:59,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:29:00,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 21:29:07,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:29:09,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:29:09,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:10,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:29:11,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:29:12,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.56 vs. limit=15.0 2023-10-03 21:29:16,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 21:29:16,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 21:29:17,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1412560.0, ans=0.1 2023-10-03 21:29:18,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:19,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:29:19,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:29:21,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:22,580 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:29:27,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:29:27,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:29:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:29:33,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1412626.6666666667, ans=0.2 2023-10-03 21:29:36,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 21:29:37,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:29:39,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:39,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1412693.3333333333, ans=0.1 2023-10-03 21:29:40,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1412693.3333333333, ans=0.2 2023-10-03 21:29:43,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. 
Duration: 0.84 2023-10-03 21:29:45,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:29:45,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1412693.3333333333, ans=0.1 2023-10-03 21:29:49,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:29:49,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 21:29:51,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:51,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:55,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:55,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:29:55,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 21:29:57,001 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 21:29:58,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:58,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 21:29:58,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:58,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 21:29:59,407 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.98 vs. limit=22.5 2023-10-03 21:29:59,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:30:05,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 21:30:07,645 INFO [train.py:1046] (1/4) Epoch 40, batch 4750, loss[loss=0.1693, simple_loss=0.2509, pruned_loss=0.04389, over 23370.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2359, pruned_loss=0.038, over 4718902.17 frames. ], batch size: 93, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:30:07,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:30:09,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:13,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:13,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:30:14,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-10-03 21:30:16,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:30:17,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.89 vs. limit=22.5 2023-10-03 21:30:18,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 21:30:20,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:30:20,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:30:20,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:30:21,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1412893.3333333333, ans=0.0 2023-10-03 21:30:25,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 21:30:29,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:30:30,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1412893.3333333333, ans=0.125 2023-10-03 21:30:31,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. 
Duration: 0.96 2023-10-03 21:30:32,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:30:35,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:30:35,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:30:35,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1412960.0, ans=0.0 2023-10-03 21:30:36,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:36,735 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 21:30:36,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 21:30:39,210 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.26 vs. limit=15.0 2023-10-03 21:30:44,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 21:30:47,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:30:48,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:30:50,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 21:30:50,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 21:30:50,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:30:50,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1413026.6666666667, ans=0.125 2023-10-03 21:30:53,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:30:53,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1413026.6666666667, ans=0.0 2023-10-03 21:30:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:30:57,304 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.943e+02 2.101e+02 2.308e+02 3.375e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-03 21:30:59,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 21:30:59,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 21:30:59,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1413026.6666666667, ans=0.125 2023-10-03 21:31:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:31:00,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 21:31:00,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:00,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1413026.6666666667, ans=0.125 2023-10-03 21:31:02,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:31:03,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 21:31:05,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1413093.3333333333, ans=0.125 2023-10-03 21:31:06,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 21:31:09,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:12,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:31:12,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 21:31:12,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:31:12,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:15,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 21:31:15,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:17,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:31:19,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1413093.3333333333, ans=0.125 2023-10-03 21:31:20,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:31:21,564 INFO [train.py:1046] (1/4) Epoch 40, batch 4800, loss[loss=0.1426, simple_loss=0.2253, pruned_loss=0.02999, over 21169.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2367, pruned_loss=0.03841, over 4711471.78 frames. ], batch size: 46, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:31:21,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 21:31:21,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 21:31:22,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.35 vs. limit=5.0 2023-10-03 21:31:22,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 21:31:24,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 21:31:24,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1413160.0, ans=0.125 2023-10-03 21:31:25,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:31:27,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 21:31:31,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:33,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:37,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:31:39,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:39,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:40,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 21:31:40,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:31:42,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:31:43,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 21:31:48,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:31:48,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:48,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:31:49,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 21:31:49,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:49,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1413293.3333333333, ans=0.125 2023-10-03 21:31:51,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:53,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:58,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:31:58,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:32:02,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:32:02,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:04,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 21:32:04,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 21:32:06,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:06,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:32:07,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:32:07,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:32:07,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:32:10,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:32:11,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:32:13,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:32:15,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:19,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:22,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 21:32:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:32:22,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:22,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:32:22,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1413426.6666666667, ans=0.125 2023-10-03 21:32:24,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:26,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:32:28,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 21:32:28,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:28,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:32:29,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:32:29,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:32:34,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:34,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:32:35,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 21:32:37,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 21:32:37,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:32:37,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:32:38,331 INFO [train.py:1046] (1/4) Epoch 40, batch 4850, loss[loss=0.1597, simple_loss=0.2551, pruned_loss=0.03219, over 24563.00 frames. 
], tot_loss[loss=0.1571, simple_loss=0.2369, pruned_loss=0.03863, over 4721005.76 frames. ], batch size: 71, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:32:38,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:32:38,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:41,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:47,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 21:32:48,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:52,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:32:53,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:32:53,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:57,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:33:00,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:33:01,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 21:33:01,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 21:33:01,718 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=15.0 2023-10-03 21:33:04,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:33:05,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1413560.0, ans=0.125 2023-10-03 21:33:07,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:33:07,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:33:07,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1413626.6666666667, ans=0.1 2023-10-03 21:33:08,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:33:08,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 21:33:11,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1413626.6666666667, ans=0.125 2023-10-03 21:33:12,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 21:33:12,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:16,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:16,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 21:33:16,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 21:33:18,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:33:25,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:33:27,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 21:33:27,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:33:27,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:33:28,687 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.943e+02 2.210e+02 2.558e+02 3.580e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-03 21:33:28,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 21:33:29,096 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:33:30,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 21:33:30,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:30,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 21:33:31,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:33:33,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:33:33,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 21:33:37,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1413760.0, ans=0.0 2023-10-03 21:33:41,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.76 vs. limit=15.0 2023-10-03 21:33:43,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:45,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1413760.0, ans=0.0 2023-10-03 21:33:48,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 21:33:48,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1413760.0, ans=0.0 2023-10-03 21:33:49,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:33:52,392 INFO [train.py:1046] (1/4) Epoch 40, batch 4900, loss[loss=0.1631, simple_loss=0.2529, pruned_loss=0.03664, over 24551.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2375, pruned_loss=0.03854, over 4723940.62 frames. ], batch size: 71, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:33:55,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 21:33:55,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:33:58,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1413826.6666666667, ans=0.1 2023-10-03 21:33:59,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:01,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:34:01,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:34:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 21:34:09,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-03 21:34:13,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 21:34:14,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-10-03 21:34:14,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 21:34:14,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:34:14,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:34:14,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:34:14,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:34:14,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:34:15,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 21:34:17,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 21:34:18,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:34:20,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 21:34:20,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:34:23,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:34:24,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:27,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:34:27,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 21:34:28,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:34:29,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:34:29,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 21:34:29,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 21:34:32,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 21:34:34,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:34:34,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 21:34:36,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:34:37,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:37,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 21:34:37,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:34:39,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 21:34:41,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:34:43,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:34:44,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1414026.6666666667, ans=0.125 2023-10-03 21:34:46,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:34:47,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 21:34:49,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 21:34:50,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-10-03 21:34:50,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 21:34:50,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 21:34:58,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:34:59,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:35:00,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 21:35:01,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:35:01,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:35:02,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:06,477 INFO [train.py:1046] (1/4) Epoch 40, batch 4950, loss[loss=0.1612, simple_loss=0.2478, pruned_loss=0.03727, over 24409.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03845, over 4714869.61 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:35:06,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:35:06,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:35:07,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:35:07,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 21:35:08,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:35:08,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1414160.0, ans=0.2 2023-10-03 21:35:13,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:35:13,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:35:14,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 21:35:14,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1414160.0, ans=10.0 2023-10-03 21:35:15,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 21:35:15,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:35:15,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. 
Duration: 0.78 2023-10-03 21:35:17,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:17,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:35:17,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:35:17,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:20,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:20,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:35:20,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:35:21,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:35:24,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:24,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:35:25,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.31 vs. limit=5.0 2023-10-03 21:35:29,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-03 21:35:33,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:35,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1414293.3333333333, ans=0.1 2023-10-03 21:35:36,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:35:37,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:37,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:39,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:35:41,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1414293.3333333333, ans=0.0 2023-10-03 21:35:42,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 21:35:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 21:35:45,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:46,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:35:46,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 21:35:48,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:35:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:35:48,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1414293.3333333333, ans=0.0 2023-10-03 21:35:50,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:35:52,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:54,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:35:56,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:35:58,276 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.018e+02 2.178e+02 2.432e+02 3.766e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-03 21:35:58,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:58,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:59,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 21:35:59,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 21:36:00,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:36:03,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:36:04,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:36:04,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:36:04,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:36:05,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:36:07,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:36:08,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:36:09,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:36:10,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:36:11,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 21:36:16,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:36:20,748 INFO [train.py:1046] (1/4) Epoch 40, batch 5000, loss[loss=0.1787, simple_loss=0.2527, pruned_loss=0.05235, over 23823.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2354, pruned_loss=0.03832, over 4714012.77 frames. ], batch size: 164, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:36:20,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 21:36:20,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:36:21,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1414493.3333333333, ans=0.125 2023-10-03 21:36:28,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:36:28,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:36:29,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 21:36:30,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.47 vs. limit=6.0 2023-10-03 21:36:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 21:36:34,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:36:35,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-03 21:36:35,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:36:35,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:36:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 21:36:36,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:36:37,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:36:37,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1414560.0, ans=0.125 2023-10-03 21:36:38,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 21:36:38,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:38,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:36:39,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 21:36:41,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 21:36:41,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 21:36:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 21:36:41,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:36:41,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1414560.0, ans=0.125 2023-10-03 21:36:43,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:43,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:36:43,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 21:36:43,974 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.56 vs. limit=15.0 2023-10-03 21:36:44,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 21:36:47,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 21:36:47,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:36:47,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:48,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. 
Duration: 0.86 2023-10-03 21:36:48,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:36:51,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:51,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:53,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 21:36:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 21:36:54,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:36:55,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.00 vs. limit=22.5 2023-10-03 21:36:57,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:37:02,576 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 21:37:04,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:37:05,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:37:05,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:37:06,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 21:37:08,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:37:08,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:37:08,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:37:10,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 21:37:12,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:37:14,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:37:16,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:37:22,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 21:37:24,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:33,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:37:34,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:37:34,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:37:35,415 INFO [train.py:1046] (1/4) Epoch 40, batch 5050, loss[loss=0.1663, simple_loss=0.2416, pruned_loss=0.04552, over 23690.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2352, pruned_loss=0.03841, over 4710754.34 frames. ], batch size: 179, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:37:35,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:37:35,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:37:36,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:37:36,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:40,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:40,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 21:37:42,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:37:42,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1414826.6666666667, ans=0.125 2023-10-03 21:37:43,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 21:37:45,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:37:45,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 21:37:47,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:37:47,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1414826.6666666667, ans=0.125 2023-10-03 21:37:48,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:37:49,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:37:51,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:37:51,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:37:57,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1414893.3333333333, ans=0.0 2023-10-03 21:37:59,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 21:37:59,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:38:01,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 21:38:01,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 21:38:02,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:38:03,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:03,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:04,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:38:04,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 21:38:04,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 21:38:06,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:07,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:10,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:11,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 21:38:14,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:38:18,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 21:38:18,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:38:18,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:38:19,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:38:19,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:38:21,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:38:24,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:38:24,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:24,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:38:24,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:38:24,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-03 21:38:27,411 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.936e+02 2.098e+02 2.306e+02 3.294e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-03 21:38:27,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:38:27,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:38:32,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:38:32,327 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 21:38:32,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:38:33,208 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.27 vs. limit=15.0 2023-10-03 21:38:35,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:38:36,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:36,377 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 21:38:39,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:39,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. 
Duration: 0.94 2023-10-03 21:38:39,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:41,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:38:42,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:42,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 21:38:43,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 21:38:46,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:46,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:38:48,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:38:49,736 INFO [train.py:1046] (1/4) Epoch 40, batch 5100, loss[loss=0.1708, simple_loss=0.2363, pruned_loss=0.05266, over 22689.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2367, pruned_loss=0.03877, over 4723997.46 frames. ], batch size: 322, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:38:51,116 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 21:38:53,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 21:38:53,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1415160.0, ans=0.0 2023-10-03 21:38:55,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 21:38:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 21:38:57,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:58,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:39:01,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:39:03,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 21:39:03,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 21:39:07,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:39:08,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:39:08,998 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:39:11,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:39:13,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1415226.6666666667, ans=0.125 2023-10-03 21:39:14,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 21:39:15,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=12.0 2023-10-03 21:39:15,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:39:17,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:39:17,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:39:20,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:22,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:22,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 21:39:23,923 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 21:39:25,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:26,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. 
Duration: 0.67 2023-10-03 21:39:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 21:39:26,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1415293.3333333333, ans=0.125 2023-10-03 21:39:28,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1415293.3333333333, ans=0.125 2023-10-03 21:39:30,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:39:35,117 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=15.0 2023-10-03 21:39:39,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:39:41,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 21:39:42,453 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 21:39:42,467 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 21:39:45,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 21:39:45,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:46,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. 
Duration: 0.89 2023-10-03 21:39:50,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 21:39:52,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:39:54,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:39:55,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 21:39:59,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:39:59,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 21:40:02,319 INFO [train.py:1046] (1/4) Epoch 40, batch 5150, loss[loss=0.1577, simple_loss=0.2428, pruned_loss=0.03625, over 24459.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2369, pruned_loss=0.03853, over 4733257.86 frames. ], batch size: 63, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:40:05,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:40:05,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:40:05,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:40:06,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. 
Duration: 0.92225 2023-10-03 21:40:06,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:40:07,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:40:08,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 21:40:08,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 21:40:08,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 21:40:09,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:40:09,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 21:40:11,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:12,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 21:40:13,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:40:15,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:40:18,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-03 21:40:19,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 21:40:19,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1415560.0, ans=0.0 2023-10-03 21:40:21,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:21,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:40:22,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:40:22,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:40:22,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:40:22,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1415560.0, ans=0.0 2023-10-03 21:40:24,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:40:24,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:40:25,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 21:40:27,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 21:40:27,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:40:30,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:40:30,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 21:40:32,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:40:38,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:40:39,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 21:40:42,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:40:44,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1415626.6666666667, ans=0.2 2023-10-03 21:40:48,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1415693.3333333333, ans=10.0 2023-10-03 21:40:49,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:40:49,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 21:40:52,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:40:53,868 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.916e+02 2.105e+02 2.538e+02 3.935e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-03 21:40:53,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:40:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 21:41:00,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:41:01,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:41:01,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:41:01,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1415760.0, ans=0.0 2023-10-03 21:41:05,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.80 vs. limit=15.0 2023-10-03 21:41:05,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:07,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:41:07,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-10-03 21:41:11,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:41:11,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:41:13,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:41:13,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:41:14,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:41:14,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:41:14,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:41:14,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:41:16,089 INFO [train.py:1046] (1/4) Epoch 40, batch 5200, loss[loss=0.1528, simple_loss=0.2447, pruned_loss=0.03047, over 24308.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2372, pruned_loss=0.03878, over 4734044.23 frames. ], batch size: 74, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:41:18,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:41:21,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. 
Duration: 0.611125 2023-10-03 21:41:23,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:26,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 21:41:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:41:28,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:31,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:31,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:41:31,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:33,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 21:41:36,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:41:36,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:39,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 21:41:41,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 21:41:43,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:41:43,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 21:41:44,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 21:41:47,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 21:41:47,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:47,405 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 21:41:48,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:50,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:41:50,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:41:50,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 21:41:51,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:41:54,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:41:56,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 21:41:56,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 21:41:57,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 21:41:58,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.28 vs. limit=22.5 2023-10-03 21:42:02,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 21:42:03,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:42:09,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:42:09,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:10,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 21:42:10,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1416026.6666666667, ans=0.0 2023-10-03 21:42:11,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:42:11,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 21:42:11,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:12,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:42:14,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:42:15,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:42:18,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:42:18,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:18,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:25,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:25,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 21:42:26,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:42:26,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:42:27,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 21:42:29,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:42:30,676 INFO [train.py:1046] (1/4) Epoch 40, batch 5250, loss[loss=0.1527, simple_loss=0.2388, pruned_loss=0.03325, over 24473.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2367, pruned_loss=0.03872, over 4719209.34 frames. ], batch size: 66, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:42:30,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:42:33,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:42:34,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.00 vs. limit=15.0 2023-10-03 21:42:37,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:37,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:42:39,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:42:43,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:44,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:42:47,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 21:42:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:42:50,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 21:42:51,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:52,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:10,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1416293.3333333333, ans=0.0 2023-10-03 21:43:18,734 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.994e+02 2.255e+02 2.624e+02 3.979e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-03 21:43:22,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.73 vs. limit=22.5 2023-10-03 21:43:28,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1416426.6666666667, ans=0.125 2023-10-03 21:43:33,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1416426.6666666667, ans=0.0 2023-10-03 21:43:38,756 INFO [train.py:1046] (1/4) Epoch 40, batch 5300, loss[loss=0.1472, simple_loss=0.2102, pruned_loss=0.04208, over 23632.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.235, pruned_loss=0.03871, over 4708489.97 frames. 
], batch size: 256, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:43:39,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1416493.3333333333, ans=0.125 2023-10-03 21:43:45,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-03 21:43:46,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1416493.3333333333, ans=0.0 2023-10-03 21:43:52,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:43:52,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 21:43:52,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 21:43:52,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:52,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:52,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:52,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:52,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:43:52,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:43:52,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:53,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:43:53,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:43:53,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 21:43:53,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 21:43:53,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 21:43:53,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:43:53,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 21:43:54,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 21:43:54,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:54,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 21:43:54,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:54,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:43:54,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:43:54,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:43:54,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:55,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:55,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:55,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:55,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:43:55,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:55,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:43:55,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-10-03 21:43:55,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:43:56,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:56,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 21:43:56,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 21:43:56,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:43:56,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:43:56,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 21:43:56,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 21:43:56,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:43:57,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:43:57,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:43:57,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-10-03 21:43:57,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 21:43:57,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:43:57,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:57,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 21:43:57,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 21:43:57,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 21:43:57,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:44:05,020 INFO [train.py:1046] (1/4) Epoch 41, batch 0, loss[loss=0.1529, simple_loss=0.2478, pruned_loss=0.02902, over 24641.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2478, pruned_loss=0.02902, over 24641.00 frames. ], batch size: 73, lr: 2.50e-03, grad_scale: 32.0 2023-10-03 21:44:05,021 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 21:44:17,638 INFO [train.py:1078] (1/4) Epoch 41, validation: loss=0.3341, simple_loss=0.2655, pruned_loss=0.2013, over 1125622.00 frames. 
2023-10-03 21:44:17,639 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 21:44:18,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1416573.3333333333, ans=0.125 2023-10-03 21:44:19,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 21:44:21,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:44:23,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:44:28,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:28,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:44:29,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:29,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 21:44:29,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1416573.3333333333, ans=0.125 2023-10-03 21:44:32,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 21:44:35,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:44:35,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:35,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1416640.0, ans=0.0 2023-10-03 21:44:38,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:38,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:39,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:44:39,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:44:41,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 21:44:44,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:44:50,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:44:50,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:52,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 21:44:56,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 21:44:56,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:44:58,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:02,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:45:03,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-10-03 21:45:05,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:13,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 21:45:15,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 21:45:15,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:45:15,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:17,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:45:18,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.46 vs. 
limit=15.0 2023-10-03 21:45:18,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:45:18,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 21:45:20,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:21,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:24,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:45:24,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.76 vs. limit=10.0 2023-10-03 21:45:27,703 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 21:45:29,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:45:31,720 INFO [train.py:1046] (1/4) Epoch 41, batch 50, loss[loss=0.1625, simple_loss=0.2538, pruned_loss=0.03558, over 24528.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2398, pruned_loss=0.03865, over 1072997.32 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 21:45:33,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:45:36,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:45:36,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 21:45:36,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:45:37,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:45:37,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:45:40,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:45:42,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:45:45,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 21:45:45,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:50,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:45:50,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 21:45:53,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-10-03 21:45:53,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1416973.3333333333, ans=0.125 2023-10-03 21:45:55,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:45:55,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:45:55,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:57,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:45:58,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:45:59,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:45:59,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:46:05,816 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.939e+02 2.175e+02 2.475e+02 4.066e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 21:46:07,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:46:08,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 21:46:08,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:46:08,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 21:46:10,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:46:12,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:46:12,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1417040.0, ans=0.1 2023-10-03 21:46:13,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 21:46:13,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:46:16,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 21:46:18,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1417106.6666666667, ans=0.1 2023-10-03 21:46:23,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:46:23,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:46:24,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:46:25,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:46:25,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:46:27,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 21:46:27,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 21:46:29,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1417173.3333333333, ans=0.1 2023-10-03 21:46:30,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:30,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:46:33,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:46:33,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:46:34,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 21:46:35,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 21:46:36,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. 
Duration: 0.47775 2023-10-03 21:46:38,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:46:38,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:46:39,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 21:46:39,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 21:46:41,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:46:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:44,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:46:44,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:46:45,744 INFO [train.py:1046] (1/4) Epoch 41, batch 100, loss[loss=0.1549, simple_loss=0.2276, pruned_loss=0.04113, over 23488.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2399, pruned_loss=0.03961, over 1886413.19 frames. ], batch size: 134, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:46:45,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:46:48,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:46:51,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:46:51,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 21:46:51,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:55,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:46:55,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:46:55,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:55,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:46:55,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:46:56,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 21:47:00,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:47:00,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:00,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 21:47:00,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:47:04,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 21:47:06,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:47:09,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:47:11,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1417306.6666666667, ans=0.2 2023-10-03 21:47:14,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 21:47:15,322 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 21:47:15,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:15,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 21:47:15,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1417373.3333333333, ans=0.0 2023-10-03 21:47:19,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:47:21,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:21,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1417373.3333333333, ans=0.1 2023-10-03 21:47:22,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:22,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1417373.3333333333, ans=0.125 2023-10-03 21:47:27,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:29,195 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 21:47:30,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:47:33,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:47:35,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:47:37,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 21:47:40,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:41,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:47:44,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:47:45,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:47,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:48,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:48,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:47:49,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:49,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 21:47:49,751 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 21:47:51,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:51,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 21:47:52,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:47:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:52,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 21:47:53,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:47:53,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:47:53,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:47:55,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:56,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:56,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:47:57,820 INFO [train.py:1046] (1/4) Epoch 41, batch 150, loss[loss=0.1724, simple_loss=0.2623, pruned_loss=0.04127, over 23973.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2398, pruned_loss=0.03999, over 2511137.53 frames. ], batch size: 80, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:47:57,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:48:01,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:05,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:48:05,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:05,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:09,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:48:09,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:12,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:48:12,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:14,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.69 vs. limit=22.5 2023-10-03 21:48:16,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 21:48:16,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 21:48:16,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. 
Duration: 0.63 2023-10-03 21:48:20,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:48:20,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:48:21,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:48:22,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:48:22,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:48:22,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:22,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:24,214 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 21:48:24,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1417640.0, ans=0.125 2023-10-03 21:48:25,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:48:29,121 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.52 vs. limit=15.0 2023-10-03 21:48:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:48:32,371 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.887e+02 2.079e+02 2.290e+02 3.335e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-03 21:48:32,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1417706.6666666667, ans=0.125 2023-10-03 21:48:34,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:48:35,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 21:48:37,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:48:37,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:39,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:48:41,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:48:42,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:48:42,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1417773.3333333333, ans=0.0 2023-10-03 21:48:44,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 21:48:45,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:46,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-10-03 21:48:47,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 21:48:51,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:51,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:48:52,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:48:52,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:48:55,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:56,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 21:48:59,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:49:00,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 21:49:01,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:03,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:49:03,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 21:49:03,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:49:03,656 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 21:49:07,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:49:11,706 INFO [train.py:1046] (1/4) Epoch 41, batch 200, loss[loss=0.178, simple_loss=0.2543, pruned_loss=0.05083, over 22766.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2403, pruned_loss=0.0404, over 2995786.66 frames. ], batch size: 322, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:49:12,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:49:13,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:49:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 21:49:17,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:49:17,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:19,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 21:49:21,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:49:22,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:23,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:49:28,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:49:28,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:49:28,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:37,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1417973.3333333333, ans=0.125 2023-10-03 21:49:42,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1418040.0, ans=0.125 2023-10-03 21:49:47,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:49:47,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:49:48,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:49:48,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:49:50,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:49:50,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:49:50,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1418040.0, ans=0.125 2023-10-03 21:49:52,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:49:52,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:49:54,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:54,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:49:57,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 21:49:57,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:49:57,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:50:02,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:50:07,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:50:10,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1418173.3333333333, ans=0.0 2023-10-03 21:50:14,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:16,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:50:22,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:22,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1418173.3333333333, ans=0.125 2023-10-03 21:50:23,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 21:50:25,602 INFO [train.py:1046] (1/4) Epoch 41, batch 250, loss[loss=0.1513, simple_loss=0.2198, pruned_loss=0.04139, over 23626.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2403, pruned_loss=0.03952, over 3394808.31 frames. ], batch size: 256, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:50:25,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:25,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 21:50:25,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:50:25,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:50:28,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 21:50:28,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:50:29,722 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 21:50:31,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:31,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:50:32,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:33,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:35,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:50:35,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:37,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 21:50:38,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:50:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:50:49,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:50:50,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:50:57,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1418373.3333333333, ans=0.125 2023-10-03 21:50:58,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:50:58,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:50:59,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:50:59,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:51:00,896 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.902e+02 2.095e+02 2.391e+02 3.475e+02, threshold=4.190e+02, percent-clipped=0.0 2023-10-03 21:51:01,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 21:51:01,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:51:02,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:51:05,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:51:08,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 21:51:08,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:51:09,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:51:11,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:51:11,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:51:12,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:51:12,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:51:12,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:51:15,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 21:51:15,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:51:16,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:21,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:51:24,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:26,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:51:29,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:51:35,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 21:51:37,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:51:37,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:51:38,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 21:51:39,832 INFO [train.py:1046] (1/4) Epoch 41, batch 300, loss[loss=0.1468, simple_loss=0.2277, pruned_loss=0.03292, over 24654.00 frames. 
], tot_loss[loss=0.1575, simple_loss=0.2375, pruned_loss=0.03872, over 3679736.71 frames. ], batch size: 65, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:51:39,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:51:41,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:51:41,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 21:51:45,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:51:48,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1418573.3333333333, ans=0.125 2023-10-03 21:51:52,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:51:53,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 21:51:53,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:56,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:51:56,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. 
Duration: 0.85 2023-10-03 21:51:56,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:00,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:52:05,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:52:05,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 21:52:08,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 21:52:08,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:11,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:13,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:13,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 21:52:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:52:15,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:52:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 21:52:19,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:52:22,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:52:22,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 21:52:23,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:52:25,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:26,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 21:52:28,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:52:31,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:52:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:52:36,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 21:52:40,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:40,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 21:52:42,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:44,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:52:44,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 21:52:44,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:52:46,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:52:46,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 21:52:47,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1418840.0, ans=0.035 2023-10-03 21:52:49,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:49,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:49,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1418840.0, ans=0.125 2023-10-03 21:52:50,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:50,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:52:51,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:55,531 INFO [train.py:1046] (1/4) Epoch 41, batch 350, loss[loss=0.148, simple_loss=0.2283, pruned_loss=0.03383, over 23458.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2355, pruned_loss=0.03811, over 3904865.39 frames. ], batch size: 120, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:52:55,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:52:55,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 21:52:58,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:58,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1418906.6666666667, ans=0.125 2023-10-03 21:53:02,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:53:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:06,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:08,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 21:53:10,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 21:53:11,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 21:53:12,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:12,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 21:53:14,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:53:17,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 21:53:17,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:53:20,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:53:20,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:53:21,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:21,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:21,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:53:21,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:53:23,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:53:25,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:53:25,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:31,194 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.922e+02 2.071e+02 2.368e+02 3.597e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 21:53:32,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:53:32,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:53:32,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1419040.0, ans=0.1 2023-10-03 21:53:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:53:34,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:38,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 21:53:38,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:44,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 21:53:44,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:53:44,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:53:45,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 21:53:47,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:53:48,510 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 21:53:50,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 21:53:50,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:53,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:53:53,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 21:53:56,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:53:59,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:53:59,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:54:01,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1419173.3333333333, ans=0.125 2023-10-03 21:54:02,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:02,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:54:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:54:07,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:54:09,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:54:10,470 INFO [train.py:1046] (1/4) Epoch 41, batch 400, loss[loss=0.1398, simple_loss=0.2137, pruned_loss=0.033, over 24446.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2354, pruned_loss=0.03783, over 4087067.02 frames. ], batch size: 58, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 21:54:10,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 21:54:10,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:10,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 21:54:13,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:14,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:15,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1419240.0, ans=0.125 2023-10-03 21:54:16,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:17,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 21:54:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 21:54:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:21,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 21:54:21,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:27,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:54:27,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:54:27,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. 
Duration: 0.82 2023-10-03 21:54:28,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:54:28,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:28,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:54:28,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:31,408 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 21:54:31,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 21:54:37,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:38,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:38,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 21:54:40,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 21:54:42,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1419373.3333333333, ans=0.05 2023-10-03 21:54:43,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 21:54:44,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:54:47,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1419373.3333333333, ans=0.1 2023-10-03 21:54:50,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 21:54:54,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:54:55,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 21:54:59,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:55:00,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:55:00,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 21:55:03,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:55:06,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:55:06,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:55:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 21:55:10,998 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.50 vs. limit=15.0 2023-10-03 21:55:11,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 21:55:13,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:55:14,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1419506.6666666667, ans=0.125 2023-10-03 21:55:15,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 21:55:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:55:16,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:55:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 21:55:21,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:55:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:55:22,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. 
Duration: 0.77775 2023-10-03 21:55:22,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 21:55:24,091 INFO [train.py:1046] (1/4) Epoch 41, batch 450, loss[loss=0.1672, simple_loss=0.2464, pruned_loss=0.04401, over 23796.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.236, pruned_loss=0.03791, over 4228912.69 frames. ], batch size: 195, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:55:24,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:55:25,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:55:26,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:55:26,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 21:55:26,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:55:29,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:55:31,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:55:42,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:43,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 21:55:44,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 21:55:46,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 21:55:48,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:55:52,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:52,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1419706.6666666667, ans=0.1 2023-10-03 21:55:53,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:55:56,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:55:58,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:55:59,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 21:56:00,961 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.852e+02 2.077e+02 2.381e+02 3.655e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-03 21:56:01,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. 
Duration: 0.97 2023-10-03 21:56:01,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1419706.6666666667, ans=0.0 2023-10-03 21:56:01,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.25 vs. limit=15.0 2023-10-03 21:56:04,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 21:56:04,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:05,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:05,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:56:07,736 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 21:56:07,743 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 21:56:07,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:56:09,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:56:10,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:56:13,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 21:56:13,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:56:14,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 21:56:14,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 21:56:16,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:56:18,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:56:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:56:20,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 21:56:23,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:56:25,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 21:56:25,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 21:56:26,469 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.64 vs. limit=22.5 2023-10-03 21:56:27,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 21:56:30,540 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=22.5 2023-10-03 21:56:33,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:56:33,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:56:35,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:56:35,582 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 21:56:38,159 INFO [train.py:1046] (1/4) Epoch 41, batch 500, loss[loss=0.1457, simple_loss=0.2253, pruned_loss=0.03302, over 24590.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2366, pruned_loss=0.03769, over 4333517.05 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:56:39,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:39,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:56:41,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:41,087 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 21:56:42,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-10-03 21:56:42,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:45,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:56:46,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1419906.6666666667, ans=0.0 2023-10-03 21:56:48,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1419906.6666666667, ans=0.0 2023-10-03 21:56:49,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:56:51,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:56:53,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:56:53,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:55,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:05,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:05,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 21:57:05,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 21:57:07,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:07,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 21:57:07,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:57:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:57:11,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:57:11,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:57:11,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:11,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 21:57:15,507 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 21:57:16,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:19,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:19,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 21:57:19,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:20,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:57:22,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 21:57:24,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:57:26,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:27,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1420106.6666666667, ans=0.2 2023-10-03 21:57:30,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:33,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:39,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1420173.3333333333, ans=0.125 2023-10-03 21:57:42,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-10-03 21:57:42,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:42,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:42,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1420173.3333333333, ans=0.2 2023-10-03 21:57:43,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 21:57:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:57:47,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:49,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1420240.0, ans=0.125 2023-10-03 21:57:50,627 INFO [train.py:1046] (1/4) Epoch 41, batch 550, loss[loss=0.1915, simple_loss=0.2646, pruned_loss=0.05915, over 19563.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2375, pruned_loss=0.03808, over 4415539.24 frames. ], batch size: 388, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:57:52,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 21:57:55,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 21:57:55,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 21:57:55,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 21:57:55,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:57:55,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:55,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1420240.0, ans=0.1 2023-10-03 21:57:57,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:57,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:57,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:57:57,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1420240.0, ans=0.0 2023-10-03 21:57:58,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:58:01,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:58:03,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 21:58:04,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-03 21:58:07,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:07,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:12,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:58:12,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:16,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 21:58:16,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 21:58:19,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:58:23,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:58:25,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:58:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:58:28,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.959e+02 2.274e+02 2.623e+02 4.132e+02, threshold=4.548e+02, percent-clipped=0.0 2023-10-03 21:58:31,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:58:31,094 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 21:58:31,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:32,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 21:58:35,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:58:36,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:58:36,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:58:37,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:39,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 21:58:41,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 21:58:41,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:58:42,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:58:42,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 21:58:42,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:58:45,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:58:46,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:58:48,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:58:48,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:50,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:58:50,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:58:51,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:58:53,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:58:54,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:55,301 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.72 vs. 
limit=15.0 2023-10-03 21:58:56,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:58:56,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:59:02,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 21:59:05,222 INFO [train.py:1046] (1/4) Epoch 41, batch 600, loss[loss=0.1581, simple_loss=0.2346, pruned_loss=0.04082, over 24329.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2379, pruned_loss=0.0383, over 4487461.80 frames. ], batch size: 56, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:59:05,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1420573.3333333333, ans=0.125 2023-10-03 21:59:06,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 21:59:08,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:59:08,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:59:08,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:16,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:59:16,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-03 21:59:17,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 21:59:20,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:59:21,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:59:22,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-10-03 21:59:24,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:59:27,298 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.15 vs. limit=15.0 2023-10-03 21:59:27,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 21:59:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:59:29,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1420640.0, ans=0.125 2023-10-03 21:59:33,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 21:59:36,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:59:36,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 21:59:36,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:59:38,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1420706.6666666667, ans=0.0 2023-10-03 21:59:41,034 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:59:42,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:59:42,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:59:44,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:50,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:59:54,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:54,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:59:54,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:00:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 22:00:06,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 22:00:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:00:10,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 22:00:12,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:00:15,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 22:00:17,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:00:17,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:00:20,153 INFO [train.py:1046] (1/4) Epoch 41, batch 650, loss[loss=0.142, simple_loss=0.2285, pruned_loss=0.02773, over 24613.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2373, pruned_loss=0.03814, over 4539538.71 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:00:21,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 22:00:21,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:00:23,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1420906.6666666667, ans=0.0 2023-10-03 22:00:24,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 22:00:25,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:00:26,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1420906.6666666667, ans=0.2 2023-10-03 22:00:27,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:30,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 22:00:32,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:00:36,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:00:36,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:00:40,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:43,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 22:00:44,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:00:46,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:00:49,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:00:49,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:00:52,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:53,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:53,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:00:54,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:56,662 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.942e+02 2.236e+02 2.477e+02 3.530e+02, threshold=4.472e+02, percent-clipped=0.0 2023-10-03 22:00:56,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:00:58,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:00:58,351 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 22:00:58,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:58,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:01:02,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:01:02,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:01:02,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:01:05,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 22:01:05,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:01:05,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:01:06,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:01:06,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:01:09,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:01:11,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 22:01:12,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 22:01:13,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:01:13,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:01:13,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:01:13,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:01:16,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:01:21,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:21,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:01:22,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:01:25,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:25,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:01:27,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:33,467 INFO [train.py:1046] (1/4) Epoch 41, batch 700, loss[loss=0.1608, simple_loss=0.2341, pruned_loss=0.04372, over 23594.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2365, pruned_loss=0.0378, over 4578134.61 frames. 
], batch size: 256, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:01:33,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:01:33,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:01:33,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:01:34,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:01:37,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 22:01:37,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 22:01:41,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 22:01:41,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:44,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:01:46,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 22:01:51,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:01:52,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 22:01:54,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:54,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:01:54,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:01:59,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:02:01,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 22:02:01,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:02:04,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 22:02:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 22:02:08,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:02:08,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:02:10,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:02:14,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 22:02:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 22:02:20,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:20,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:02:20,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 22:02:22,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1421440.0, ans=0.0 2023-10-03 22:02:25,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:02:26,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:29,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:02:34,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:02:34,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 22:02:34,729 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.59 vs. 
limit=22.5 2023-10-03 22:02:35,059 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.86 vs. limit=15.0 2023-10-03 22:02:36,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 22:02:36,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 22:02:39,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:39,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1421506.6666666667, ans=0.0 2023-10-03 22:02:41,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:02:42,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:02:43,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1421506.6666666667, ans=0.1 2023-10-03 22:02:45,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:45,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 22:02:47,444 INFO [train.py:1046] (1/4) Epoch 41, batch 750, loss[loss=0.1554, simple_loss=0.2375, pruned_loss=0.03671, over 24656.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2356, pruned_loss=0.03783, over 4601209.69 frames. 
], batch size: 65, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:02:48,195 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.24 vs. limit=10.0 2023-10-03 22:02:48,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 22:02:48,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 22:02:50,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 22:02:50,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 22:02:50,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 22:02:52,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:02:53,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 22:02:54,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:56,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:02:56,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:02:57,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:02:57,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:02:58,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:03:00,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:03:02,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:03:02,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1421640.0, ans=0.125 2023-10-03 22:03:06,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:03:07,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:03:07,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:03:08,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 22:03:10,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:03:10,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:03:10,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1421640.0, ans=0.0 2023-10-03 22:03:11,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:03:13,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:03:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 22:03:14,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:03:16,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 22:03:16,495 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 22:03:17,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 22:03:18,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:03:18,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:03:18,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1421706.6666666667, ans=0.0 2023-10-03 22:03:21,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 22:03:24,237 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.879e+02 2.063e+02 2.265e+02 2.874e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-03 22:03:24,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1421706.6666666667, ans=0.0 2023-10-03 22:03:28,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:03:28,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:28,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:03:31,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:03:31,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:03:31,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 22:03:33,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:03:34,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1421773.3333333333, ans=0.125 2023-10-03 22:03:35,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:03:37,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 22:03:38,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:03:39,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 22:03:40,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:44,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:03:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:03:45,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:03:48,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:03:50,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 22:03:52,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:03:52,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:03:53,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=15.0 2023-10-03 22:03:54,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 22:03:54,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1421840.0, ans=0.125 2023-10-03 22:03:55,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:03:58,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:58,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:04:00,893 INFO [train.py:1046] (1/4) Epoch 41, batch 800, loss[loss=0.1876, simple_loss=0.2533, pruned_loss=0.06093, over 19502.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2365, pruned_loss=0.03794, over 4639326.56 frames. ], batch size: 388, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:04:06,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:04:06,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:04:09,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:04:11,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:11,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. 
limit=6.0 2023-10-03 22:04:12,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:12,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:17,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:19,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:04:22,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 22:04:22,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:23,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:04:24,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:04:24,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:04:24,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 22:04:24,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 22:04:24,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1421973.3333333333, ans=0.125 2023-10-03 22:04:25,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 22:04:28,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:30,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:32,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:04:32,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:04:35,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:35,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:38,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:04:38,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:04:38,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 22:04:40,340 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-03 22:04:40,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 22:04:40,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:04:40,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:04:41,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:41,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:04:46,306 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 22:04:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 22:04:48,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:04:49,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.80 vs. limit=10.0 2023-10-03 22:04:50,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:04:55,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:04:58,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:04:58,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1422173.3333333333, ans=0.1 2023-10-03 22:04:59,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 22:04:59,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:05:02,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 22:05:05,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:05:08,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:05:09,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 22:05:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:05:11,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:05:12,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 22:05:12,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:14,105 INFO [train.py:1046] (1/4) Epoch 41, batch 850, loss[loss=0.1668, simple_loss=0.2376, pruned_loss=0.04796, over 23659.00 frames. 
], tot_loss[loss=0.1566, simple_loss=0.2369, pruned_loss=0.03818, over 4655423.74 frames. ], batch size: 232, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:05:14,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:05:15,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:17,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:05:17,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1422240.0, ans=0.125 2023-10-03 22:05:20,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:05:22,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 22:05:22,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 22:05:22,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 22:05:24,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:05:25,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:05:25,712 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:05:26,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:27,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:05:27,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:05:31,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:31,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:05:32,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 22:05:37,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 22:05:40,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:40,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 22:05:43,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 22:05:44,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. 
Duration: 0.8 2023-10-03 22:05:47,114 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 22:05:47,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:05:47,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:05:48,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:05:51,485 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.958e+02 2.183e+02 2.406e+02 3.404e+02, threshold=4.367e+02, percent-clipped=0.0 2023-10-03 22:05:51,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:52,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:52,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 22:05:53,936 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-10-03 22:05:56,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:05:56,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:05:58,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 22:05:58,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:06:00,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:06:02,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:06:02,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 22:06:06,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:06:06,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:06:08,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:06:08,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:06:09,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:06:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:06:15,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 22:06:15,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1422506.6666666667, ans=0.125 2023-10-03 22:06:16,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:06:16,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:17,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:06:19,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1422506.6666666667, ans=0.0 2023-10-03 22:06:24,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:06:25,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:06:25,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 22:06:25,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1422506.6666666667, ans=0.1 2023-10-03 22:06:27,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:06:27,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:06:29,454 INFO [train.py:1046] (1/4) Epoch 41, batch 900, loss[loss=0.1495, simple_loss=0.2332, pruned_loss=0.03296, over 24294.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2371, pruned_loss=0.03833, over 4678909.25 frames. ], batch size: 61, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:06:30,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 22:06:36,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:06:40,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:40,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 22:06:43,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=15.0 2023-10-03 22:06:45,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:06:45,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 22:06:46,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:06:48,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:06:48,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 22:06:49,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:06:49,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:06:58,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:06:58,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:58,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:07:01,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:07:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 22:07:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:07:11,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:07:12,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:07:12,366 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 22:07:14,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-10-03 22:07:21,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:07:21,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:07:21,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:07:27,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:27,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:07:29,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 22:07:29,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:07:30,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 22:07:32,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:07:33,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:34,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:07:34,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:07:39,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 22:07:39,670 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 22:07:41,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 22:07:41,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 22:07:43,769 INFO [train.py:1046] (1/4) Epoch 41, batch 950, loss[loss=0.1575, simple_loss=0.2482, pruned_loss=0.03336, over 24292.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2372, pruned_loss=0.03841, over 4687759.23 frames. ], batch size: 74, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:07:45,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:47,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1422906.6666666667, ans=0.0 2023-10-03 22:07:48,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 22:07:53,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:07:55,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:07:57,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:07:58,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:08:00,084 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 22:08:03,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:03,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:08:05,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:08:05,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:08:05,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 22:08:06,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:08:08,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:09,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 22:08:09,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:08:13,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:08:13,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:08:13,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:08:15,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 22:08:15,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1423040.0, ans=0.2 2023-10-03 22:08:16,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 22:08:21,140 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.951e+02 2.151e+02 2.455e+02 3.832e+02, threshold=4.302e+02, percent-clipped=0.0 2023-10-03 22:08:21,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:08:21,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:08:25,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:08:26,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:08:26,183 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:08:30,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. 
Duration: 0.47 2023-10-03 22:08:30,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 22:08:30,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:08:32,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:08:32,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:32,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:08:36,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 22:08:38,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:08:39,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:08:40,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:40,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 22:08:40,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:40,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 22:08:41,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 22:08:41,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1423106.6666666667, ans=0.125 2023-10-03 22:08:45,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:08:47,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:52,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:08:54,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 22:08:54,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 22:08:57,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:57,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1423240.0, ans=0.07 2023-10-03 22:08:58,934 INFO [train.py:1046] (1/4) Epoch 41, batch 1000, loss[loss=0.1477, simple_loss=0.2232, pruned_loss=0.03614, over 23746.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.236, pruned_loss=0.03879, over 4684585.79 frames. ], batch size: 179, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:09:02,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-03 22:09:02,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:04,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1423240.0, ans=0.07 2023-10-03 22:09:08,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:09:09,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 22:09:09,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 22:09:13,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:13,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:09:15,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:17,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 22:09:21,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 22:09:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 22:09:23,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:09:27,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 22:09:27,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 22:09:27,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 22:09:29,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:29,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:39,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:39,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:09:39,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:40,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:40,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 22:09:40,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:09:41,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 22:09:42,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:43,388 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 22:09:46,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 22:09:46,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 22:09:48,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 22:09:49,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1423440.0, ans=0.125 2023-10-03 22:09:50,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:09:56,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:56,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:09:57,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1423506.6666666667, ans=0.1 2023-10-03 22:09:58,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:59,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 22:10:01,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 22:10:02,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:10:04,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 22:10:04,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 22:10:06,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:10:06,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:10:08,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:10:10,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:10:10,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:10:13,258 INFO [train.py:1046] (1/4) Epoch 41, batch 1050, loss[loss=0.1484, simple_loss=0.2242, pruned_loss=0.03627, over 23697.00 frames. ], tot_loss[loss=0.156, simple_loss=0.235, pruned_loss=0.0385, over 4691940.83 frames. ], batch size: 212, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:10:14,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 22:10:14,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:10:17,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 22:10:17,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:10:20,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:10:20,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1423573.3333333333, ans=0.2 2023-10-03 22:10:23,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:10:24,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:10:26,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:10:26,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:10:28,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:10:29,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:10:29,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. 
Duration: 0.98 2023-10-03 22:10:30,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:10:30,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 22:10:34,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:10:34,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 22:10:34,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:10:41,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:10:43,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:10:43,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:10:44,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1423706.6666666667, ans=0.125 2023-10-03 22:10:45,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 22:10:45,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 22:10:45,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-03 22:10:48,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 22:10:50,171 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.894e+02 2.056e+02 2.296e+02 3.481e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-03 22:10:50,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 22:10:51,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:10:55,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:10:57,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:10:58,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:10:58,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:11:01,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1423773.3333333333, ans=0.125 2023-10-03 22:11:02,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:11:05,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 22:11:06,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-10-03 22:11:07,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1423773.3333333333, ans=0.125 2023-10-03 22:11:08,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 22:11:08,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:11:08,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:11:10,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 22:11:11,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1423840.0, ans=0.0 2023-10-03 22:11:11,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1423840.0, ans=0.1 2023-10-03 22:11:12,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:11:15,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:11:16,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:11:16,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:11:16,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:11:19,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:11:19,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 22:11:22,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:11:22,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 22:11:22,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 22:11:22,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:11:27,649 INFO [train.py:1046] (1/4) Epoch 41, batch 1100, loss[loss=0.1521, simple_loss=0.2429, pruned_loss=0.03064, over 24653.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2345, pruned_loss=0.0382, over 4692752.93 frames. ], batch size: 73, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:11:29,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:11:31,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:11:35,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:11:35,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 22:11:35,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:11:37,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 22:11:38,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:11:40,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1423906.6666666667, ans=0.0 2023-10-03 22:11:41,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:11:42,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:11:46,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:11:46,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 22:11:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:11:48,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:11:48,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:11:51,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. 
Duration: 0.97775 2023-10-03 22:11:52,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:11:56,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:11:59,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 22:12:00,433 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 22:12:02,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:03,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:05,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:12:05,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:12:07,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 22:12:08,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:12:08,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:12:08,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 22:12:08,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:08,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 22:12:15,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:12:15,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 22:12:16,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:12:22,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:12:22,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1424106.6666666667, ans=0.125 2023-10-03 22:12:24,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 22:12:24,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:12:24,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1424106.6666666667, ans=0.125 2023-10-03 22:12:27,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:30,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:12:30,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:12:30,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 22:12:31,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:12:31,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:12:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 22:12:35,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:12:35,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 22:12:37,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:12:37,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:12:38,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:12:38,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=12.0 2023-10-03 22:12:42,379 INFO [train.py:1046] (1/4) Epoch 41, batch 1150, loss[loss=0.1616, simple_loss=0.2359, pruned_loss=0.04363, over 23411.00 frames. 
], tot_loss[loss=0.1563, simple_loss=0.2356, pruned_loss=0.03847, over 4707027.92 frames. ], batch size: 285, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:12:43,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:12:46,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:12:49,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:12:49,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:12:49,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 22:12:50,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:12:53,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 22:12:53,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:12:55,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:12:58,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1424306.6666666667, ans=0.125 2023-10-03 22:13:01,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-10-03 22:13:02,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:07,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:13:07,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:07,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1424306.6666666667, ans=0.1 2023-10-03 22:13:09,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 22:13:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:13:09,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:13:11,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 22:13:13,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:13,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1424373.3333333333, ans=0.125 2023-10-03 22:13:14,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:13:18,761 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.950e+02 2.098e+02 2.374e+02 3.607e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-03 22:13:21,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:27,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:27,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 22:13:27,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:27,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:35,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 22:13:36,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:44,108 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 22:13:46,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:13:48,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:13:48,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 22:13:49,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:13:51,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:13:56,001 INFO [train.py:1046] (1/4) Epoch 41, batch 1200, loss[loss=0.158, simple_loss=0.2479, pruned_loss=0.03404, over 24555.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2364, pruned_loss=0.03876, over 4709392.85 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:13:57,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:13:57,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:13:58,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:58,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:13:58,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1424573.3333333333, ans=0.125 2023-10-03 22:14:00,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:14:01,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:14:04,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 22:14:06,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:14:07,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:14:09,675 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 22:14:11,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 22:14:11,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1424640.0, ans=0.0 2023-10-03 22:14:17,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:14:18,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:14:20,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:14:22,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:14:22,931 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 22:14:24,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:31,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 22:14:31,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:14:31,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 22:14:33,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:14:36,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 22:14:40,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 22:14:41,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:41,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:14:42,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1424773.3333333333, ans=0.09899494936611666 2023-10-03 22:14:44,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:14:44,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:14:47,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:14:47,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 22:14:47,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1424773.3333333333, ans=0.1 2023-10-03 22:14:48,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:14:48,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 22:14:48,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:14:48,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:14:48,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:14:52,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:14:52,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:14:56,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:14:59,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:15:01,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 22:15:07,068 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. 
Duration: 0.9935 2023-10-03 22:15:08,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:15:09,876 INFO [train.py:1046] (1/4) Epoch 41, batch 1250, loss[loss=0.1567, simple_loss=0.2384, pruned_loss=0.03754, over 24508.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03831, over 4726431.73 frames. ], batch size: 63, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:15:09,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:15:11,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:15:13,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:15:15,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 22:15:20,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:15:20,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:22,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 22:15:22,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:15:23,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 22:15:28,172 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.09 vs. limit=15.0 2023-10-03 22:15:28,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:15:28,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:30,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:15:30,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:15:31,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1424973.3333333333, ans=0.125 2023-10-03 22:15:33,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:15:35,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:15:36,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:15:36,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:15:36,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:15:36,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:36,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1424973.3333333333, ans=0.125 2023-10-03 22:15:39,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:15:40,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:15:42,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1425040.0, ans=0.125 2023-10-03 22:15:46,569 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.95 vs. limit=15.0 2023-10-03 22:15:47,170 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.967e+02 2.225e+02 2.437e+02 4.651e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-03 22:15:47,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 22:15:47,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:15:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:15:50,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. 
Duration: 0.85 2023-10-03 22:15:51,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:51,627 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 22:15:51,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:51,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:16:00,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:16:00,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:16:01,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 22:16:01,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 22:16:01,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 22:16:02,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1425106.6666666667, ans=0.0 2023-10-03 22:16:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:16:05,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 22:16:05,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:16:08,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 22:16:08,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:16:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 22:16:11,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:16:12,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:16:12,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:16:14,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:16:14,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1425173.3333333333, ans=0.125 2023-10-03 22:16:16,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 22:16:18,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 22:16:19,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:16:21,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:16:22,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1425240.0, ans=0.125 2023-10-03 22:16:23,734 INFO [train.py:1046] (1/4) Epoch 41, batch 1300, loss[loss=0.1489, simple_loss=0.2344, pruned_loss=0.03165, over 24671.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03836, over 4729129.60 frames. ], batch size: 65, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:16:23,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:16:25,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:16:26,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 22:16:29,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:32,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:16:32,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:16:34,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 22:16:35,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:16:35,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 22:16:41,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:16:41,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:16:42,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 22:16:47,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:16:49,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:16:52,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:16:53,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:55,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:16:55,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:16:56,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 22:16:58,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 22:17:02,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:17:03,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:17:05,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 22:17:05,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:17:08,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:17:10,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:17:10,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 22:17:10,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:10,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 22:17:12,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:18,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:17:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:17:20,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 22:17:21,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 22:17:23,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 22:17:26,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:17:28,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 22:17:31,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:17:37,917 INFO [train.py:1046] (1/4) Epoch 41, batch 1350, loss[loss=0.1693, simple_loss=0.2549, pruned_loss=0.04182, over 24638.00 frames. ], tot_loss[loss=0.157, simple_loss=0.237, pruned_loss=0.03851, over 4720378.11 frames. ], batch size: 73, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:17:38,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 22:17:40,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:17:42,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:17:46,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:17:46,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:17:48,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:17:48,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:17:51,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:17:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 22:17:54,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1425640.0, ans=0.0 2023-10-03 22:17:55,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:17:55,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:17:58,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 22:17:58,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:18:01,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:18:01,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 22:18:01,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 22:18:01,763 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.44 vs. limit=22.5 2023-10-03 22:18:04,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 22:18:05,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:05,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 22:18:09,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1425706.6666666667, ans=0.125 2023-10-03 22:18:11,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1425706.6666666667, ans=0.125 2023-10-03 22:18:11,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1425706.6666666667, ans=0.2 2023-10-03 22:18:16,172 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 1.965e+02 2.297e+02 2.654e+02 3.860e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 22:18:16,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:18:26,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:26,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:28,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 22:18:28,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-10-03 22:18:30,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1425773.3333333333, ans=0.2 2023-10-03 22:18:32,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:32,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 22:18:32,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:18:34,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:18:34,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.45 vs. limit=15.0 2023-10-03 22:18:35,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:18:35,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1425840.0, ans=0.1 2023-10-03 22:18:36,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 22:18:38,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:18:41,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 22:18:42,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 22:18:47,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 22:18:48,085 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.38 vs. limit=10.0 2023-10-03 22:18:49,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:52,244 INFO [train.py:1046] (1/4) Epoch 41, batch 1400, loss[loss=0.1561, simple_loss=0.2425, pruned_loss=0.0349, over 24500.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2347, pruned_loss=0.03843, over 4700491.79 frames. ], batch size: 66, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:18:53,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:18:53,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:18:57,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 22:18:59,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 22:19:05,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1425973.3333333333, ans=0.1 2023-10-03 22:19:10,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:19:12,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1425973.3333333333, ans=0.125 2023-10-03 22:19:13,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:19:16,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:19:16,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:19:22,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:19:22,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 22:19:31,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:31,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:19:32,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.82 vs. limit=15.0 2023-10-03 22:19:35,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 22:19:35,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:19:37,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:19:37,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:19:39,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:19:39,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:19:40,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:19:40,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:19:42,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 22:19:42,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:19:46,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:19:49,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:19:49,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1426173.3333333333, ans=0.125 2023-10-03 22:19:56,771 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.58 vs. limit=15.0 2023-10-03 22:19:58,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 22:19:58,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 22:19:58,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:20:01,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 22:20:01,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:04,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:20:06,481 INFO [train.py:1046] (1/4) Epoch 41, batch 1450, loss[loss=0.1531, simple_loss=0.2412, pruned_loss=0.03251, over 24665.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2338, pruned_loss=0.03792, over 4706444.08 frames. 
], batch size: 65, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:20:06,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1426240.0, ans=0.125 2023-10-03 22:20:08,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:20:09,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:20:09,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:09,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 22:20:09,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1426240.0, ans=0.125 2023-10-03 22:20:14,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:15,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:20:15,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:20:15,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 22:20:16,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 22:20:16,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 22:20:17,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:18,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:18,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 22:20:20,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:20:22,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:20:22,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 22:20:22,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:23,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:20:24,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:27,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 22:20:31,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.10 vs. limit=10.0 2023-10-03 22:20:31,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:20:31,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:20:34,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:34,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:37,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:37,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:20:37,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:37,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:20:41,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 22:20:43,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:20:44,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=12.0 2023-10-03 22:20:45,101 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.909e+02 2.047e+02 2.238e+02 3.342e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 22:20:47,905 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 22:20:49,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:20:50,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:20:52,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:20:54,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 22:20:58,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1426440.0, ans=0.2 2023-10-03 22:21:00,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:00,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 22:21:01,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 22:21:03,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 22:21:07,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:21:07,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:21:08,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 22:21:11,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 22:21:11,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 22:21:11,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:11,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1426506.6666666667, ans=0.0 2023-10-03 22:21:13,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:21:15,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1426506.6666666667, ans=0.125 2023-10-03 22:21:20,109 INFO [train.py:1046] (1/4) Epoch 41, batch 1500, loss[loss=0.148, simple_loss=0.2299, pruned_loss=0.03309, over 24617.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2344, pruned_loss=0.03795, over 4706048.88 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:21:24,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. 
Duration: 0.77 2023-10-03 22:21:24,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:21:24,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:21:25,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:27,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:21:27,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:21:29,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 22:21:30,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:21:30,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:21:30,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:21:30,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:21:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:21:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:21:41,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:21:41,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 22:21:41,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:21:41,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:21:42,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:45,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 22:21:48,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 22:21:49,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:51,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 22:21:53,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:21:55,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 22:21:56,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=1426706.6666666667, ans=15.0 2023-10-03 22:21:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:56,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:21:58,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 22:21:58,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:21:58,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:21:58,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 22:21:58,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:22:05,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.17 vs. limit=15.0 2023-10-03 22:22:06,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:22:06,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 22:22:10,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-03 22:22:11,482 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.73 vs. limit=15.0 2023-10-03 22:22:12,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:22:13,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1426773.3333333333, ans=0.125 2023-10-03 22:22:15,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 22:22:15,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:15,722 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 22:22:17,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:18,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:22:19,763 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 22:22:21,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:22:24,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 22:22:25,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:22:28,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:22:28,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:28,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:22:30,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:31,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:22:32,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 22:22:32,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 22:22:33,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1426906.6666666667, ans=0.125 2023-10-03 22:22:34,152 INFO [train.py:1046] (1/4) Epoch 41, batch 1550, loss[loss=0.166, simple_loss=0.2592, pruned_loss=0.03637, over 24602.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2348, pruned_loss=0.03797, over 4707022.58 frames. ], batch size: 68, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:22:34,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:22:34,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. 
Duration: 0.84 2023-10-03 22:22:34,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 22:22:37,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:22:38,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:39,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1426906.6666666667, ans=0.2 2023-10-03 22:22:40,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:22:40,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:22:40,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:41,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:44,529 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 22:22:44,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:45,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:22:45,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 22:22:47,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:22:47,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 22:22:50,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:22:50,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 22:22:51,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 22:22:51,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 22:22:51,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:53,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:22:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:22:59,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1426973.3333333333, ans=10.0 2023-10-03 22:23:00,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 22:23:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-03 22:23:08,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:23:10,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:23:11,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:23:11,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:23:11,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 22:23:13,021 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.987e+02 2.151e+02 2.393e+02 4.271e+02, threshold=4.303e+02, percent-clipped=1.0 2023-10-03 22:23:15,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.73 vs. limit=22.5 2023-10-03 22:23:16,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:23:17,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:23:18,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1427106.6666666667, ans=0.1 2023-10-03 22:23:19,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1427106.6666666667, ans=0.1 2023-10-03 22:23:20,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:23:23,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:23:23,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:23:23,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 22:23:24,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:23:27,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:23:27,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:28,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 22:23:28,960 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 22:23:32,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 22:23:33,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1427173.3333333333, ans=0.09899494936611666 2023-10-03 22:23:38,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 22:23:44,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:23:44,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:45,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 22:23:47,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:23:47,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:23:47,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:23:48,767 INFO [train.py:1046] (1/4) Epoch 41, batch 1600, loss[loss=0.1619, simple_loss=0.2487, pruned_loss=0.03758, over 24659.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.0383, over 4709426.83 frames. ], batch size: 65, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:23:48,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:23:50,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 22:23:53,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:23:54,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 22:23:54,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 22:23:57,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 22:23:57,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1427240.0, ans=0.0 2023-10-03 22:23:59,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:24:01,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 22:24:03,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:24:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:24:09,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:24:14,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-10-03 22:24:16,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1427306.6666666667, ans=0.125 2023-10-03 22:24:17,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:24:18,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 22:24:18,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:18,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 22:24:23,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 22:24:30,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:24:31,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 22:24:31,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:24:33,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:24:33,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:24:35,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-03 22:24:39,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 22:24:39,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1427440.0, ans=0.125 2023-10-03 22:24:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:24:42,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:42,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:42,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1427440.0, ans=0.0 2023-10-03 22:24:43,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:24:45,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:24:45,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:24:48,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:24:51,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1427506.6666666667, ans=0.0 2023-10-03 22:24:54,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 22:24:55,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:24:58,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 22:24:58,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:24:58,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 22:25:02,712 INFO [train.py:1046] (1/4) Epoch 41, batch 1650, loss[loss=0.144, simple_loss=0.2213, pruned_loss=0.03333, over 23640.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2366, pruned_loss=0.03841, over 4701096.67 frames. ], batch size: 149, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:25:04,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:04,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1427573.3333333333, ans=0.125 2023-10-03 22:25:05,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:25:06,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:25:06,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 22:25:06,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-10-03 22:25:06,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 22:25:08,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 22:25:11,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:25:12,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:25:13,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:25:14,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:25:17,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:19,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 22:25:22,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:25:22,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:25:22,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:25:22,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 22:25:22,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 22:25:23,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 22:25:28,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:25:29,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:25:35,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1427706.6666666667, ans=0.0 2023-10-03 22:25:36,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 22:25:36,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:37,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1427706.6666666667, ans=0.125 2023-10-03 22:25:39,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 22:25:43,250 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.986e+02 2.197e+02 2.523e+02 3.489e+02, threshold=4.394e+02, percent-clipped=0.0 2023-10-03 22:25:44,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:25:47,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:25:47,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:25:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:25:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:25:49,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:50,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1427773.3333333333, ans=0.125 2023-10-03 22:25:52,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:53,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:53,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:25:55,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:25:56,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:25:58,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:26:00,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:26:00,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 22:26:03,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:26:03,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 22:26:04,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 22:26:04,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 22:26:04,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:05,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:26:05,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:26:06,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:26:06,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. 
Duration: 0.97 2023-10-03 22:26:10,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:26:10,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:26:12,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:26:15,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 22:26:17,082 INFO [train.py:1046] (1/4) Epoch 41, batch 1700, loss[loss=0.1679, simple_loss=0.2476, pruned_loss=0.04409, over 23737.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03857, over 4711213.16 frames. ], batch size: 85, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:26:18,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:26:18,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:26:18,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 22:26:20,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:26:20,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:26:20,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:26:22,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:26:22,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:26:22,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 22:26:26,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:26:32,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:26:36,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:26:36,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1427973.3333333333, ans=0.1 2023-10-03 22:26:36,788 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.61 vs. limit=22.5 2023-10-03 22:26:42,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:26:42,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:26:42,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 22:26:42,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:26:46,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 22:26:48,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:26:48,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:49,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1428040.0, ans=0.125 2023-10-03 22:26:51,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:26:51,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.39 vs. limit=22.5 2023-10-03 22:26:52,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:26:53,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 22:26:53,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 22:26:54,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1428040.0, ans=0.125 2023-10-03 22:26:54,610 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.77 vs. 
limit=15.0 2023-10-03 22:26:55,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:57,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 22:26:57,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:27:06,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:06,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:08,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:27:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:27:09,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 22:27:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:27:11,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:11,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 22:27:12,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:27:12,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:12,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:13,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1428106.6666666667, ans=0.0 2023-10-03 22:27:13,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1428106.6666666667, ans=0.125 2023-10-03 22:27:15,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:15,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:27:17,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:17,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:27:17,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 22:27:21,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1428173.3333333333, ans=0.125 2023-10-03 22:27:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:27:25,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 22:27:26,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:28,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:27:28,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 22:27:32,477 INFO [train.py:1046] (1/4) Epoch 41, batch 1750, loss[loss=0.1783, simple_loss=0.2614, pruned_loss=0.04762, over 24045.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2355, pruned_loss=0.03831, over 4706595.30 frames. ], batch size: 86, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:27:32,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:34,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:34,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:27:35,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-03 22:27:35,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:38,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:27:38,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:44,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 22:27:45,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:50,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 22:27:50,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:50,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:27:52,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1428306.6666666667, ans=0.125 2023-10-03 22:27:53,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:27:54,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 22:27:57,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:27:57,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 22:27:59,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1428306.6666666667, ans=0.125 2023-10-03 22:28:04,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:28:07,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:07,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:28:07,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1428373.3333333333, ans=0.0 2023-10-03 22:28:11,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:11,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:28:13,609 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.948e+02 2.173e+02 2.566e+02 4.881e+02, threshold=4.347e+02, percent-clipped=1.0 2023-10-03 22:28:13,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:28:15,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 22:28:16,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:28:17,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:28:19,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 22:28:20,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:28:22,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1428440.0, ans=0.2 2023-10-03 22:28:23,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 22:28:23,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:28:25,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:28:26,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:28:28,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1428440.0, ans=0.125 2023-10-03 22:28:29,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:28:30,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. 
Duration: 0.436375 2023-10-03 22:28:30,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:32,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:28:36,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:28:37,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:28:40,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:28:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 22:28:40,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:42,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:28:42,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:28:42,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:28:42,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:28:43,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-10-03 22:28:46,895 INFO [train.py:1046] (1/4) Epoch 41, batch 1800, loss[loss=0.1679, simple_loss=0.2525, pruned_loss=0.04164, over 23411.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2345, pruned_loss=0.03809, over 4708598.12 frames. ], batch size: 93, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:28:46,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:28:48,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:49,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1428573.3333333333, ans=0.07 2023-10-03 22:28:51,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:28:54,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.58 vs. limit=15.0 2023-10-03 22:28:54,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:57,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:28:57,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:28:59,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1428573.3333333333, ans=0.0 2023-10-03 22:29:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:29:02,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:03,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:04,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:29:07,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:29:07,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 22:29:07,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1428640.0, ans=0.0 2023-10-03 22:29:08,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:09,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1428640.0, ans=0.0 2023-10-03 22:29:12,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:12,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1428640.0, ans=0.125 2023-10-03 22:29:13,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1428640.0, ans=0.125 2023-10-03 22:29:16,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. 
Duration: 0.83 2023-10-03 22:29:19,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 22:29:19,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 22:29:19,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:19,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:29:19,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1428706.6666666667, ans=0.125 2023-10-03 22:29:21,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:29:27,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1428706.6666666667, ans=0.1 2023-10-03 22:29:30,348 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 22:29:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:29:33,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:34,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. 
Duration: 0.99 2023-10-03 22:29:34,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 22:29:36,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:29:37,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:29:38,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:29:44,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 22:29:49,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:29:49,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 22:29:49,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1428840.0, ans=0.1 2023-10-03 22:29:50,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:29:50,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:50,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:29:51,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. 
Duration: 0.44 2023-10-03 22:29:52,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1428840.0, ans=0.2 2023-10-03 22:29:54,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:29:54,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:29:58,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 22:29:58,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:00,974 INFO [train.py:1046] (1/4) Epoch 41, batch 1850, loss[loss=0.1431, simple_loss=0.2234, pruned_loss=0.03142, over 22372.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2349, pruned_loss=0.03812, over 4713597.49 frames. ], batch size: 49, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:30:01,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:01,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:30:01,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:02,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:03,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 22:30:05,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:30:05,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:05,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.32 vs. limit=10.0 2023-10-03 22:30:07,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:30:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:30:15,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:30:15,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 22:30:18,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 22:30:20,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 22:30:23,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1428973.3333333333, ans=0.125 2023-10-03 22:30:25,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:30:26,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 22:30:26,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 22:30:36,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:30:39,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 22:30:40,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:30:40,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:30:41,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.909e+02 2.082e+02 2.293e+02 3.628e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-03 22:30:43,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 22:30:44,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:44,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:30:45,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1429106.6666666667, ans=0.125 2023-10-03 22:30:46,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 22:30:48,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:30:50,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:53,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:30:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:53,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:30:53,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:30:56,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:58,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:31:00,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1429173.3333333333, ans=0.0 2023-10-03 22:31:01,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 22:31:02,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:31:06,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. 
Duration: 0.62725 2023-10-03 22:31:08,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:31:08,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 22:31:08,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 22:31:09,766 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 22:31:09,845 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 22:31:11,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:31:11,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:31:12,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:31:12,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:13,847 INFO [train.py:1046] (1/4) Epoch 41, batch 1900, loss[loss=0.1564, simple_loss=0.2426, pruned_loss=0.03512, over 24625.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2366, pruned_loss=0.03839, over 4712463.26 frames. ], batch size: 68, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:31:13,945 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 22:31:13,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 22:31:13,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:15,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:31:16,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:31:18,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:31:18,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 22:31:19,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:19,897 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 22:31:19,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:31:21,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:31:21,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1429240.0, ans=0.125 2023-10-03 22:31:24,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1429240.0, ans=0.2 2023-10-03 22:31:25,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:31:29,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:31:29,124 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 22:31:30,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 22:31:32,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:31:32,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:31:32,326 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 22:31:33,677 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 22:31:36,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 22:31:39,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:31:42,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 22:31:43,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. 
Duration: 0.81 2023-10-03 22:31:43,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1429373.3333333333, ans=0.125 2023-10-03 22:31:48,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1429373.3333333333, ans=0.0 2023-10-03 22:31:49,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1429373.3333333333, ans=0.1 2023-10-03 22:31:51,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 22:31:52,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1429373.3333333333, ans=0.2 2023-10-03 22:31:55,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 22:31:55,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:56,631 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 22:31:56,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 22:31:56,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 22:31:57,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 22:31:57,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:32:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 22:32:05,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:32:07,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:32:08,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 22:32:10,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:32:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 22:32:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:32:19,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:32:19,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:32:20,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:32:20,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:32:21,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-03 22:32:21,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:32:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:32:26,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1429573.3333333333, ans=0.1 2023-10-03 22:32:27,044 INFO [train.py:1046] (1/4) Epoch 41, batch 1950, loss[loss=0.1584, simple_loss=0.2351, pruned_loss=0.04088, over 23686.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.237, pruned_loss=0.03884, over 4717804.41 frames. ], batch size: 232, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:32:27,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:32:27,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:32:28,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:32:28,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:32:28,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:32:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:32:35,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 22:32:36,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:32:38,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:38,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:32:38,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1429573.3333333333, ans=0.1 2023-10-03 22:32:39,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 22:32:40,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:32:40,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:42,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:42,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1429640.0, ans=0.0 2023-10-03 22:32:45,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:32:45,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:32:45,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:32:46,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:32:51,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:32:51,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:32:51,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:32:51,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:53,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.68 vs. limit=22.5 2023-10-03 22:32:55,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:59,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:32:59,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:32:59,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:32:59,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 22:32:59,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-03 22:33:01,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:33:01,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:01,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1429706.6666666667, ans=0.2 2023-10-03 22:33:04,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:33:07,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:33:09,232 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 2.015e+02 2.236e+02 2.546e+02 4.290e+02, threshold=4.473e+02, percent-clipped=1.0 2023-10-03 22:33:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:33:13,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:33:13,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:33:13,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 22:33:14,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 22:33:17,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:33:17,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:33:19,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:33:26,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:27,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:31,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:33,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:35,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:33:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:37,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 22:33:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:33:39,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:33:39,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 22:33:41,903 INFO [train.py:1046] (1/4) Epoch 41, batch 2000, loss[loss=0.1446, simple_loss=0.2223, pruned_loss=0.03343, over 24482.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2372, pruned_loss=0.03903, over 4719014.87 frames. ], batch size: 58, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:33:42,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:33:44,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:33:46,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:33:46,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:33:47,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:33:49,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:51,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1429906.6666666667, ans=0.125 2023-10-03 22:33:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 22:33:53,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 22:33:56,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:33:57,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 22:33:59,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:34:00,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:34:01,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1429973.3333333333, ans=0.0 2023-10-03 22:34:02,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:34:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 22:34:05,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:06,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.51 vs. limit=15.0 2023-10-03 22:34:07,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:07,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:08,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. 
Duration: 0.83 2023-10-03 22:34:08,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:34:08,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1429973.3333333333, ans=0.1 2023-10-03 22:34:10,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 22:34:10,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:34:13,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:14,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:34:14,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:14,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:34:15,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:34:17,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 22:34:18,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 22:34:18,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 22:34:19,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:24,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:25,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:34:25,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:34:26,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.31 vs. limit=22.5 2023-10-03 22:34:27,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:34:28,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1430106.6666666667, ans=0.0 2023-10-03 22:34:28,435 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.48 vs. limit=15.0 2023-10-03 22:34:29,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:34:30,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:30,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 22:34:30,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:34,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:34:35,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 22:34:39,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:34:41,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:45,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:45,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:34:48,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:51,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:51,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:52,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-03 22:34:52,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:34:54,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:55,661 INFO [train.py:1046] (1/4) Epoch 41, batch 2050, loss[loss=0.1512, simple_loss=0.2269, pruned_loss=0.03778, over 23265.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2369, pruned_loss=0.03893, over 4720625.06 frames. ], batch size: 105, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:34:55,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:57,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:58,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:35:00,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1430240.0, ans=0.2 2023-10-03 22:35:04,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:35:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:35:07,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:35:08,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 22:35:09,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 22:35:09,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:35:10,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:35:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:35:22,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:35:22,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:35:23,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 22:35:26,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:35:28,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 22:35:28,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:35:31,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:35:32,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:35:34,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:35:34,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:35:36,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:35:36,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:35:37,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:35:38,773 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.010e+02 2.255e+02 2.586e+02 3.703e+02, threshold=4.510e+02, percent-clipped=0.0 2023-10-03 22:35:40,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:35:43,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:35:44,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:35:46,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:35:49,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 22:35:51,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1430440.0, ans=0.125 2023-10-03 22:35:55,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:35:55,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 22:36:01,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:36:01,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:36:03,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:36:07,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 22:36:09,759 INFO [train.py:1046] (1/4) Epoch 41, batch 2100, loss[loss=0.1518, simple_loss=0.2232, pruned_loss=0.0402, over 17512.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2349, pruned_loss=0.03836, over 4711756.44 frames. ], batch size: 38, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:36:09,870 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 22:36:09,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:11,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:36:11,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:36:14,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:36:14,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 22:36:14,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 22:36:15,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:36:18,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:36:19,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:36:23,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:23,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:36:23,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 22:36:24,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:36:25,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-10-03 22:36:25,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 22:36:27,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:27,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:36:27,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 22:36:27,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 22:36:28,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1430640.0, ans=0.0 2023-10-03 22:36:32,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1430640.0, ans=0.125 2023-10-03 22:36:33,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 22:36:33,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:36:36,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:36:36,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:36:40,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 22:36:40,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 22:36:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:41,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:36:43,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 22:36:43,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:43,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 22:36:43,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 22:36:45,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 22:36:47,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:36:49,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:36:51,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:36:53,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-03 22:36:54,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:55,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:55,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 22:36:55,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:55,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:57,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 22:36:58,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 22:37:00,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 22:37:03,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:37:07,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:37:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-10-03 22:37:12,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:37:15,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:37:15,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:37:15,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 22:37:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:37:17,497 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.83 vs. limit=15.0 2023-10-03 22:37:17,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:17,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:37:17,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:37:18,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:21,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-10-03 22:37:22,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 22:37:22,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:23,830 INFO [train.py:1046] (1/4) Epoch 41, batch 2150, loss[loss=0.148, simple_loss=0.2264, pruned_loss=0.03485, over 24449.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2345, pruned_loss=0.03813, over 4705996.77 frames. ], batch size: 58, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:37:26,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:37:26,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:37:26,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:37:26,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:37:31,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:37:34,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:34,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:36,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 22:37:37,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:37,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1430973.3333333333, ans=0.0 2023-10-03 22:37:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:37:42,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:42,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:37:42,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:37:47,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:47,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 22:37:50,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1430973.3333333333, ans=0.0 2023-10-03 22:37:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:37:53,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:37:53,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:37:54,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:37:54,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:55,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:37:56,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:56,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:37:57,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:57,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 22:37:58,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:38:00,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:00,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:01,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:38:02,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:38:04,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1431040.0, ans=0.1 2023-10-03 22:38:05,961 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.931e+02 2.155e+02 2.450e+02 3.696e+02, threshold=4.310e+02, percent-clipped=0.0 2023-10-03 22:38:06,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:06,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:38:07,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:07,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 22:38:07,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:38:08,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.47 vs. limit=22.5 2023-10-03 22:38:12,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:38:12,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:13,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:38:15,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:38:15,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:18,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:18,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 22:38:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 22:38:19,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:38:21,028 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 22:38:22,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:22,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:38:23,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 22:38:23,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:38:23,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-10-03 22:38:23,853 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 22:38:23,853 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 22:38:23,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 22:38:25,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:25,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:38:26,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:38:27,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:27,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:38:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:29,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:37,239 INFO [train.py:1046] (1/4) Epoch 41, batch 2200, loss[loss=0.1587, simple_loss=0.2387, pruned_loss=0.03939, over 23560.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2344, pruned_loss=0.03798, over 4714866.84 frames. 
], batch size: 149, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:38:37,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:38:38,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 22:38:41,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:38:43,026 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.45 vs. limit=15.0 2023-10-03 22:38:45,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:45,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:38:46,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:47,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:38:47,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1431240.0, ans=0.035 2023-10-03 22:38:49,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:50,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 22:38:50,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 22:38:56,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 22:38:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:39:02,509 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:39:02,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1431306.6666666667, ans=0.0 2023-10-03 22:39:04,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 22:39:07,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:07,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:39:07,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:39:11,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:39:11,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 22:39:15,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. 
Duration: 0.688875 2023-10-03 22:39:17,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:17,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 22:39:20,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:39:21,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:39:23,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:39:23,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:26,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 22:39:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:28,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 22:39:30,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:30,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:39:30,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:39:33,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:39:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:39:35,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:35,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:36,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:39:36,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:39:38,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:39:41,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:39:41,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:39:44,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:39:44,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 22:39:45,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 22:39:47,302 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 22:39:47,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1431506.6666666667, ans=0.0 2023-10-03 22:39:49,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:39:49,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 22:39:52,446 INFO [train.py:1046] (1/4) Epoch 41, batch 2250, loss[loss=0.1565, simple_loss=0.2298, pruned_loss=0.04162, over 23437.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2352, pruned_loss=0.03819, over 4710543.50 frames. ], batch size: 285, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:39:52,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:52,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:39:53,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:56,514 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 22:39:56,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:39:59,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. 
Duration: 0.763625 2023-10-03 22:40:03,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:40:05,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1431640.0, ans=0.2 2023-10-03 22:40:06,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:40:09,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:09,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:40:10,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:40:14,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 22:40:14,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:40:14,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:40:15,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 22:40:16,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:40:16,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 22:40:18,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:40:23,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:40:24,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:40:24,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:40:26,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 22:40:27,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:28,174 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.02 vs. limit=15.0 2023-10-03 22:40:28,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:40:33,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:40:34,423 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.897e+02 2.082e+02 2.416e+02 4.175e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-03 22:40:34,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:40:36,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:40:37,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:40:39,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:40:40,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:40:45,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1431773.3333333333, ans=0.0 2023-10-03 22:40:46,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:40:48,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:40:54,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:40:54,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:40:54,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:40:56,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1431840.0, ans=0.1 2023-10-03 22:40:58,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-03 22:41:01,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:41:01,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 22:41:02,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:02,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:41:04,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 22:41:05,698 INFO [train.py:1046] (1/4) Epoch 41, batch 2300, loss[loss=0.16, simple_loss=0.2377, pruned_loss=0.04117, over 23542.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.0386, over 4707617.93 frames. ], batch size: 120, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:41:07,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:41:07,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:14,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:14,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:41:17,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-10-03 22:41:19,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:19,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=1431973.3333333333, ans=0.025 2023-10-03 22:41:26,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:41:26,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:41:26,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:41:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:26,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 22:41:28,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:41:31,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:41:31,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:41:34,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:41:37,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 22:41:38,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1432040.0, ans=0.025 2023-10-03 22:41:41,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:41:41,470 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:41:46,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1432040.0, ans=0.2 2023-10-03 22:41:47,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:41:47,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:49,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:41:50,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:54,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:41:55,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:41:55,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:41:55,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. 
Duration: 0.47 2023-10-03 22:41:55,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1432106.6666666667, ans=0.07 2023-10-03 22:41:59,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:42:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:01,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:01,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:42:01,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:42:02,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 22:42:02,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:42:04,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 22:42:04,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:42:04,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:04,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-10-03 22:42:11,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:42:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:42:17,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:42:17,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:42:17,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:42:20,572 INFO [train.py:1046] (1/4) Epoch 41, batch 2350, loss[loss=0.1558, simple_loss=0.2342, pruned_loss=0.03867, over 23565.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2364, pruned_loss=0.0386, over 4713563.86 frames. ], batch size: 134, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:42:20,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:42:20,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:42:21,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:42:22,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 22:42:29,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 22:42:29,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 22:42:34,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 22:42:36,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:39,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:39,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:40,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:42:40,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:42:42,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 22:42:45,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:42:47,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1432306.6666666667, ans=0.0 2023-10-03 22:42:51,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 22:42:52,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 22:42:53,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1432373.3333333333, ans=0.07 2023-10-03 22:42:55,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:42:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:42:58,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:42:58,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 22:43:00,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:43:03,363 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.889e+02 2.078e+02 2.250e+02 3.278e+02, threshold=4.156e+02, percent-clipped=0.0 2023-10-03 22:43:03,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:43:03,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:43:03,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:43:04,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 22:43:07,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 22:43:07,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:43:09,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1432440.0, ans=0.125 2023-10-03 22:43:10,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:43:10,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:43:13,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 22:43:13,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:43:16,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 22:43:16,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:43:17,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1432440.0, ans=0.125 2023-10-03 22:43:22,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 22:43:23,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. 
Duration: 0.73 2023-10-03 22:43:24,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1432506.6666666667, ans=0.125 2023-10-03 22:43:25,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:43:25,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:43:25,179 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 22:43:25,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1432506.6666666667, ans=0.2 2023-10-03 22:43:26,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 22:43:28,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1432506.6666666667, ans=0.125 2023-10-03 22:43:29,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 22:43:31,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:43:33,788 INFO [train.py:1046] (1/4) Epoch 41, batch 2400, loss[loss=0.1475, simple_loss=0.2335, pruned_loss=0.03072, over 24009.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2362, pruned_loss=0.0386, over 4707158.86 frames. ], batch size: 86, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:43:35,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-03 22:43:38,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:43:38,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:43:39,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 22:43:39,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 22:43:46,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:43:46,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:43:47,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 22:43:48,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:43:49,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:43:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 22:43:56,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:43:59,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-03 22:44:03,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:44:08,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 22:44:10,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:44:11,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:15,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:44:15,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 22:44:15,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:44:24,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:27,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:44:30,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:44:30,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:44:30,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 22:44:30,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:44:30,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:31,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:44:31,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:44:37,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:44:37,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:44:37,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 22:44:38,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 22:44:40,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:44:40,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:41,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 22:44:41,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. 
Duration: 0.91 2023-10-03 22:44:41,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 22:44:41,890 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 22:44:43,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 22:44:45,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:44:47,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:47,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:44:48,327 INFO [train.py:1046] (1/4) Epoch 41, batch 2450, loss[loss=0.1671, simple_loss=0.2574, pruned_loss=0.03835, over 24357.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2353, pruned_loss=0.03849, over 4685620.31 frames. ], batch size: 77, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:44:48,457 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 22:44:49,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:50,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1432906.6666666667, ans=0.125 2023-10-03 22:44:51,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 22:44:54,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:44:54,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:44:56,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1432906.6666666667, ans=0.0 2023-10-03 22:44:58,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:44:58,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:44:58,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 22:45:04,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:45:04,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:08,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:45:08,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:45:08,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:45:10,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. 
Duration: 0.77 2023-10-03 22:45:12,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:14,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:45:14,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:45:17,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:45:19,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:19,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:20,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:45:22,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 22:45:24,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:45:32,399 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.017e+02 2.173e+02 2.483e+02 3.583e+02, threshold=4.346e+02, percent-clipped=0.0 2023-10-03 22:45:32,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:33,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:45:33,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:45:33,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:45:33,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:35,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:45:35,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 22:45:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:40,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:45:42,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:45:42,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:45:42,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1433106.6666666667, ans=0.0 2023-10-03 22:45:45,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1433173.3333333333, ans=0.0 2023-10-03 22:45:47,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 22:45:47,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 22:45:48,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:45:50,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:45:50,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 22:45:50,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:45:52,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:45:55,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:45:56,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:58,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:46:01,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 22:46:02,347 INFO [train.py:1046] (1/4) Epoch 41, batch 2500, loss[loss=0.1405, simple_loss=0.2174, pruned_loss=0.03178, over 24308.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2345, pruned_loss=0.03829, over 4681362.79 frames. 
], batch size: 56, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:46:03,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:46:05,932 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.16 vs. limit=12.0 2023-10-03 22:46:07,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:46:08,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1433240.0, ans=0.0 2023-10-03 22:46:17,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:46:17,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:46:19,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:46:19,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 22:46:26,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:46:26,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:46:29,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. 
Duration: 0.57275 2023-10-03 22:46:29,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 22:46:29,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 22:46:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:30,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:46:32,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 22:46:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:33,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 22:46:33,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:33,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1433373.3333333333, ans=0.0 2023-10-03 22:46:36,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:46:37,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:46:39,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-03 22:46:40,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 22:46:40,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:46:42,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:46,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1433440.0, ans=0.0 2023-10-03 22:46:49,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1433440.0, ans=0.0 2023-10-03 22:46:50,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:53,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:46:53,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1433440.0, ans=0.125 2023-10-03 22:46:54,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1433440.0, ans=0.125 2023-10-03 22:46:57,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:47:00,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. 
Duration: 0.91 2023-10-03 22:47:00,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:47:00,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:47:02,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:47:02,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:47:04,696 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 22:47:04,697 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 22:47:04,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 22:47:06,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:47:07,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 22:47:07,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 22:47:08,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:47:10,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. 
Duration: 0.76 2023-10-03 22:47:10,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1433506.6666666667, ans=0.0 2023-10-03 22:47:13,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 22:47:16,702 INFO [train.py:1046] (1/4) Epoch 41, batch 2550, loss[loss=0.1351, simple_loss=0.2194, pruned_loss=0.02536, over 24513.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2355, pruned_loss=0.03826, over 4688906.21 frames. ], batch size: 63, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:47:18,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:47:20,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:47:20,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:47:21,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:47:21,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 22:47:22,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:47:26,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 22:47:28,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 22:47:30,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:30,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1433640.0, ans=15.0 2023-10-03 22:47:33,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:47:33,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 22:47:33,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:47:33,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:47:34,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:47:37,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:47:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 22:47:37,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:47:37,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:37,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-03 22:47:49,376 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.91 vs. limit=15.0 2023-10-03 22:47:49,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:47:53,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:47:53,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:53,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:47:55,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:48:00,988 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.004e+02 2.232e+02 2.565e+02 3.401e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-03 22:48:01,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:48:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:48:04,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1433773.3333333333, ans=0.125 2023-10-03 22:48:05,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 22:48:05,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:48:06,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:48:06,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:48:09,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:48:09,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:48:15,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:48:15,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 22:48:15,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:48:15,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:48:16,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:48:17,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:48:19,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 22:48:25,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:48:26,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1433840.0, ans=0.2 2023-10-03 22:48:27,346 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.45 vs. limit=22.5 2023-10-03 22:48:27,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:29,510 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 22:48:30,740 INFO [train.py:1046] (1/4) Epoch 41, batch 2600, loss[loss=0.1641, simple_loss=0.2354, pruned_loss=0.04644, over 23449.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2353, pruned_loss=0.03802, over 4689473.50 frames. ], batch size: 285, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:48:33,544 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 22:48:33,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:48:33,608 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 22:48:35,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 22:48:35,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. 
Duration: 0.768 2023-10-03 22:48:38,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1433906.6666666667, ans=0.0 2023-10-03 22:48:38,485 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:48:39,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:48:39,525 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 22:48:42,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 22:48:42,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0 2023-10-03 22:48:43,552 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 22:48:44,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:48:46,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 22:48:48,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 22:48:49,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:48:49,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 22:48:51,781 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-10-03 22:48:52,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 22:48:58,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:48:59,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:59,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:48:59,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 22:49:02,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:49:06,670 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 22:49:11,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:49:11,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:12,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 22:49:12,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:49:12,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:49:14,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 22:49:17,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:49:17,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:49:18,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:22,069 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 22:49:22,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:23,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:49:27,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:49:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:49:28,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 22:49:30,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:49:32,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 22:49:34,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:49:38,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 22:49:38,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:40,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:49:43,105 INFO [train.py:1046] (1/4) Epoch 41, batch 2650, loss[loss=0.1583, simple_loss=0.2478, pruned_loss=0.03444, over 24428.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2368, pruned_loss=0.03866, over 4700680.19 frames. ], batch size: 69, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 22:49:45,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 22:49:45,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:46,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:49:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 22:49:47,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:49:48,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:49:51,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:49:53,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:49:53,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1434240.0, ans=0.0 2023-10-03 22:49:54,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:55,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 22:49:55,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:49:57,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:49:59,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 22:50:01,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 22:50:04,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:05,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 22:50:06,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:50:07,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1434306.6666666667, ans=0.125 2023-10-03 22:50:08,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 22:50:11,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:11,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:50:13,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:13,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:17,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 22:50:17,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 22:50:22,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:50:22,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1434373.3333333333, ans=0.125 2023-10-03 22:50:26,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 22:50:26,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:50:26,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:27,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:50:27,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:50:29,235 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.964e+02 2.149e+02 2.506e+02 3.247e+02, threshold=4.298e+02, percent-clipped=0.0 2023-10-03 22:50:29,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:32,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:50:33,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:50:33,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:50:33,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:50:33,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1434440.0, ans=0.125 2023-10-03 22:50:34,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-10-03 22:50:34,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:36,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:50:37,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:39,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:50:39,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:50:43,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:45,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:50:45,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:45,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 22:50:45,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1434506.6666666667, ans=0.125 2023-10-03 22:50:47,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:50,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:50:51,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:53,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:50:53,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:50:54,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:50:56,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:50:56,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 22:50:57,394 INFO [train.py:1046] (1/4) Epoch 41, batch 2700, loss[loss=0.1644, simple_loss=0.2484, pruned_loss=0.04018, over 24032.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2382, pruned_loss=0.03884, over 4697480.74 frames. ], batch size: 80, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:50:57,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1434573.3333333333, ans=0.125 2023-10-03 22:50:58,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:50:58,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-03 22:50:59,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1434573.3333333333, ans=0.1 2023-10-03 22:51:01,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:51:01,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:01,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:03,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:51:03,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:51:03,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:51:03,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:51:04,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 22:51:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:51:07,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:51:08,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 22:51:08,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:51:13,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:51:13,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 22:51:13,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:51:19,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:51:19,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:51:26,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:51:26,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:51:26,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:51:27,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 22:51:27,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1434706.6666666667, ans=0.125 2023-10-03 22:51:29,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:51:32,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:51:32,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:51:32,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:51:36,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:36,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:51:45,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:51:45,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:51:50,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:51:50,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:51:53,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:54,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:51:55,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:51:56,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:51:58,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:58,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:52:01,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:52:02,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:52:02,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:52:05,446 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:52:06,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 22:52:07,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:52:10,560 INFO [train.py:1046] (1/4) Epoch 41, batch 2750, loss[loss=0.1504, simple_loss=0.2362, pruned_loss=0.03227, over 24451.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2376, pruned_loss=0.03844, over 4707436.57 frames. ], batch size: 66, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 22:52:10,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:52:10,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 22:52:13,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 22:52:13,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:15,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:16,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:52:18,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:18,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:52:19,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 22:52:21,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1434906.6666666667, ans=0.125 2023-10-03 22:52:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:52:24,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:52:24,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:52:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:24,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 22:52:25,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:52:25,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:30,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 22:52:33,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:52:33,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:33,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 22:52:33,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:52:34,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:52:36,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:52:37,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:37,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:41,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:52:41,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:52:41,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:52:44,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:46,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:52:51,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:52:53,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:52:53,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:52:57,726 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 1.945e+02 2.112e+02 2.405e+02 3.716e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 22:52:59,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:59,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:52:59,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:53:04,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:53:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:53:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 22:53:07,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1435106.6666666667, ans=0.1 2023-10-03 22:53:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 22:53:10,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 22:53:14,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:53:17,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:53:17,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 22:53:17,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:53:20,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:53:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 22:53:21,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:53:23,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 22:53:24,747 INFO [train.py:1046] (1/4) Epoch 41, batch 2800, loss[loss=0.1435, simple_loss=0.224, pruned_loss=0.03153, over 23627.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2365, pruned_loss=0.03811, over 4723930.44 frames. ], batch size: 149, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:53:24,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 22:53:24,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:53:25,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1435240.0, ans=0.0 2023-10-03 22:53:26,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 22:53:26,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:53:26,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:28,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:53:28,212 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 22:53:28,213 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 22:53:32,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:35,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:53:35,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:53:38,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 22:53:41,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 22:53:42,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:53:44,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 22:53:45,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:45,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:53:45,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:53:49,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:53:49,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:49,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:53:49,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1435306.6666666667, ans=0.1 2023-10-03 22:53:50,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 22:53:52,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1435373.3333333333, ans=0.125 2023-10-03 22:53:58,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:54:00,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:54:02,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:03,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:54:03,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:09,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:54:09,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 22:54:10,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:10,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:54:10,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 22:54:10,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1435440.0, ans=0.125 2023-10-03 22:54:14,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:16,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:19,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:54:20,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:54:20,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:20,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:54:21,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:54:21,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:54:23,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:54:23,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 22:54:23,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 22:54:25,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:54:25,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:54:28,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 22:54:28,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:28,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:54:29,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:54:31,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 22:54:36,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:54:36,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:54:36,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:54:38,149 INFO [train.py:1046] (1/4) Epoch 41, batch 2850, loss[loss=0.1665, simple_loss=0.2576, pruned_loss=0.0377, over 24339.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03797, over 4731509.75 frames. 
], batch size: 74, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:54:40,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:54:43,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1435573.3333333333, ans=0.125 2023-10-03 22:54:44,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:54:44,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:54:44,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:48,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:48,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1435573.3333333333, ans=0.0 2023-10-03 22:54:49,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:51,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:54:51,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 22:54:57,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. 
Duration: 0.43 2023-10-03 22:54:57,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:54:59,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 22:55:00,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:03,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 22:55:03,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 22:55:04,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:17,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:55:18,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:55:18,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:55:19,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:55:19,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:55:19,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-10-03 22:55:22,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:55:22,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 22:55:23,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:55:23,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:55:25,197 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.888e+02 2.036e+02 2.281e+02 3.028e+02, threshold=4.073e+02, percent-clipped=0.0 2023-10-03 22:55:25,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:55:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:30,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:55:30,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:55:31,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:33,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:55:34,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 22:55:36,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:36,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:36,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1435840.0, ans=0.0 2023-10-03 22:55:39,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:55:43,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:55:45,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 22:55:45,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 22:55:47,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:55:47,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:55:48,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 22:55:48,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:55:49,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.53 vs. 
limit=10.0 2023-10-03 22:55:49,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:55:49,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:55:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:55:49,910 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 22:55:51,207 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 22:55:51,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:55:52,485 INFO [train.py:1046] (1/4) Epoch 41, batch 2900, loss[loss=0.166, simple_loss=0.2539, pruned_loss=0.03898, over 23296.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03791, over 4726292.74 frames. ], batch size: 93, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:55:52,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:56,114 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.81 vs. limit=22.5 2023-10-03 22:55:56,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:55:56,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 22:55:58,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:55:58,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 22:56:01,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1435906.6666666667, ans=0.0 2023-10-03 22:56:03,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:56:03,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 22:56:04,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 22:56:05,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:56:06,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:56:07,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:56:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:56:09,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1435973.3333333333, ans=0.125 2023-10-03 22:56:10,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 22:56:11,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:56:15,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:56:15,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 22:56:17,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:56:18,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:20,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 22:56:20,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 22:56:24,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:56:24,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 22:56:24,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:56:26,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:56:26,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. 
Duration: 0.67775 2023-10-03 22:56:28,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:56:30,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:33,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:56:35,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:56:35,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1436040.0, ans=0.0 2023-10-03 22:56:37,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 22:56:37,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 22:56:37,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:56:41,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:56:43,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 22:56:44,435 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.31 vs. limit=15.0 2023-10-03 22:56:45,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 22:56:45,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1436106.6666666667, ans=0.0 2023-10-03 22:56:50,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:56,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1436173.3333333333, ans=0.1 2023-10-03 22:56:58,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1436173.3333333333, ans=0.125 2023-10-03 22:56:59,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:56:59,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:57:00,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 22:57:02,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1436173.3333333333, ans=0.125 2023-10-03 22:57:03,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:03,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 22:57:03,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 22:57:03,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:57:06,729 INFO [train.py:1046] (1/4) Epoch 41, batch 2950, loss[loss=0.1803, simple_loss=0.2527, pruned_loss=0.05398, over 23838.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2367, pruned_loss=0.03765, over 4739821.34 frames. ], batch size: 179, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:57:08,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:57:09,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 22:57:09,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:57:10,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:11,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:13,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:57:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 22:57:14,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 22:57:16,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-03 22:57:16,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:57:17,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1436240.0, ans=0.125 2023-10-03 22:57:22,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:57:25,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:57:27,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:57:28,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:57:30,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:57:30,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:57:32,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:33,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:33,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:57:36,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. 
Duration: 0.86 2023-10-03 22:57:39,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1436373.3333333333, ans=0.125 2023-10-03 22:57:41,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 22:57:41,639 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 22:57:43,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:57:43,299 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:57:45,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 22:57:46,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 22:57:46,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:57:47,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:57:47,742 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 22:57:47,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:57:49,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. 
Duration: 0.98 2023-10-03 22:57:50,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:57:50,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:57:50,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1436440.0, ans=0.125 2023-10-03 22:57:53,089 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.859e+02 2.046e+02 2.288e+02 3.221e+02, threshold=4.092e+02, percent-clipped=0.0 2023-10-03 22:57:53,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:54,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:57:54,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:57:56,093 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 22:57:56,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:57,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 22:58:02,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:58:03,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. 
Duration: 0.888875 2023-10-03 22:58:05,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 22:58:05,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:58:06,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 22:58:09,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:58:10,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:58:10,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:58:12,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:58:12,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 22:58:14,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:58:15,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:15,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:58:15,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 22:58:15,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:58:17,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:58:18,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1436573.3333333333, ans=0.09899494936611666 2023-10-03 22:58:19,924 INFO [train.py:1046] (1/4) Epoch 41, batch 3000, loss[loss=0.1782, simple_loss=0.2668, pruned_loss=0.04481, over 23986.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2369, pruned_loss=0.03815, over 4731143.10 frames. ], batch size: 80, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:58:19,925 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 22:58:32,320 INFO [train.py:1078] (1/4) Epoch 41, validation: loss=0.3725, simple_loss=0.2818, pruned_loss=0.2316, over 1125622.00 frames. 2023-10-03 22:58:32,320 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 20896MB 2023-10-03 22:58:32,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:32,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 22:58:33,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:36,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:58:37,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. 
Duration: 0.711125 2023-10-03 22:58:40,796 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 22:58:40,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 22:58:42,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:58:42,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:58:43,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 22:58:45,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:58:49,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:58:57,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1436640.0, ans=0.1 2023-10-03 22:58:57,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=15.0 2023-10-03 22:58:58,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:59:04,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 22:59:06,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 22:59:08,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:59:08,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:59:10,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:59:11,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:59:11,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 22:59:14,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 22:59:18,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:59:18,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:59:20,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:59:20,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:59:20,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:20,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 22:59:23,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:59:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:59:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:59:25,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:59:25,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1436773.3333333333, ans=0.0 2023-10-03 22:59:27,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 22:59:29,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:59:29,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:29,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:59:33,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:33,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:35,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. 
Duration: 0.488875 2023-10-03 22:59:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 22:59:36,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:59:37,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 22:59:37,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:59:38,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 22:59:41,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:59:42,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 22:59:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 22:59:44,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 22:59:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:59:44,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:59:45,746 INFO [train.py:1046] (1/4) Epoch 41, batch 3050, loss[loss=0.17, simple_loss=0.248, pruned_loss=0.04601, over 20001.00 frames. 
], tot_loss[loss=0.1567, simple_loss=0.2369, pruned_loss=0.03824, over 4727769.92 frames. ], batch size: 43, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:59:47,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:47,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:59:47,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:47,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:59:49,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 22:59:52,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:59:54,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:59:54,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:59:57,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:59,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 23:00:04,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-03 23:00:06,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 23:00:06,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:10,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:00:13,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:13,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:00:15,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:19,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:00:19,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:00:20,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:20,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:00:20,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:21,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:00:23,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:25,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:25,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 23:00:25,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:25,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:00:30,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:00:31,142 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.945e+02 2.143e+02 2.357e+02 3.381e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 23:00:31,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:00:31,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:00:32,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:37,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 23:00:37,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:43,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:44,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:00:44,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:46,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:00:46,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:00:48,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:00:49,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 23:00:49,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:00:49,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:52,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 23:00:53,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-03 23:00:57,649 INFO [train.py:1046] (1/4) Epoch 41, batch 3100, loss[loss=0.156, simple_loss=0.2353, pruned_loss=0.03833, over 23329.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2363, pruned_loss=0.03839, over 4718362.26 frames. ], batch size: 119, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:00:57,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:57,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:00:59,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:01:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 23:01:05,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 23:01:06,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 23:01:08,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:01:11,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:01:11,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:14,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 23:01:17,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:23,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 23:01:27,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:01:27,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:27,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:01:28,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:01:28,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 23:01:31,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:01:31,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 23:01:31,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:01:32,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:33,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-10-03 23:01:34,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:01:36,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1437373.3333333333, ans=0.1 2023-10-03 23:01:39,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:01:41,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 23:01:42,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.25 vs. limit=15.0 2023-10-03 23:01:42,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 23:01:42,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:45,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:01:45,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:47,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 23:01:47,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:01:47,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:01:50,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:01:51,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:01:51,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:51,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:01:55,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:01:57,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 23:02:00,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:02:01,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 23:02:01,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:01,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:02:01,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 23:02:12,104 INFO [train.py:1046] (1/4) Epoch 41, batch 3150, loss[loss=0.1495, simple_loss=0.2255, pruned_loss=0.03673, over 23779.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2347, pruned_loss=0.0382, over 4703212.82 frames. ], batch size: 212, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:02:12,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 23:02:14,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:15,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:15,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:02:15,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:02:17,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 23:02:18,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:18,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:02:20,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-10-03 23:02:21,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:24,511 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 23:02:27,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 23:02:27,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:02:28,526 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 23:02:28,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:02:29,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 23:02:30,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 23:02:30,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 23:02:30,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:30,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:02:31,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:02:34,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 23:02:34,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1437640.0, ans=0.125 2023-10-03 23:02:36,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:37,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:37,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:02:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:02:43,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 23:02:43,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:02:46,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:02:48,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:02:49,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-10-03 23:02:51,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1437706.6666666667, ans=0.04949747468305833 2023-10-03 23:02:52,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 23:02:52,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:02:52,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:02:52,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:02:52,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:52,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:02:55,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:02:55,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:02:56,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 23:02:56,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:02:56,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:02:57,960 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.71 vs. limit=22.5 2023-10-03 23:02:58,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:02:59,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-10-03 23:02:59,589 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.918e+02 2.167e+02 2.429e+02 3.852e+02, threshold=4.335e+02, percent-clipped=0.0 2023-10-03 23:02:59,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:02:59,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1437773.3333333333, ans=0.2 2023-10-03 23:03:00,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 23:03:01,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:02,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 23:03:02,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:03,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. 
Duration: 0.86 2023-10-03 23:03:05,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 23:03:06,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:03:06,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:08,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 23:03:09,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 23:03:09,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:03:11,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:03:13,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:13,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:03:19,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:03:20,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:22,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-10-03 23:03:26,368 INFO [train.py:1046] (1/4) Epoch 41, batch 3200, loss[loss=0.1433, simple_loss=0.2194, pruned_loss=0.03365, over 20218.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2344, pruned_loss=0.03775, over 4705958.58 frames. ], batch size: 44, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:03:27,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:03:27,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 23:03:30,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:30,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:03:30,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 23:03:33,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:37,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1437906.6666666667, ans=0.125 2023-10-03 23:03:38,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:03:42,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:51,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 23:03:53,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1437973.3333333333, ans=0.2 2023-10-03 23:03:58,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 23:04:00,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:04:02,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.95 vs. limit=15.0 2023-10-03 23:04:03,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.07 vs. limit=15.0 2023-10-03 23:04:04,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 23:04:04,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:04:08,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:04:08,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:04:08,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:04:13,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 23:04:13,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. 
Duration: 0.5 2023-10-03 23:04:17,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1438106.6666666667, ans=0.0 2023-10-03 23:04:18,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 23:04:20,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 23:04:22,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:04:25,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:25,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:04:26,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:26,666 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 23:04:26,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:04:29,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:04:30,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 23:04:30,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-10-03 23:04:32,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 23:04:33,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 23:04:33,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1438173.3333333333, ans=0.125 2023-10-03 23:04:35,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:04:35,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1438173.3333333333, ans=0.0 2023-10-03 23:04:37,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:04:37,101 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 23:04:37,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:04:37,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:04:39,631 INFO [train.py:1046] (1/4) Epoch 41, batch 3250, loss[loss=0.1657, simple_loss=0.2397, pruned_loss=0.04587, over 23788.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2346, pruned_loss=0.03789, over 4711397.38 frames. ], batch size: 195, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:04:39,712 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. 
Duration: 0.64 2023-10-03 23:04:45,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1438240.0, ans=0.2 2023-10-03 23:04:46,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:04:48,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:04:55,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:04:55,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 23:04:57,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:04:57,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:57,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:04:58,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:05:00,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:05:00,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:01,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. 
Duration: 0.988875 2023-10-03 23:05:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:02,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.60 vs. limit=15.0 2023-10-03 23:05:02,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:03,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:04,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:05:07,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:09,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:05:11,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:11,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:13,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:14,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:05:14,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:05:20,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 23:05:20,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:05:21,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:05:21,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:05:22,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:05:26,868 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.962e+02 2.147e+02 2.402e+02 3.461e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-03 23:05:28,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:05:34,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:05:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. 
Duration: 0.96 2023-10-03 23:05:34,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.88 vs. limit=12.0 2023-10-03 23:05:35,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:05:35,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 23:05:35,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:38,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 23:05:40,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 23:05:40,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:05:40,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.30 vs. limit=6.0 2023-10-03 23:05:41,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:05:41,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:05:41,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. 
Duration: 0.52225 2023-10-03 23:05:41,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:05:45,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:05:45,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:05:47,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 23:05:49,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:05:51,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:05:51,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 23:05:54,109 INFO [train.py:1046] (1/4) Epoch 41, batch 3300, loss[loss=0.156, simple_loss=0.2328, pruned_loss=0.0396, over 19007.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2354, pruned_loss=0.03823, over 4716952.82 frames. ], batch size: 41, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:05:54,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:05:54,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 23:05:56,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. 
Duration: 0.84 2023-10-03 23:05:56,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 23:05:56,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:01,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:06:03,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.70 vs. limit=10.0 2023-10-03 23:06:03,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:06:03,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:06,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:06:06,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:06:07,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:10,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:06:12,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1438640.0, ans=0.125 2023-10-03 23:06:14,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. 
Duration: 0.79 2023-10-03 23:06:15,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:06:15,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:16,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:18,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 23:06:18,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:06:19,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:06:19,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:06:19,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:06:20,626 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 23:06:23,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:25,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:06:26,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:06:26,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 23:06:27,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 23:06:27,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:29,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:06:30,775 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 23:06:33,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 23:06:33,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:06:36,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 23:06:38,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:06:41,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:06:41,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:06:43,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:06:44,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:44,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:06:47,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:06:47,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:48,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:06:50,782 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 23:06:51,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1438840.0, ans=0.0 2023-10-03 23:06:52,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 23:06:55,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:06:55,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:06:55,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:06:57,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1438840.0, ans=0.0 2023-10-03 23:06:57,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1438840.0, ans=0.0 2023-10-03 23:06:58,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:58,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:07:00,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:07:00,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:00,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:07:00,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:07:01,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1438840.0, ans=0.0 2023-10-03 23:07:02,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:07:03,552 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.72 vs. limit=6.0 2023-10-03 23:07:04,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. 
Duration: 0.77 2023-10-03 23:07:04,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:05,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:07,083 INFO [train.py:1046] (1/4) Epoch 41, batch 3350, loss[loss=0.2004, simple_loss=0.2643, pruned_loss=0.06827, over 19176.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03852, over 4719493.93 frames. ], batch size: 388, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:07:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:07:08,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:07:09,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:11,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:07:11,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:14,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:07:15,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:15,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-10-03 23:07:17,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1438906.6666666667, ans=0.0 2023-10-03 23:07:18,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:20,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:07:21,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:21,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:07:25,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 23:07:25,299 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 23:07:25,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:25,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1438973.3333333333, ans=0.125 2023-10-03 23:07:28,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 23:07:29,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. 
Duration: 0.73 2023-10-03 23:07:30,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1438973.3333333333, ans=0.2 2023-10-03 23:07:31,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:07:31,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:07:32,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:32,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 23:07:32,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:32,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:07:35,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:36,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:38,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:38,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:07:41,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 23:07:42,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:42,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:07:45,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1439040.0, ans=0.125 2023-10-03 23:07:47,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:07:48,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:51,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:51,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:54,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:54,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1439106.6666666667, ans=0.125 2023-10-03 23:07:55,980 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.914e+02 2.116e+02 2.439e+02 3.109e+02, threshold=4.232e+02, percent-clipped=0.0 2023-10-03 23:07:56,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-10-03 23:07:56,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:07:57,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 23:07:57,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:07:58,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 23:08:00,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:02,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:08:06,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1439173.3333333333, ans=0.125 2023-10-03 23:08:07,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:08:07,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 23:08:09,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:08:10,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:08:12,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 23:08:15,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.11 vs. limit=15.0 2023-10-03 23:08:16,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:08:17,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 23:08:19,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:08:19,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:08:20,804 INFO [train.py:1046] (1/4) Epoch 41, batch 3400, loss[loss=0.1369, simple_loss=0.2198, pruned_loss=0.02703, over 24294.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2374, pruned_loss=0.03839, over 4720974.77 frames. ], batch size: 61, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:08:20,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:20,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 23:08:22,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:08:22,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 23:08:23,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:08:24,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:08:24,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:08:25,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:08:26,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 23:08:30,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 23:08:30,936 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 23:08:30,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:08:35,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:08:35,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:08:36,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:08:37,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 23:08:40,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1439306.6666666667, ans=0.1 2023-10-03 23:08:41,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:08:44,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 23:08:48,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:08:51,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:08:51,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:52,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:08:59,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:09:01,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1439373.3333333333, ans=0.2 2023-10-03 23:09:02,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 23:09:05,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1439440.0, ans=0.125 2023-10-03 23:09:08,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:09:08,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:09:09,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 23:09:09,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:09:11,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:09:11,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1439440.0, ans=0.0 2023-10-03 23:09:13,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:09:13,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:09:15,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:09:18,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:09:18,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:09:23,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 23:09:23,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1439506.6666666667, ans=0.125 2023-10-03 23:09:27,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 23:09:31,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:09:35,195 INFO [train.py:1046] (1/4) Epoch 41, batch 3450, loss[loss=0.1533, simple_loss=0.2364, pruned_loss=0.03506, over 23606.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2374, pruned_loss=0.03851, over 4722739.69 frames. ], batch size: 134, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 23:09:36,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 23:09:40,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 23:09:40,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:09:40,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.16 vs. limit=12.0 2023-10-03 23:09:42,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:09:42,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 23:09:44,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 23:09:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:09:51,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:09:53,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:09:54,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:09:54,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:09:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:10:02,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 23:10:06,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 23:10:06,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:10:06,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:10:06,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. 
Duration: 0.99 2023-10-03 23:10:14,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:10:17,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:10:17,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1439706.6666666667, ans=0.0 2023-10-03 23:10:18,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:10:20,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:10:20,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:10:21,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 23:10:21,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:10:23,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:10:26,227 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.900e+02 2.129e+02 2.402e+02 3.276e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 23:10:26,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:10:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-10-03 23:10:33,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:10:37,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:10:39,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:42,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:10:46,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:10:46,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1439840.0, ans=0.05 2023-10-03 23:10:48,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:10:49,474 INFO [train.py:1046] (1/4) Epoch 41, batch 3500, loss[loss=0.1474, simple_loss=0.2245, pruned_loss=0.03515, over 24318.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2359, pruned_loss=0.03837, over 4717097.50 frames. ], batch size: 56, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:10:49,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:10:52,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:10:56,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.56 vs. limit=22.5 2023-10-03 23:10:57,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:10:57,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 23:10:59,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:11:01,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:11:03,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:11:03,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 23:11:07,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:11:08,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:11:12,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:11:13,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:11:14,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. 
Duration: 0.52725 2023-10-03 23:11:14,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:15,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:11:15,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 23:11:15,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1439973.3333333333, ans=0.2 2023-10-03 23:11:17,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:18,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:11:18,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:11:23,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:23,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1440040.0, ans=0.125 2023-10-03 23:11:23,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-10-03 23:11:24,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. 
Duration: 0.97 2023-10-03 23:11:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:11:27,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:11:29,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:11:30,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:32,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:11:33,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:11:35,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 23:11:36,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 23:11:36,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 23:11:37,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:11:38,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:39,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 23:11:39,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:11:40,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1440106.6666666667, ans=0.125 2023-10-03 23:11:42,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 23:11:43,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:11:48,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:11:49,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 23:11:49,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 23:11:49,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:11:52,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:11:52,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:11:54,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:57,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. 
Duration: 0.96 2023-10-03 23:11:57,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:12:00,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:12:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 23:12:03,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 23:12:04,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:04,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:12:04,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:06,237 INFO [train.py:1046] (1/4) Epoch 41, batch 3550, loss[loss=0.1453, simple_loss=0.2281, pruned_loss=0.03121, over 24280.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2342, pruned_loss=0.03804, over 4711355.36 frames. ], batch size: 61, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:12:06,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:06,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1440240.0, ans=0.2 2023-10-03 23:12:08,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 23:12:17,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:17,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1440240.0, ans=0.125 2023-10-03 23:12:19,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 23:12:21,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:12:23,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:12:25,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:25,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1440306.6666666667, ans=0.2 2023-10-03 23:12:26,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:12:26,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:12:29,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:12:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-10-03 23:12:31,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:31,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:12:32,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:12:37,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:12:37,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:12:37,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:12:39,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:39,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:12:39,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 23:12:39,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:39,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:40,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-03 23:12:45,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:47,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:12:48,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:50,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 23:12:51,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:12:52,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 23:12:52,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:12:55,990 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 1.904e+02 2.052e+02 2.273e+02 3.106e+02, threshold=4.105e+02, percent-clipped=0.0 2023-10-03 23:12:57,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:12:57,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:13:01,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 23:13:01,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:13:09,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 23:13:09,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:14,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:13:14,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 23:13:19,411 INFO [train.py:1046] (1/4) Epoch 41, batch 3600, loss[loss=0.1691, simple_loss=0.2479, pruned_loss=0.04519, over 24445.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2345, pruned_loss=0.03787, over 4706413.55 frames. ], batch size: 63, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:13:22,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 23:13:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:13:24,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:13:25,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:26,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:13:28,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:13:30,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:13:32,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:32,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:13:33,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:13:33,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1440640.0, ans=0.025 2023-10-03 23:13:34,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:34,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 23:13:37,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:13:37,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:40,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-03 23:13:43,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:13:43,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:13:43,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1440640.0, ans=0.125 2023-10-03 23:13:44,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:13:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 23:13:44,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:13:46,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1440640.0, ans=0.0 2023-10-03 23:13:47,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:49,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:13:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:52,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:13:53,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:13:53,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 23:13:59,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:01,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:14:01,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 23:14:07,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:14:08,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1440773.3333333333, ans=0.125 2023-10-03 23:14:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:14,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:18,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:14:20,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:14:20,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 23:14:22,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-10-03 23:14:23,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 23:14:26,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:14:26,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:14:28,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 23:14:28,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:14:29,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:14:29,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:29,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 23:14:31,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 23:14:34,306 INFO [train.py:1046] (1/4) Epoch 41, batch 3650, loss[loss=0.1427, simple_loss=0.2227, pruned_loss=0.03133, over 24620.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2356, pruned_loss=0.03809, over 4716775.14 frames. ], batch size: 60, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:14:34,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:14:34,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 23:14:40,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 23:14:40,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:14:41,997 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.86 vs. limit=22.5 2023-10-03 23:14:43,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 23:14:45,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 23:14:50,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:14:50,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:14:51,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:14:53,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:14:55,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:55,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-10-03 23:14:56,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:14:56,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:14:56,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 23:14:57,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:14:59,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:14:59,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:00,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:15:01,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 23:15:04,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 23:15:06,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:15:07,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 23:15:08,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:15:08,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:15:13,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:15:15,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:15,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:15:15,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:15:16,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1441040.0, ans=0.125 2023-10-03 23:15:17,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:15:20,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:15:22,040 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:15:23,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:15:24,912 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.947e+02 2.181e+02 2.506e+02 4.142e+02, threshold=4.361e+02, percent-clipped=1.0 2023-10-03 23:15:25,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:15:25,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:15:27,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:15:29,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:29,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:15:29,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1441106.6666666667, ans=0.1 2023-10-03 23:15:31,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.48 vs. limit=10.0 2023-10-03 23:15:34,555 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.17 vs. limit=12.0 2023-10-03 23:15:35,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 23:15:38,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:15:38,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:15:39,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-03 23:15:40,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1441173.3333333333, ans=0.04949747468305833 2023-10-03 23:15:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:41,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1441173.3333333333, ans=0.1 2023-10-03 23:15:42,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:15:42,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:44,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 23:15:44,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:45,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:15:47,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:15:49,084 INFO [train.py:1046] (1/4) Epoch 41, batch 3700, loss[loss=0.1567, simple_loss=0.2372, pruned_loss=0.0381, over 23262.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2359, pruned_loss=0.03767, over 4723539.94 frames. ], batch size: 105, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:15:49,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 23:15:51,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:51,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 23:15:51,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:51,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:15:51,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:15:55,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1441240.0, ans=0.2 2023-10-03 23:15:56,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:15:58,469 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.37 vs. limit=12.0 2023-10-03 23:15:59,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:15:59,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:00,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:16:00,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:16:02,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:16:05,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:07,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.31 vs. limit=12.0 2023-10-03 23:16:08,191 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 23:16:14,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:16:14,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:16:15,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1441306.6666666667, ans=0.09899494936611666 2023-10-03 23:16:17,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:16:17,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 23:16:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:16:21,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:21,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-10-03 23:16:23,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:23,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1441373.3333333333, ans=0.1 2023-10-03 23:16:24,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:16:27,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:28,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:16:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:16:34,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:16:34,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 23:16:34,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:34,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 23:16:41,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:16:41,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-10-03 23:16:44,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:16:44,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 23:16:45,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:16:45,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:16:46,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:16:46,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:16:50,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:16:51,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 23:16:52,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 23:16:54,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:16:54,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:16:55,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. 
Duration: 0.736375 2023-10-03 23:16:56,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:16:58,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:59,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:17:01,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:03,054 INFO [train.py:1046] (1/4) Epoch 41, batch 3750, loss[loss=0.17, simple_loss=0.2444, pruned_loss=0.04781, over 22592.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2368, pruned_loss=0.03805, over 4726498.40 frames. ], batch size: 322, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:17:03,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1441573.3333333333, ans=0.125 2023-10-03 23:17:04,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 23:17:05,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 23:17:07,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:17:07,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 23:17:09,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 23:17:11,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:17:11,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1441573.3333333333, ans=0.0 2023-10-03 23:17:12,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:17:13,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:17:16,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:17:20,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:17:21,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:17:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:17:24,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.74 vs. limit=15.0 2023-10-03 23:17:25,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:17:25,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. 
Duration: 0.96 2023-10-03 23:17:27,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:17:28,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:17:28,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:17:29,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 23:17:34,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 23:17:34,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:17:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:17:38,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:17:39,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1441706.6666666667, ans=0.125 2023-10-03 23:17:42,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:17:42,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1441706.6666666667, ans=0.125 2023-10-03 23:17:45,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 23:17:49,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 23:17:52,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:52,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1441773.3333333333, ans=0.2 2023-10-03 23:17:55,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.936e+02 2.132e+02 2.338e+02 3.284e+02, threshold=4.264e+02, percent-clipped=0.0 2023-10-03 23:17:55,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1441773.3333333333, ans=0.125 2023-10-03 23:17:56,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1441773.3333333333, ans=15.0 2023-10-03 23:17:56,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:17:56,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:17:59,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 23:18:04,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:18:07,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:18:09,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:18:10,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:18:13,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:18:17,516 INFO [train.py:1046] (1/4) Epoch 41, batch 3800, loss[loss=0.162, simple_loss=0.2557, pruned_loss=0.03413, over 24684.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2375, pruned_loss=0.03783, over 4735615.11 frames. ], batch size: 73, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:18:21,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.71 vs. limit=15.0 2023-10-03 23:18:22,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:18:22,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1441906.6666666667, ans=0.0 2023-10-03 23:18:23,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1441906.6666666667, ans=0.125 2023-10-03 23:18:25,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:18:26,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 23:18:26,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 23:18:27,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:18:29,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:18:30,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:18:32,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 23:18:32,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:33,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:18:35,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:18:35,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:18:35,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:18:35,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1441973.3333333333, ans=0.1 2023-10-03 23:18:36,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 23:18:41,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 23:18:41,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:18:42,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1441973.3333333333, ans=0.0 2023-10-03 23:18:44,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:18:47,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:18:47,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:18:48,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1442040.0, ans=0.125 2023-10-03 23:18:50,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:18:50,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:18:51,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:53,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:56,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:18:56,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 23:19:00,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:19:07,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:19:13,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:19:14,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 23:19:16,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 23:19:17,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:19:18,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:19:19,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:19:21,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 23:19:23,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 23:19:24,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 23:19:24,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:25,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:19:30,865 INFO [train.py:1046] (1/4) Epoch 41, batch 3850, loss[loss=0.1597, simple_loss=0.2376, pruned_loss=0.04092, over 17228.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2367, pruned_loss=0.03768, over 4726646.34 frames. ], batch size: 37, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:19:32,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:19:32,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:19:36,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:19:39,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 23:19:39,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 23:19:41,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:43,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1442240.0, ans=0.125 2023-10-03 23:19:44,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:19:46,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:19:48,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:19:48,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 23:19:54,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1442306.6666666667, ans=0.2 2023-10-03 23:19:54,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1442306.6666666667, ans=0.05 2023-10-03 23:19:55,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:19:58,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:59,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:19:59,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:20:02,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:03,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:20:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:03,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:20:04,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:05,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:07,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:07,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:20:08,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 23:20:08,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 23:20:10,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:20:10,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1442373.3333333333, ans=0.125 2023-10-03 23:20:11,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:13,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:13,640 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:20:14,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:14,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 23:20:16,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 23:20:17,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:19,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 23:20:21,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:20:24,013 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.998e+02 2.145e+02 2.490e+02 4.261e+02, threshold=4.290e+02, percent-clipped=0.0 2023-10-03 23:20:26,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:20:28,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:31,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:32,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 23:20:34,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 23:20:37,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:38,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:42,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:20:42,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:20:42,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:43,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:43,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:20:43,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. 
Duration: 0.97 2023-10-03 23:20:45,646 INFO [train.py:1046] (1/4) Epoch 41, batch 3900, loss[loss=0.1355, simple_loss=0.2186, pruned_loss=0.02623, over 24613.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.236, pruned_loss=0.03758, over 4727704.84 frames. ], batch size: 60, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:20:45,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:20:47,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 23:20:47,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:47,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:50,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:20:50,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:51,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:20:51,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:51,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:53,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:20:53,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 23:20:54,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:56,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:20:57,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:20:59,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:21:00,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:21:03,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:21:03,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:21:04,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:21:06,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 23:21:06,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:21:08,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-10-03 23:21:08,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:21:08,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 23:21:10,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 23:21:15,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:21:16,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:21:16,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:21:16,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:22,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:21:24,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:21:25,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:21:25,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:21:27,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 23:21:32,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:21:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:21:40,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:21:42,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:21:50,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:21:51,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:52,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1442840.0, ans=0.125 2023-10-03 23:21:53,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 23:21:53,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 23:21:53,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:55,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 23:21:56,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:21:56,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1442840.0, ans=0.125 2023-10-03 23:21:58,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 23:21:59,418 INFO [train.py:1046] (1/4) Epoch 41, batch 3950, loss[loss=0.1626, simple_loss=0.2452, pruned_loss=0.03994, over 24014.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2365, pruned_loss=0.03741, over 4746037.76 frames. ], batch size: 80, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:22:02,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:22:03,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 23:22:03,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:22:07,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:22:08,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:22:11,986 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.43 vs. limit=15.0 2023-10-03 23:22:14,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 23:22:16,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 23:22:16,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 23:22:16,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 23:22:17,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:22:20,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:22:20,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:22:20,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:22:22,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 23:22:24,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1442973.3333333333, ans=0.2 2023-10-03 23:22:26,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:22:28,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:22:28,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:22:28,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 23:22:29,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:22:29,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1443040.0, ans=0.125 2023-10-03 23:22:38,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:22:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:22:42,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 23:22:47,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 23:22:47,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 23:22:47,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:22:48,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:22:51,419 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.901e+02 2.096e+02 2.372e+02 3.248e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-03 23:22:52,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.04 vs. limit=22.5 2023-10-03 23:22:56,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. 
Duration: 0.7 2023-10-03 23:22:56,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:22:57,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:22:59,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:22:59,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 23:23:02,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:23:03,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:23:04,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1443173.3333333333, ans=0.0 2023-10-03 23:23:06,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 23:23:13,719 INFO [train.py:1046] (1/4) Epoch 41, batch 4000, loss[loss=0.1629, simple_loss=0.236, pruned_loss=0.04492, over 22781.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2372, pruned_loss=0.03779, over 4721429.61 frames. ], batch size: 322, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:23:14,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1443240.0, ans=0.1 2023-10-03 23:23:15,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:23:23,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:26,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:23:27,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1443306.6666666667, ans=0.1 2023-10-03 23:23:28,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:23:28,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:28,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 23:23:30,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:23:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 23:23:30,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:23:30,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 23:23:30,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1443306.6666666667, ans=0.2 2023-10-03 23:23:33,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 23:23:35,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:23:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:23:35,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:23:35,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:23:35,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:23:38,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:23:40,370 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 23:23:41,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:23:41,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1443373.3333333333, ans=0.125 2023-10-03 23:23:42,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:23:44,976 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 23:23:46,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-03 23:23:46,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:23:53,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 23:23:54,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:23:57,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:23:58,015 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 23:24:00,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:24:00,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 23:24:00,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:24:02,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:24:03,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:24:05,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:24:05,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 23:24:05,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:24:06,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 23:24:07,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:24:07,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1443440.0, ans=0.09899494936611666 2023-10-03 23:24:08,395 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 23:24:14,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:24:15,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-10-03 23:24:15,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 23:24:17,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:24:17,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:24:17,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:24:19,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:24:19,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1443506.6666666667, ans=0.1 2023-10-03 23:24:25,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:24:26,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:24:26,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 23:24:28,131 INFO [train.py:1046] (1/4) Epoch 41, batch 4050, loss[loss=0.1709, simple_loss=0.2568, pruned_loss=0.04256, over 24380.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2372, pruned_loss=0.03791, over 4724296.07 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:24:30,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:24:30,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:24:32,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:24:32,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:24:34,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:24:35,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 23:24:39,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:24:40,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:24:41,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:24:43,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:24:46,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:24:48,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:24:50,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1443640.0, ans=0.0 2023-10-03 23:24:52,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 23:24:53,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 23:24:53,430 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 23:24:56,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 23:24:56,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1443706.6666666667, ans=0.2 2023-10-03 23:25:03,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 23:25:03,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:25:08,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:25:09,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1443706.6666666667, ans=0.0 2023-10-03 23:25:10,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1443706.6666666667, ans=0.2 2023-10-03 23:25:11,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1443773.3333333333, ans=0.125 2023-10-03 23:25:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:25:12,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:25:12,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-03 23:25:12,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1443773.3333333333, ans=0.125 2023-10-03 23:25:15,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1443773.3333333333, ans=0.1 2023-10-03 23:25:17,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:25:21,089 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.958e+02 2.133e+02 2.374e+02 3.448e+02, threshold=4.266e+02, percent-clipped=0.0 2023-10-03 23:25:21,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 23:25:21,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:25:21,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:25:22,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 23:25:26,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:25:29,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1443840.0, ans=0.125 2023-10-03 23:25:33,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. 
Duration: 0.85 2023-10-03 23:25:35,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:25:35,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:25:37,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 23:25:37,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 23:25:37,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:25:39,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:25:39,946 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:25:40,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:40,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:25:42,259 INFO [train.py:1046] (1/4) Epoch 41, batch 4100, loss[loss=0.1555, simple_loss=0.2337, pruned_loss=0.03864, over 23649.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2385, pruned_loss=0.03873, over 4724491.47 frames. ], batch size: 149, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:25:48,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. 
Duration: 0.5 2023-10-03 23:25:49,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 23:25:52,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 23:25:52,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1443906.6666666667, ans=0.125 2023-10-03 23:25:53,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 23:25:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:25:53,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:53,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:55,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:25:55,311 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 23:25:58,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:25:58,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:25:58,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 23:26:00,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:26:04,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:26:05,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:26:05,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:26:05,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 23:26:07,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:26:07,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:26:07,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:26:07,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:26:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 23:26:13,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:13,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. 
Duration: 0.56 2023-10-03 23:26:16,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:26:18,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:26:18,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 23:26:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:26:20,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:26:20,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:26:23,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 23:26:23,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:26:24,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:26:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 23:26:26,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:26:28,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. 
Duration: 0.8 2023-10-03 23:26:30,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:33,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1444106.6666666667, ans=0.0 2023-10-03 23:26:35,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:26:38,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:26:38,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:26:47,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:26:47,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:47,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1444173.3333333333, ans=0.0 2023-10-03 23:26:49,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.90 vs. limit=15.0 2023-10-03 23:26:50,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:26:53,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-03 23:26:56,965 INFO [train.py:1046] (1/4) Epoch 41, batch 4150, loss[loss=0.1542, simple_loss=0.2451, pruned_loss=0.03159, over 24281.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2385, pruned_loss=0.03837, over 4731521.03 frames. ], batch size: 74, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:26:57,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:26:57,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1444240.0, ans=0.125 2023-10-03 23:26:58,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:27:00,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:27:00,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:27:03,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 23:27:03,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:27:03,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 23:27:04,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 23:27:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. 
Duration: 0.89 2023-10-03 23:27:06,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:27:11,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:27:11,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:27:12,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1444306.6666666667, ans=0.1 2023-10-03 23:27:15,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:15,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:27:17,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:27:19,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:27:19,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:27:20,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:27:23,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:27:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:27:28,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 23:27:30,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 23:27:30,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:27:32,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 23:27:32,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:27:32,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:27:35,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:36,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:40,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 23:27:40,431 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:27:42,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 23:27:44,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1444440.0, ans=0.125 2023-10-03 23:27:45,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:27:45,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 23:27:45,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:27:46,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 23:27:48,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:27:49,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.998e+02 2.194e+02 2.506e+02 4.254e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-03 23:27:49,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:27:51,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:51,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 23:27:51,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:27:51,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. 
Duration: 0.463625 2023-10-03 23:27:55,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:27:56,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 23:27:57,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:57,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:27:57,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:27:58,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 23:27:59,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:59,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:28:00,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:28:02,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:28:02,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 23:28:03,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-10-03 23:28:08,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:28:11,385 INFO [train.py:1046] (1/4) Epoch 41, batch 4200, loss[loss=0.1416, simple_loss=0.2076, pruned_loss=0.03779, over 23480.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2371, pruned_loss=0.03789, over 4733498.41 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:28:11,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 23:28:12,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:28:14,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:28:15,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:28:17,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:28:17,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:28:21,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 23:28:24,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 23:28:24,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:28:27,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:28:29,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:28:33,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:28:33,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:28:34,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:35,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 23:28:35,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:28:36,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:36,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:28:36,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:28:37,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.72 vs. 
limit=15.0 2023-10-03 23:28:38,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:28:40,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 23:28:40,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:45,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:28:46,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:28:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:28:48,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:28:49,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:28:49,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 23:28:49,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:28:51,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 23:28:51,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1444706.6666666667, ans=0.0 2023-10-03 23:28:56,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:28:58,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:29:03,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:29:03,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1444773.3333333333, ans=0.125 2023-10-03 23:29:06,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 23:29:09,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:29:14,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1444840.0, ans=0.125 2023-10-03 23:29:15,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:29:15,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:16,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 23:29:23,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. 
Duration: 0.563625 2023-10-03 23:29:24,832 INFO [train.py:1046] (1/4) Epoch 41, batch 4250, loss[loss=0.1631, simple_loss=0.2406, pruned_loss=0.04276, over 24510.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2351, pruned_loss=0.03777, over 4722396.09 frames. ], batch size: 63, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:29:25,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1444906.6666666667, ans=0.04949747468305833 2023-10-03 23:29:28,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:29:28,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:29:29,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:35,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:29:36,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 23:29:36,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:29:39,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:42,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:29:44,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1444973.3333333333, ans=0.0 2023-10-03 23:29:47,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:47,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:48,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1444973.3333333333, ans=0.05 2023-10-03 23:29:49,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:29:49,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:29:51,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:51,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:53,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:53,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:29:55,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.60 vs. 
limit=15.0 2023-10-03 23:29:56,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:29:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 23:29:59,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 23:29:59,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:59,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:01,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:30:01,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:30:01,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:01,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:30:03,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1445040.0, ans=0.2 2023-10-03 23:30:04,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:30:04,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-10-03 23:30:08,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:30:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:10,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 23:30:10,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:30:12,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 23:30:12,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1445106.6666666667, ans=0.125 2023-10-03 23:30:13,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:30:14,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:30:14,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:16,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:30:17,405 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.864e+02 1.993e+02 2.259e+02 3.017e+02, threshold=3.986e+02, percent-clipped=0.0 2023-10-03 23:30:17,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. 
Duration: 0.96 2023-10-03 23:30:19,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:30:20,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:30:25,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:25,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1445173.3333333333, ans=0.125 2023-10-03 23:30:27,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:28,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:30:28,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:30:29,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:30:32,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:30:33,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:30:33,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 23:30:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:30:39,188 INFO [train.py:1046] (1/4) Epoch 41, batch 4300, loss[loss=0.137, simple_loss=0.2189, pruned_loss=0.02757, over 24625.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.0374, over 4725392.83 frames. ], batch size: 60, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:30:39,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:30:40,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:30:42,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1445240.0, ans=0.0 2023-10-03 23:30:43,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:43,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1445240.0, ans=0.0 2023-10-03 23:30:50,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:50,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 23:30:52,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:30:53,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:30:55,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-03 23:30:55,428 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 23:30:58,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:30:59,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:31:01,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 23:31:01,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:31:01,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1445306.6666666667, ans=0.125 2023-10-03 23:31:02,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 23:31:05,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:31:06,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.75 vs. limit=15.0 2023-10-03 23:31:06,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:31:08,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:31:08,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-03 23:31:08,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1445373.3333333333, ans=0.1 2023-10-03 23:31:10,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:31:13,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:31:13,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:31:13,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 23:31:14,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 23:31:15,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:31:18,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:18,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:31:18,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:20,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:31:20,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. 
Duration: 0.75 2023-10-03 23:31:20,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 23:31:21,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 23:31:23,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:31:23,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 23:31:23,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 23:31:27,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:31:29,172 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 23:31:29,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:31:30,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:31,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:31:35,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 23:31:35,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-03 23:31:35,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:35,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:31:36,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:31:36,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:31:38,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1445506.6666666667, ans=0.125 2023-10-03 23:31:39,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:31:42,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:43,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:43,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:31:48,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 23:31:49,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-10-03 23:31:52,854 INFO [train.py:1046] (1/4) Epoch 41, batch 4350, loss[loss=0.1557, simple_loss=0.2368, pruned_loss=0.03734, over 24483.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2359, pruned_loss=0.03761, over 4727816.01 frames. ], batch size: 63, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:31:54,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:31:56,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:31:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:32:05,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:32:10,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:32:13,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:32:13,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:32:14,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:32:17,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 23:32:19,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:32:23,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 23:32:24,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:32:24,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:26,797 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.13 vs. limit=12.0 2023-10-03 23:32:29,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:32,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 23:32:36,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:32:38,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:32:41,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1445773.3333333333, ans=0.05 2023-10-03 23:32:42,520 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 23:32:43,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:32:43,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:32:45,480 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.734e+02 1.921e+02 2.106e+02 2.345e+02 3.480e+02, threshold=4.212e+02, percent-clipped=0.0 2023-10-03 23:32:45,619 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 23:32:45,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1445773.3333333333, ans=0.125 2023-10-03 23:32:46,926 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 23:32:46,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:32:46,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:32:48,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:32:49,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:32:49,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:32:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:32:53,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. 
Duration: 0.94 2023-10-03 23:32:53,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:53,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:32:55,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:55,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 23:32:56,640 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 23:32:56,644 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 23:32:56,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 23:33:00,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:33:00,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:33:01,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:01,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 23:33:01,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1445840.0, ans=0.125 2023-10-03 23:33:02,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 23:33:02,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1445840.0, ans=0.125 2023-10-03 23:33:04,109 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 23:33:05,331 INFO [train.py:1046] (1/4) Epoch 41, batch 4400, loss[loss=0.1515, simple_loss=0.2271, pruned_loss=0.03795, over 23616.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2362, pruned_loss=0.03782, over 4745355.30 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:33:05,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:05,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1445906.6666666667, ans=0.125 2023-10-03 23:33:08,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:33:08,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:11,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1445906.6666666667, ans=0.125 2023-10-03 23:33:13,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 23:33:15,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 23:33:15,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 23:33:16,139 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:33:17,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 23:33:17,414 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 23:33:18,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:33:18,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:33:22,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 23:33:23,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:23,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:23,765 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 23:33:27,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:33:27,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 23:33:27,841 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 23:33:30,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1445973.3333333333, ans=0.2 2023-10-03 23:33:31,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 23:33:31,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 23:33:32,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 23:33:33,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:33,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:33:33,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:33:35,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:33:36,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 23:33:36,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. 
Duration: 0.37275 2023-10-03 23:33:38,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:39,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:33:39,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:41,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:41,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 23:33:42,728 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 23:33:45,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1446040.0, ans=0.1 2023-10-03 23:33:46,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:54,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:33:56,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 23:34:00,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-03 23:34:02,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:34:05,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:34:05,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 23:34:05,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:34:05,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:34:05,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:34:06,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:34:10,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 23:34:13,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 23:34:14,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 23:34:14,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:14,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-10-03 23:34:14,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:34:17,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:34:18,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 23:34:20,053 INFO [train.py:1046] (1/4) Epoch 41, batch 4450, loss[loss=0.1975, simple_loss=0.2624, pruned_loss=0.06629, over 19297.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2376, pruned_loss=0.03842, over 4742940.19 frames. ], batch size: 388, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:34:24,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:34:25,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:25,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:34:27,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1446240.0, ans=0.1 2023-10-03 23:34:33,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:34:33,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:34:37,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:34:38,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:34:40,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:34:40,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:42,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 23:34:42,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:34:42,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:42,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:34:42,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:34:45,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:34:50,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:34:50,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:34:52,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-03 23:34:53,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:55,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:34:58,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1446373.3333333333, ans=0.1 2023-10-03 23:35:00,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:35:01,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 23:35:01,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 23:35:01,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:35:02,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:35:03,840 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.64 vs. limit=6.0 2023-10-03 23:35:04,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 23:35:08,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. 
Duration: 0.6 2023-10-03 23:35:11,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:35:12,365 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 2.032e+02 2.316e+02 2.697e+02 5.602e+02, threshold=4.632e+02, percent-clipped=2.0 2023-10-03 23:35:12,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 23:35:12,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:12,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:35:12,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:35:13,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:35:13,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1446440.0, ans=0.1 2023-10-03 23:35:15,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:35:19,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:35:20,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 23:35:21,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-03 23:35:25,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:35:25,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:35:26,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:26,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:35:27,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:35:29,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1446506.6666666667, ans=0.125 2023-10-03 23:35:31,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 23:35:32,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:35:33,933 INFO [train.py:1046] (1/4) Epoch 41, batch 4500, loss[loss=0.2008, simple_loss=0.2685, pruned_loss=0.06658, over 19789.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2376, pruned_loss=0.03828, over 4741612.23 frames. ], batch size: 388, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:35:35,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:35:36,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. 
Duration: 0.72 2023-10-03 23:35:36,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 23:35:38,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:35:42,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:43,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:35:43,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:35:44,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:35:44,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:35:46,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:35:50,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1446640.0, ans=0.0 2023-10-03 23:35:57,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:35:59,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:36:01,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:36:02,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:36:03,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:36:08,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:36:11,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1446706.6666666667, ans=0.0 2023-10-03 23:36:12,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:36:15,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:36:18,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:36:18,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 23:36:19,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:19,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:36:22,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:36:23,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1446773.3333333333, ans=0.0 2023-10-03 23:36:24,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:36:25,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:36:25,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 23:36:25,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:36:26,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:30,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:36:30,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:36:34,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:37,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:36:37,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:36:38,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-10-03 23:36:40,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 23:36:40,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 23:36:42,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 23:36:46,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 23:36:47,884 INFO [train.py:1046] (1/4) Epoch 41, batch 4550, loss[loss=0.1476, simple_loss=0.2409, pruned_loss=0.02715, over 24351.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03804, over 4735597.81 frames. ], batch size: 74, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:36:49,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:36:51,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:36:52,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:36:55,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:01,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:37:02,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:37:04,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:04,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:37:04,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:07,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:07,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:37:09,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:37:12,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 23:37:12,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 23:37:14,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:37:14,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1446973.3333333333, ans=0.1 2023-10-03 23:37:17,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 23:37:19,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. 
Duration: 0.49 2023-10-03 23:37:19,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:37:21,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 23:37:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:37:26,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1447040.0, ans=0.125 2023-10-03 23:37:29,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:29,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:29,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:37:31,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 23:37:32,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:37:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:34,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:37:35,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-03 23:37:36,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 23:37:38,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 23:37:38,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:37:39,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 23:37:40,779 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.077e+02 2.278e+02 2.553e+02 3.904e+02, threshold=4.555e+02, percent-clipped=0.0 2023-10-03 23:37:40,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 23:37:40,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:42,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1447106.6666666667, ans=0.2 2023-10-03 23:37:44,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:44,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:37:45,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:45,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 23:37:46,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:37:46,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 23:37:48,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:37:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 23:37:50,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 23:37:50,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:37:51,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 23:37:53,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:37:53,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:37:56,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:37:56,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. 
Duration: 0.5 2023-10-03 23:37:59,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:37:59,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:38:02,608 INFO [train.py:1046] (1/4) Epoch 41, batch 4600, loss[loss=0.1578, simple_loss=0.233, pruned_loss=0.04127, over 23893.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2349, pruned_loss=0.03803, over 4723200.11 frames. ], batch size: 195, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:38:04,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:05,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:38:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:38:08,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:38:09,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:09,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 23:38:10,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:38:14,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-10-03 23:38:14,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:17,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:24,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 23:38:26,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:29,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:38:30,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:32,969 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.08 vs. limit=15.0 2023-10-03 23:38:35,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 23:38:35,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:38:36,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:38:36,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1447373.3333333333, ans=0.0 2023-10-03 23:38:40,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:42,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:38:43,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:38:48,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 23:38:49,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:38:54,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:38:56,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:38:58,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:38:58,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 23:38:58,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:00,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. 
Duration: 0.89 2023-10-03 23:39:00,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:01,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1447506.6666666667, ans=0.125 2023-10-03 23:39:01,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1447506.6666666667, ans=0.1 2023-10-03 23:39:02,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:02,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:03,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:39:04,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:06,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 23:39:06,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 23:39:06,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 23:39:06,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:07,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:39:07,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:09,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:16,070 INFO [train.py:1046] (1/4) Epoch 41, batch 4650, loss[loss=0.1439, simple_loss=0.2267, pruned_loss=0.03061, over 24320.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.235, pruned_loss=0.03782, over 4723172.58 frames. ], batch size: 61, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:39:18,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:39:22,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:39:22,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:22,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:39:22,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1447573.3333333333, ans=0.0 2023-10-03 23:39:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:23,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:39:23,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1447573.3333333333, ans=0.2 2023-10-03 23:39:25,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:26,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1447573.3333333333, ans=15.0 2023-10-03 23:39:26,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 23:39:31,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:39:32,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 23:39:32,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:39:34,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 23:39:34,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:39:34,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 23:39:34,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 23:39:34,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 23:39:35,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:39:37,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1447640.0, ans=0.0 2023-10-03 23:39:38,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:39:39,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:39:40,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 23:39:40,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1447640.0, ans=0.1 2023-10-03 23:39:44,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:39:45,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 23:39:46,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:47,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:39:48,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 23:39:50,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:39:52,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:39:58,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:02,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:06,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:40:06,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:06,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:40:08,750 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.906e+02 2.155e+02 2.545e+02 4.224e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 23:40:08,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 23:40:08,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 23:40:10,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 23:40:10,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 23:40:11,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:40:14,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1447840.0, ans=0.2 2023-10-03 23:40:17,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:40:17,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:40:17,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 23:40:17,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:19,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:40:19,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:40:20,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:40:22,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:40:22,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:40:22,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:40:26,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:40:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:40:26,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:40:28,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 23:40:29,395 INFO [train.py:1046] (1/4) Epoch 41, batch 4700, loss[loss=0.168, simple_loss=0.2407, pruned_loss=0.04767, over 23536.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2359, pruned_loss=0.03842, over 4720107.05 frames. ], batch size: 285, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:40:29,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:40:30,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 23:40:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:38,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1447906.6666666667, ans=0.125 2023-10-03 23:40:39,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:41,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:40:41,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 23:40:42,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1447973.3333333333, ans=0.125 2023-10-03 23:40:43,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:40:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 23:40:48,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 23:40:48,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1447973.3333333333, ans=0.1 2023-10-03 23:40:50,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:51,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:40:51,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:53,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:59,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:41:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-03 23:41:01,560 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.03 vs. limit=22.5 2023-10-03 23:41:04,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:41:07,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1448040.0, ans=0.125 2023-10-03 23:41:08,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1448040.0, ans=0.0 2023-10-03 23:41:09,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 23:41:10,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1448040.0, ans=0.04949747468305833 2023-10-03 23:41:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:41:12,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:17,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 23:41:19,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:41:22,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:41:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. 
Duration: 0.76 2023-10-03 23:41:26,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:26,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:27,623 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.76 vs. limit=6.0 2023-10-03 23:41:28,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:41:28,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:41:30,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 23:41:32,127 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 23:41:32,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.24 vs. limit=22.5 2023-10-03 23:41:33,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:36,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:36,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:36,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. 
Duration: 0.99 2023-10-03 23:41:37,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:39,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 23:41:41,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.78 vs. limit=8.0 2023-10-03 23:41:43,429 INFO [train.py:1046] (1/4) Epoch 41, batch 4750, loss[loss=0.1575, simple_loss=0.2334, pruned_loss=0.0408, over 23606.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2364, pruned_loss=0.03854, over 4718431.25 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:41:43,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:41:43,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:41:46,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:41:47,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:41:49,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 23:41:49,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 23:41:51,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1448240.0, ans=0.09899494936611666 2023-10-03 23:41:53,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 23:41:56,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:41:56,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:57,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:41:58,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1448306.6666666667, ans=0.0 2023-10-03 23:42:02,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 23:42:04,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1448306.6666666667, ans=0.1 2023-10-03 23:42:05,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:42:08,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 23:42:08,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:11,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:42:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:42:12,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:42:14,284 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 23:42:14,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 23:42:19,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 23:42:20,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:42:23,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:42:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:42:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 23:42:25,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:42:28,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. 
Duration: 0.863625 2023-10-03 23:42:29,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1448440.0, ans=0.05 2023-10-03 23:42:30,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:42:33,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 23:42:33,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 23:42:34,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:42:34,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:42:34,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:42:35,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1448440.0, ans=0.125 2023-10-03 23:42:36,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:42:36,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. 
Duration: 0.71 2023-10-03 23:42:38,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.957e+02 2.175e+02 2.344e+02 3.042e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 23:42:39,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 23:42:40,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:42:43,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:42:43,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 23:42:45,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:45,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:42:46,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:42:47,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:42:48,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.95 vs. limit=22.5 2023-10-03 23:42:49,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-03 23:42:52,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:42:53,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 23:42:53,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 23:42:53,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 23:42:55,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:42:56,573 INFO [train.py:1046] (1/4) Epoch 41, batch 4800, loss[loss=0.1511, simple_loss=0.2387, pruned_loss=0.0317, over 24650.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2374, pruned_loss=0.03895, over 4720277.87 frames. ], batch size: 68, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:42:56,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:42:56,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 23:43:03,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:05,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:11,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-03 23:43:12,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:13,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:13,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 23:43:14,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:43:14,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:43:16,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:43:20,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:22,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:22,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:43:23,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:23,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 23:43:23,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:43:25,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:27,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:29,259 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:43:30,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:30,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:30,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:43:31,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:43:33,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:34,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 23:43:34,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 23:43:36,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:36,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-03 23:43:36,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:43:38,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:43:38,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:43:38,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:43:39,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:43:45,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:43:46,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:48,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:43:52,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 23:43:52,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:52,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:52,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-03 23:43:53,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1448773.3333333333, ans=0.125 2023-10-03 23:43:53,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.35 vs. limit=22.5 2023-10-03 23:43:54,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:55,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:43:57,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:43:57,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:58,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:43:58,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:43:58,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:44:01,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:01,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:44:01,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:44:04,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 23:44:06,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 23:44:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:44:06,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:44:07,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:44:07,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:10,450 INFO [train.py:1046] (1/4) Epoch 41, batch 4850, loss[loss=0.1651, simple_loss=0.2444, pruned_loss=0.0429, over 23236.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2378, pruned_loss=0.03937, over 4712057.75 frames. ], batch size: 105, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:44:10,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:44:18,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1448906.6666666667, ans=0.2 2023-10-03 23:44:19,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. 
Duration: 0.51 2023-10-03 23:44:20,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:26,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:44:26,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1448973.3333333333, ans=0.1 2023-10-03 23:44:26,970 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.22 vs. limit=15.0 2023-10-03 23:44:27,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:44:27,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:30,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:31,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:44:33,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:44:33,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 23:44:36,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:44:39,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:44:39,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:44:40,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1449040.0, ans=0.125 2023-10-03 23:44:41,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:44:41,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 23:44:45,305 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.83 vs. limit=22.5 2023-10-03 23:44:45,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:44:45,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:44:48,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:44:48,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 23:44:48,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 23:44:50,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-03 23:44:50,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1449040.0, ans=0.0 2023-10-03 23:44:55,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1449106.6666666667, ans=0.1 2023-10-03 23:44:59,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:44:59,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 23:45:00,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:45:01,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:45:03,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:45:04,520 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.005e+02 2.187e+02 2.519e+02 4.301e+02, threshold=4.375e+02, percent-clipped=0.0 2023-10-03 23:45:04,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 23:45:04,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:04,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 23:45:04,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:45:06,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:07,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 23:45:15,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1449173.3333333333, ans=0.2 2023-10-03 23:45:17,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:22,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:45:22,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:45:22,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1449240.0, ans=0.2 2023-10-03 23:45:24,434 INFO [train.py:1046] (1/4) Epoch 41, batch 4900, loss[loss=0.1432, simple_loss=0.2023, pruned_loss=0.04208, over 22599.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2369, pruned_loss=0.0387, over 4708098.87 frames. ], batch size: 322, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:45:28,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 23:45:28,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:45:34,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-03 23:45:34,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:34,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:45:37,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 23:45:40,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 23:45:45,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 23:45:47,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 23:45:47,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:45:47,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:47,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:45:47,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:45:48,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:45:48,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. 
Duration: 0.93 2023-10-03 23:45:51,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 23:45:51,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:45:52,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1449306.6666666667, ans=0.125 2023-10-03 23:45:53,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:45:53,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:45:57,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:45:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:58,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:58,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 23:46:00,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:46:01,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:46:01,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. 
Duration: 0.81 2023-10-03 23:46:01,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 23:46:06,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1449373.3333333333, ans=0.0 2023-10-03 23:46:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 23:46:10,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:46:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:46:11,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:46:11,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:13,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 23:46:13,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:46:13,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 23:46:15,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:17,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 23:46:18,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:46:21,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 23:46:22,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:46:22,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 23:46:24,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 23:46:27,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1449506.6666666667, ans=0.1 2023-10-03 23:46:28,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1449506.6666666667, ans=0.125 2023-10-03 23:46:30,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:46:31,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:46:33,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 23:46:33,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:46:33,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-03 23:46:33,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-03 23:46:34,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:35,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.78 vs. limit=22.5 2023-10-03 23:46:37,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:46:37,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:46:37,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:46:37,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 23:46:39,027 INFO [train.py:1046] (1/4) Epoch 41, batch 4950, loss[loss=0.1677, simple_loss=0.2515, pruned_loss=0.04194, over 24309.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2362, pruned_loss=0.03859, over 4706555.54 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:46:39,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:46:42,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:46:42,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-03 23:46:45,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 23:46:45,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 23:46:45,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:46:46,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 23:46:46,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:46,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:46:48,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:46:48,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:46:51,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:51,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:46:53,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:46:54,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 23:46:54,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1449640.0, ans=0.125 2023-10-03 23:46:56,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:57,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:47:00,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:47:05,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:07,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:47:07,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:07,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:08,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:47:11,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 23:47:11,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 23:47:14,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:47:17,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:47:17,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:47:19,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:47:19,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:47:20,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:47:23,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:47:25,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:47:26,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1449773.3333333333, ans=0.0 2023-10-03 23:47:26,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:47:28,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:28,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:28,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-10-03 23:47:28,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:47:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:47:33,936 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.903e+02 2.164e+02 2.599e+02 4.348e+02, threshold=4.328e+02, percent-clipped=0.0 2023-10-03 23:47:34,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:47:35,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:47:37,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:47:37,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:38,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:47:38,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:47:39,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:47:41,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-03 23:47:41,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:47:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 23:47:47,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:47:47,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1449840.0, ans=0.125 2023-10-03 23:47:53,057 INFO [train.py:1046] (1/4) Epoch 41, batch 5000, loss[loss=0.1257, simple_loss=0.2047, pruned_loss=0.02335, over 24344.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2357, pruned_loss=0.03829, over 4712068.15 frames. ], batch size: 56, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:47:53,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 23:47:53,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:47:55,376 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.09 vs. limit=15.0 2023-10-03 23:48:00,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:48:00,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:48:00,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. 
Duration: 0.61 2023-10-03 23:48:02,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 23:48:04,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:48:05,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 23:48:06,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:48:06,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:48:06,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 23:48:06,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:08,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:48:08,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1449973.3333333333, ans=0.125 2023-10-03 23:48:09,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 23:48:09,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:48:09,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. 
Duration: 0.92725 2023-10-03 23:48:11,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 23:48:11,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 23:48:13,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:48:13,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 23:48:13,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:48:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:15,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:48:15,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 23:48:15,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 23:48:17,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 23:48:17,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:18,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:48:19,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 23:48:21,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:48:23,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:48:24,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:48:24,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1450040.0, ans=0.0 2023-10-03 23:48:27,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 23:48:28,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1450040.0, ans=0.1 2023-10-03 23:48:29,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:48:30,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:48:34,976 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 23:48:37,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-03 23:48:39,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:39,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:48:42,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 23:48:43,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:43,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:48:43,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:48:45,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 23:48:47,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:48:49,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:48:51,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:48:55,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 23:48:59,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:49:06,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1450240.0, ans=0.125 2023-10-03 23:49:07,310 INFO [train.py:1046] (1/4) Epoch 41, batch 5050, loss[loss=0.1571, simple_loss=0.2395, pruned_loss=0.03736, over 24035.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2354, pruned_loss=0.03825, over 4699552.87 frames. ], batch size: 86, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:49:07,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:49:08,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:08,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:49:10,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:49:10,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:49:10,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:49:10,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:16,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:16,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. 
Duration: 0.99 2023-10-03 23:49:18,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:49:18,362 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:49:20,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:49:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:49:22,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 23:49:23,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:49:23,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:49:25,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:49:26,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:49:26,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:49:36,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 23:49:36,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. 
Duration: 0.6 2023-10-03 23:49:37,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1450373.3333333333, ans=0.125 2023-10-03 23:49:38,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:49:38,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 23:49:38,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:49:39,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:39,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:49:40,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:49:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 23:49:40,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 23:49:42,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:44,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:49:48,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-03 23:49:48,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 23:49:49,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:49:52,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 23:49:54,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:49:54,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:49:55,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:49:55,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:49:58,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:49:58,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:49:58,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1450440.0, ans=0.1 2023-10-03 23:49:59,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:01,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:50:01,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:50:01,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 23:50:01,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:50:03,169 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 1.933e+02 2.126e+02 2.444e+02 3.244e+02, threshold=4.252e+02, percent-clipped=0.0 2023-10-03 23:50:04,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:50:04,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1450506.6666666667, ans=0.0 2023-10-03 23:50:06,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:50:06,195 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 23:50:06,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:50:07,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:50:07,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1450506.6666666667, ans=0.1 2023-10-03 23:50:08,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:50:08,984 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 23:50:12,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:50:12,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 23:50:12,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:14,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.06 vs. limit=15.0 2023-10-03 23:50:16,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:50:18,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:18,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 23:50:20,656 INFO [train.py:1046] (1/4) Epoch 41, batch 5100, loss[loss=0.1501, simple_loss=0.2221, pruned_loss=0.03906, over 23713.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.03829, over 4702415.13 frames. ], batch size: 232, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:50:20,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 23:50:23,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:50:23,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:50:23,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:50:23,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1450573.3333333333, ans=0.1 2023-10-03 23:50:23,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1450573.3333333333, ans=0.0 2023-10-03 23:50:24,530 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.35 vs. limit=15.0 2023-10-03 23:50:26,327 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 23:50:28,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:50:31,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 23:50:32,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 23:50:32,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:33,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-10-03 23:50:34,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.13 vs. limit=15.0 2023-10-03 23:50:35,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:50:35,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 23:50:35,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 23:50:41,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:50:41,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:50:47,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:49,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 23:50:49,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:50:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:50,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:50:53,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:50:53,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:53,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 23:50:55,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.29 vs. limit=15.0 2023-10-03 23:50:56,027 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 23:50:56,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:57,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 23:50:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 23:51:00,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:51:04,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.52 vs. limit=12.0 2023-10-03 23:51:09,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:11,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. 
Duration: 0.5 2023-10-03 23:51:12,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1450773.3333333333, ans=0.125 2023-10-03 23:51:13,152 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 23:51:13,160 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 23:51:15,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 23:51:15,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:51:18,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 23:51:22,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 23:51:23,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:51:25,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:51:26,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 23:51:27,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:51:28,480 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.06 vs. 
limit=10.0 2023-10-03 23:51:29,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 23:51:29,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1450840.0, ans=0.125 2023-10-03 23:51:33,711 INFO [train.py:1046] (1/4) Epoch 41, batch 5150, loss[loss=0.1595, simple_loss=0.2447, pruned_loss=0.03712, over 24308.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2372, pruned_loss=0.03838, over 4710280.06 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:51:33,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:51:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:51:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:51:34,754 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:51:35,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:51:35,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:51:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:51:37,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. 
Duration: 0.89 2023-10-03 23:51:37,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 23:51:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 23:51:38,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:51:38,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 23:51:40,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:40,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 23:51:40,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1450906.6666666667, ans=0.0 2023-10-03 23:51:41,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1450906.6666666667, ans=0.0 2023-10-03 23:51:42,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:51:44,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:51:49,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:51:49,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. 
Duration: 0.8 2023-10-03 23:51:50,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:52,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:51:53,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:51:53,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:51:53,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:51:54,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:51:54,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:51:56,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 23:51:57,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:51:57,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:51:59,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:52:02,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. 
Duration: 0.79 2023-10-03 23:52:03,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:52:08,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:52:08,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 23:52:12,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:52:18,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:52:19,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:52:21,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1451106.6666666667, ans=0.2 2023-10-03 23:52:22,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:52:22,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:52:25,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-10-03 23:52:29,391 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.988e+02 2.282e+02 2.710e+02 3.872e+02, threshold=4.565e+02, percent-clipped=0.0 2023-10-03 23:52:29,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:52:30,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:52:30,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:52:31,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.71 vs. limit=6.0 2023-10-03 23:52:34,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:52:35,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:52:36,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 23:52:41,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:52:43,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:52:44,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:52:44,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:52:46,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:52:46,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:52:46,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:52:46,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:52:47,996 INFO [train.py:1046] (1/4) Epoch 41, batch 5200, loss[loss=0.1357, simple_loss=0.2007, pruned_loss=0.03535, over 22625.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2381, pruned_loss=0.03866, over 4709017.32 frames. ], batch size: 322, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:52:50,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:52:52,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:52:54,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:52:58,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 23:52:58,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 23:52:58,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:02,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:02,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.98 vs. limit=15.0 2023-10-03 23:53:03,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:53:03,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:04,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 23:53:07,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:53:09,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:09,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1451306.6666666667, ans=0.0 2023-10-03 23:53:10,325 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.84 vs. limit=15.0 2023-10-03 23:53:10,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-10-03 23:53:13,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:53:13,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:53:15,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 23:53:16,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 23:53:19,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 23:53:19,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:19,757 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 23:53:19,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:22,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:22,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:53:23,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 23:53:23,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:53:25,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:26,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1451373.3333333333, ans=0.125 2023-10-03 23:53:28,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 23:53:28,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 23:53:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 23:53:30,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=1451440.0, ans=12.0 2023-10-03 23:53:30,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.81 vs. limit=12.0 2023-10-03 23:53:32,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 23:53:33,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1451440.0, ans=0.125 2023-10-03 23:53:34,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:53:40,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:53:40,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:53:42,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 23:53:42,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:42,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:53:42,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:44,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:53:45,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:53:46,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:53:48,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:51,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:53:51,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:54,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.63 vs. 
limit=15.0 2023-10-03 23:53:55,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:53:57,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 23:53:57,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:53:57,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:53:58,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:58,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:54:00,006 INFO [train.py:1046] (1/4) Epoch 41, batch 5250, loss[loss=0.15, simple_loss=0.2359, pruned_loss=0.03205, over 24553.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2373, pruned_loss=0.03823, over 4715436.49 frames. ], batch size: 71, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:54:00,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:54:03,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:54:05,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:54:06,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:54:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:54:09,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1451573.3333333333, ans=0.125 2023-10-03 23:54:12,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:54:13,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:54:16,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:54:17,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1451640.0, ans=0.125 2023-10-03 23:54:18,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:54:21,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 23:54:21,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:54:21,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:54:51,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1451773.3333333333, ans=0.125 2023-10-03 23:54:52,782 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.869e+02 2.005e+02 2.219e+02 3.156e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-03 23:55:02,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1451840.0, ans=0.0 2023-10-03 23:55:08,763 INFO [train.py:1046] (1/4) Epoch 41, batch 5300, loss[loss=0.1491, simple_loss=0.2328, pruned_loss=0.03274, over 24317.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2358, pruned_loss=0.03834, over 4713966.83 frames. ], batch size: 61, lr: 2.46e-03, grad_scale: 16.0 2023-10-03 23:55:23,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:55:23,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 23:55:23,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 23:55:23,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:23,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:23,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:23,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:55:23,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:23,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:23,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:23,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:55:24,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:55:24,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 23:55:24,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 23:55:24,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 23:55:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:55:24,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 23:55:24,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 23:55:25,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:55:25,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:25,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:55:25,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:55:25,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:55:25,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:55:25,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:25,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:25,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:55:25,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:25,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:55:25,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-03 23:55:25,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:55:26,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 23:55:26,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:55:26,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:26,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 23:55:26,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 23:55:27,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:55:27,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:55:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 23:55:27,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 23:55:27,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:55:27,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-03 23:55:28,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:55:28,242 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 23:55:28,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 23:55:28,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:55:28,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 23:55:28,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 23:55:28,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 23:55:28,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:55:32,748 INFO [train.py:1046] (1/4) Epoch 42, batch 0, loss[loss=0.1489, simple_loss=0.2308, pruned_loss=0.03347, over 23595.00 frames. ], tot_loss[loss=0.1489, simple_loss=0.2308, pruned_loss=0.03347, over 23595.00 frames. ], batch size: 135, lr: 2.43e-03, grad_scale: 32.0 2023-10-03 23:55:32,749 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-03 23:55:44,913 INFO [train.py:1078] (1/4) Epoch 42, validation: loss=0.3268, simple_loss=0.2729, pruned_loss=0.1903, over 1125622.00 frames. 
2023-10-03 23:55:44,914 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-03 23:55:48,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 23:55:48,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:55:48,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1451986.6666666667, ans=0.0 2023-10-03 23:55:51,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:55:54,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1451986.6666666667, ans=0.0 2023-10-03 23:55:55,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:55:55,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:55:56,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:56,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 23:55:58,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 23:55:59,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-10-03 23:55:59,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:56:02,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:56:02,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:03,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:56:03,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:56:05,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 23:56:05,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1452053.3333333333, ans=0.0 2023-10-03 23:56:05,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1452053.3333333333, ans=0.125 2023-10-03 23:56:06,363 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.43 vs. limit=6.0 2023-10-03 23:56:06,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:56:14,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-03 23:56:15,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:17,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 23:56:17,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1452120.0, ans=0.125 2023-10-03 23:56:17,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1452120.0, ans=0.125 2023-10-03 23:56:19,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:56:19,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:56:22,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:56:26,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:56:27,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.72 vs. limit=22.5 2023-10-03 23:56:31,167 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:56:32,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:56:36,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-03 23:56:41,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 23:56:41,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:56:41,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:43,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:56:43,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:46,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 23:56:48,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:49,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:53,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:56:55,949 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 23:56:57,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-03 23:56:57,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1452320.0, ans=0.125 2023-10-03 23:56:58,752 INFO [train.py:1046] (1/4) Epoch 42, batch 50, loss[loss=0.1746, simple_loss=0.2456, pruned_loss=0.05179, over 23699.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2362, pruned_loss=0.03717, over 1066839.17 frames. ], batch size: 179, lr: 2.43e-03, grad_scale: 16.0 2023-10-03 23:57:01,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:57:04,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:57:04,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 23:57:05,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:57:05,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:57:08,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:08,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:11,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:57:15,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. 
Duration: 0.68 2023-10-03 23:57:15,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:16,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1452386.6666666667, ans=0.125 2023-10-03 23:57:16,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1452386.6666666667, ans=0.0 2023-10-03 23:57:20,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:57:21,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 23:57:23,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 23:57:25,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:57:25,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:57:25,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:27,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:57:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:57:28,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-03 23:57:28,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:35,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:57:37,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:57:37,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:57:38,509 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.003e+02 2.232e+02 2.630e+02 3.790e+02, threshold=4.463e+02, percent-clipped=0.0 2023-10-03 23:57:38,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 23:57:38,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:57:40,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:57:40,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 23:57:40,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:57:42,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 23:57:51,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. 
Duration: 0.936375 2023-10-03 23:57:51,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:57:53,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:54,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:57:54,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:57:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 23:57:57,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 23:57:58,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:58,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:58:00,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:58:00,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:58:00,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 23:58:01,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-10-03 23:58:02,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:58:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:05,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:58:06,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 23:58:06,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 23:58:06,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:08,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:58:08,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:58:09,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:58:12,879 INFO [train.py:1046] (1/4) Epoch 42, batch 100, loss[loss=0.1536, simple_loss=0.2452, pruned_loss=0.03099, over 24299.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2387, pruned_loss=0.037, over 1875109.39 frames. ], batch size: 74, lr: 2.43e-03, grad_scale: 16.0 2023-10-03 23:58:12,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-03 23:58:15,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:58:19,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:58:21,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 23:58:21,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:58:25,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:58:25,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:58:25,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:58:25,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:58:26,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:58:27,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1452720.0, ans=0.0 2023-10-03 23:58:28,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 23:58:31,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. 
Duration: 0.72225 2023-10-03 23:58:31,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:31,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:58:31,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:58:34,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 23:58:34,843 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.88 vs. limit=6.0 2023-10-03 23:58:35,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:35,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:58:36,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:58:38,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:58:41,105 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 23:58:41,126 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 23:58:42,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-03 23:58:42,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:58:45,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:58:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:50,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:58:52,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.74 vs. limit=22.5 2023-10-03 23:58:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:58:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 23:58:58,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1452853.3333333333, ans=0.0 2023-10-03 23:59:00,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 23:59:02,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:59:04,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:59:06,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. 
Duration: 0.97275 2023-10-03 23:59:09,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:13,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:59:13,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1452920.0, ans=0.025 2023-10-03 23:59:14,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:59:16,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:19,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:19,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:59:19,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:21,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 23:59:21,127 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 23:59:21,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-03 23:59:23,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:59:24,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:24,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:24,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:59:24,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:59:26,531 INFO [train.py:1046] (1/4) Epoch 42, batch 150, loss[loss=0.1683, simple_loss=0.2376, pruned_loss=0.04953, over 22929.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2382, pruned_loss=0.03716, over 2515266.85 frames. ], batch size: 322, lr: 2.43e-03, grad_scale: 8.0 2023-10-03 23:59:26,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:59:26,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:26,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:28,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:28,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. 
Duration: 0.87275 2023-10-03 23:59:28,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:59:30,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:33,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:59:33,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:59:34,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:35,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:36,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:59:39,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 23:59:43,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 23:59:43,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. 
Duration: 0.63 2023-10-03 23:59:46,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:59:46,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:59:47,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:59:49,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:49,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:49,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:49,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:50,942 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 23:59:52,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:52,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1453053.3333333333, ans=0.2 2023-10-03 23:59:56,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:59:59,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 00:00:00,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 00:00:03,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:00:04,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:00:04,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:00:06,212 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.980e+02 2.184e+02 2.445e+02 3.491e+02, threshold=4.367e+02, percent-clipped=0.0 2023-10-04 00:00:07,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:00:08,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:00:09,249 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:00:10,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:00:11,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:11,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 00:00:16,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:00:17,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:19,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:00:19,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:00:21,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:23,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 00:00:25,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:00:27,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:00:28,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:00:28,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1453253.3333333333, ans=0.125 2023-10-04 00:00:30,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:00:30,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. 
Duration: 0.69 2023-10-04 00:00:30,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:00:31,491 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 00:00:34,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:00:38,125 INFO [train.py:1046] (1/4) Epoch 42, batch 200, loss[loss=0.1475, simple_loss=0.2196, pruned_loss=0.03771, over 23872.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2384, pruned_loss=0.03792, over 3003254.25 frames. ], batch size: 164, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:00:38,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:00:38,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:00:41,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 00:00:41,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:00:42,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:44,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 00:00:45,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 00:00:46,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:48,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:54,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:00:54,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:00:54,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:55,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1453386.6666666667, ans=0.125 2023-10-04 00:01:12,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:01:12,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:01:14,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:01:14,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:01:15,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-04 00:01:15,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:01:18,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:18,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1453453.3333333333, ans=0.125 2023-10-04 00:01:19,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:01:19,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:01:19,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:01:21,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 00:01:22,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 00:01:22,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:26,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:01:33,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:01:38,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.77 vs. limit=15.0 2023-10-04 00:01:40,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:40,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:01:47,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:47,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 00:01:49,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:49,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:01:49,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:01:50,407 INFO [train.py:1046] (1/4) Epoch 42, batch 250, loss[loss=0.1447, simple_loss=0.2349, pruned_loss=0.02723, over 24469.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2375, pruned_loss=0.03807, over 3381946.48 frames. ], batch size: 66, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:01:50,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:01:52,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. 
Duration: 0.99 2023-10-04 00:01:52,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:01:52,589 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 00:01:55,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:57,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:02:00,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:02:00,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:02:03,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:02:04,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:02:04,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:02:07,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:02:14,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:02:17,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:02:17,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:02:17,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=1453720.0, ans=0.1 2023-10-04 00:02:23,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:02:24,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:02:24,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:02:24,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:02:26,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:02:26,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:02:26,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1453786.6666666667, ans=0.0 2023-10-04 00:02:28,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:02:31,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 00:02:32,300 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.009e+02 2.241e+02 2.579e+02 4.202e+02, threshold=4.483e+02, percent-clipped=0.0 2023-10-04 00:02:32,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 00:02:32,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:02:32,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1453786.6666666667, ans=0.1 2023-10-04 00:02:35,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:02:35,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:02:35,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:02:36,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:02:36,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:02:37,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:02:40,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:02:41,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:02:42,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:02:42,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1453853.3333333333, ans=0.125 2023-10-04 00:02:42,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1453853.3333333333, ans=0.0 2023-10-04 00:02:47,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:02:48,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1453920.0, ans=0.125 2023-10-04 00:02:50,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:02:51,264 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.19 vs. limit=15.0 2023-10-04 00:02:53,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:02:59,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:03:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:03:03,959 INFO [train.py:1046] (1/4) Epoch 42, batch 300, loss[loss=0.1614, simple_loss=0.2344, pruned_loss=0.04414, over 23730.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2354, pruned_loss=0.0378, over 3680955.64 frames. ], batch size: 150, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:03:05,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 00:03:05,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:03:05,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:03:06,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 00:03:06,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:03:08,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:03:09,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 00:03:13,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:03:13,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:03:17,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 00:03:17,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 00:03:19,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:03:19,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:03:19,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 00:03:19,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:03:23,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1454053.3333333333, ans=0.0 2023-10-04 00:03:24,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:03:28,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:03:28,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 00:03:31,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.79 vs. limit=12.0 2023-10-04 00:03:32,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. 
Duration: 0.81 2023-10-04 00:03:32,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1454120.0, ans=0.125 2023-10-04 00:03:33,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:34,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:03:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 00:03:36,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:03:39,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:03:40,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:03:40,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:03:43,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:03:43,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 00:03:44,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 00:03:47,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:48,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 00:03:48,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:03:55,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:03:58,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:03:58,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 00:04:04,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:04,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:04:05,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:04:07,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 00:04:07,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 00:04:08,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:10,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 00:04:11,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:12,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:14,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:04:14,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:15,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:16,929 INFO [train.py:1046] (1/4) Epoch 42, batch 350, loss[loss=0.1462, simple_loss=0.2201, pruned_loss=0.03614, over 23371.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03768, over 3913994.69 frames. ], batch size: 119, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:04:17,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1454320.0, ans=0.125 2023-10-04 00:04:18,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:04:18,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-04 00:04:21,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:27,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:04:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:31,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:32,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 00:04:34,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:04:35,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 00:04:36,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:36,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 00:04:38,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:41,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 00:04:43,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 00:04:45,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:46,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:04:46,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:04:47,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:04:47,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:04:47,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:48,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:04:50,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:04:50,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:58,162 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.875e+02 2.208e+02 2.566e+02 3.808e+02, threshold=4.416e+02, percent-clipped=0.0 2023-10-04 00:04:58,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:04:58,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:04:59,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:04:59,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:04,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 00:05:04,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:05:08,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:08,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:08,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:05:11,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 00:05:12,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:13,941 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 00:05:15,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. 
Duration: 0.81 2023-10-04 00:05:15,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:18,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:05:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 00:05:19,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:22,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:05:22,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:24,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:24,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:26,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:29,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:05:30,690 INFO [train.py:1046] (1/4) Epoch 42, batch 400, loss[loss=0.1455, simple_loss=0.2296, pruned_loss=0.03068, over 23616.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2341, pruned_loss=0.03759, over 4095273.79 frames. 
], batch size: 135, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:05:30,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:05:32,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 00:05:32,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:32,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:34,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:05:34,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:36,335 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.58 vs. limit=22.5 2023-10-04 00:05:38,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:39,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:41,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 00:05:42,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. 
Duration: 0.71 2023-10-04 00:05:42,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:45,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 00:05:45,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:47,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1454720.0, ans=0.1 2023-10-04 00:05:49,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:05:49,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:49,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 00:05:49,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:05:50,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.42 vs. limit=15.0 2023-10-04 00:05:50,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:50,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:50,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 00:05:54,110 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 00:05:54,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 00:05:58,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:58,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:06:00,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 00:06:02,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 00:06:04,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:06:08,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:08,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1454786.6666666667, ans=0.2 2023-10-04 00:06:10,197 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.00 vs. limit=10.0 2023-10-04 00:06:15,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 00:06:17,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 00:06:18,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 00:06:20,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:06:20,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:06:20,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 00:06:25,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:06:26,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:06:28,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:06:32,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:32,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 00:06:34,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 00:06:35,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. 
Duration: 0.67 2023-10-04 00:06:35,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1454920.0, ans=0.125 2023-10-04 00:06:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:06:37,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:06:40,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 00:06:42,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:06:42,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:06:44,247 INFO [train.py:1046] (1/4) Epoch 42, batch 450, loss[loss=0.1669, simple_loss=0.2546, pruned_loss=0.03966, over 24325.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2353, pruned_loss=0.03809, over 4223034.76 frames. ], batch size: 77, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:06:44,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:06:44,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1454986.6666666667, ans=0.2 2023-10-04 00:06:45,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 00:06:45,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 00:06:47,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:06:47,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:06:47,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 00:06:48,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:06:48,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:06:51,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:06:55,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1454986.6666666667, ans=10.0 2023-10-04 00:06:59,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:00,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:00,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1455053.3333333333, ans=0.125 2023-10-04 00:07:02,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 00:07:02,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. 
Duration: 0.99 2023-10-04 00:07:04,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1455053.3333333333, ans=0.125 2023-10-04 00:07:06,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:07:08,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:11,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:11,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1455053.3333333333, ans=0.1 2023-10-04 00:07:15,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:07:15,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:07:15,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1455120.0, ans=0.1 2023-10-04 00:07:17,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 00:07:18,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 00:07:21,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 00:07:21,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:07:22,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:23,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:07:24,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1455120.0, ans=0.05 2023-10-04 00:07:25,421 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 00:07:25,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 00:07:25,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:27,290 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.923e+02 2.068e+02 2.346e+02 3.607e+02, threshold=4.137e+02, percent-clipped=0.0 2023-10-04 00:07:27,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:07:27,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 00:07:30,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:07:31,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:07:31,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-10-04 00:07:33,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 00:07:34,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:36,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:07:36,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:07:39,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 00:07:39,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1455186.6666666667, ans=0.04949747468305833 2023-10-04 00:07:42,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:07:42,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 00:07:43,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 00:07:45,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:50,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:07:51,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:07:53,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:07:54,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 00:07:57,761 INFO [train.py:1046] (1/4) Epoch 42, batch 500, loss[loss=0.136, simple_loss=0.2193, pruned_loss=0.02636, over 24347.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2365, pruned_loss=0.03802, over 4337467.74 frames. ], batch size: 61, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:07:59,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:59,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:08:00,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:00,561 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 00:08:02,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 00:08:02,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:03,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1455320.0, ans=0.1 2023-10-04 00:08:06,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 00:08:09,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 00:08:10,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:08:10,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1455386.6666666667, ans=0.125 2023-10-04 00:08:12,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:08:12,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:08:13,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:24,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:26,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:08:27,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:08:27,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:27,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 00:08:27,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 00:08:29,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:08:32,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:08:32,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:08:32,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 00:08:37,005 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 00:08:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:08:42,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:43,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:08:45,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. 
Duration: 0.62 2023-10-04 00:08:47,557 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.52 vs. limit=15.0 2023-10-04 00:08:48,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:08:49,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:08:52,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:55,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:55,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1455586.6666666667, ans=0.125 2023-10-04 00:09:02,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:04,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 00:09:04,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:06,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:09,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. 
Duration: 0.74 2023-10-04 00:09:10,889 INFO [train.py:1046] (1/4) Epoch 42, batch 550, loss[loss=0.1442, simple_loss=0.2369, pruned_loss=0.02571, over 24625.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2375, pruned_loss=0.03821, over 4432594.86 frames. ], batch size: 73, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:09:10,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:09:12,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:16,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 00:09:17,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 00:09:17,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:09:17,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 00:09:17,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:09:17,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:09:19,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:19,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 00:09:19,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:09:19,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1455653.3333333333, ans=0.0 2023-10-04 00:09:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:09:22,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:22,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 00:09:22,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:09:22,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1455653.3333333333, ans=0.125 2023-10-04 00:09:27,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.10 vs. limit=12.0 2023-10-04 00:09:29,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:29,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:32,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:09:33,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 00:09:36,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 00:09:36,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:09:42,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:09:42,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:09:43,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:09:45,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:45,109 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 00:09:46,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:48,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:09:52,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 00:09:52,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:09:52,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:09:53,655 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.909e+02 2.051e+02 2.343e+02 3.682e+02, threshold=4.101e+02, percent-clipped=0.0 2023-10-04 00:09:53,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:55,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 00:09:57,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 00:09:57,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:57,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:09:58,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:09:58,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:10:01,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:10:04,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 00:10:05,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:10:07,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 00:10:10,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:10:11,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:10:12,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:10:12,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1455920.0, ans=0.0 2023-10-04 00:10:13,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:14,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:10:14,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 00:10:21,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 00:10:24,573 INFO [train.py:1046] (1/4) Epoch 42, batch 600, loss[loss=0.1614, simple_loss=0.2519, pruned_loss=0.03541, over 24656.00 frames. 
], tot_loss[loss=0.1574, simple_loss=0.2379, pruned_loss=0.03847, over 4481058.90 frames. ], batch size: 68, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:10:24,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 00:10:26,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:10:26,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:10:26,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:10:33,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:10:34,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:10:37,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 00:10:40,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:10:40,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:10:41,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:44,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. 
Duration: 0.87 2023-10-04 00:10:44,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:10:51,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 00:10:54,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:10:54,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:11:00,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:11:02,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:11:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:08,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:11:13,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:13,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:11:13,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:11:18,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 00:11:19,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1456186.6666666667, ans=0.1 2023-10-04 00:11:21,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1456186.6666666667, ans=0.015 2023-10-04 00:11:24,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:11:24,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:11:27,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 00:11:28,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:11:30,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 00:11:30,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:11:32,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:11:38,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 00:11:40,237 INFO [train.py:1046] (1/4) Epoch 42, batch 650, loss[loss=0.1441, simple_loss=0.2207, pruned_loss=0.03374, over 23353.00 frames. 
], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.03825, over 4519075.08 frames. ], batch size: 119, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:11:40,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:11:42,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:11:44,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:11:44,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1456320.0, ans=0.1 2023-10-04 00:11:47,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:11:48,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 00:11:48,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:55,438 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.97 vs. limit=15.0 2023-10-04 00:11:56,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:11:56,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:11:59,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:12:02,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 00:12:02,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1456386.6666666667, ans=0.0 2023-10-04 00:12:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:12:05,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:12:09,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:12:09,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:12:13,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:14,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:14,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:12:14,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:16,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:12:18,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 00:12:18,742 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 00:12:18,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:18,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:12:21,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1456453.3333333333, ans=0.125 2023-10-04 00:12:22,744 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 1.909e+02 2.145e+02 2.381e+02 3.459e+02, threshold=4.289e+02, percent-clipped=0.0 2023-10-04 00:12:22,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:22,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:12:22,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:24,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:12:26,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 00:12:26,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:12:26,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 00:12:27,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:12:27,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:12:29,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:12:31,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 00:12:32,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 00:12:32,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:32,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:12:32,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:12:32,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:12:35,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:12:37,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1456520.0, ans=0.125 2023-10-04 00:12:41,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:12:43,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:12:43,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:46,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:46,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:12:46,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:46,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-10-04 00:12:53,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:12:53,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:12:53,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:12:54,877 INFO [train.py:1046] (1/4) Epoch 42, batch 700, loss[loss=0.1559, simple_loss=0.2443, pruned_loss=0.03375, over 24654.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2353, pruned_loss=0.03817, over 4562608.30 frames. ], batch size: 65, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:12:54,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:12:59,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 00:13:00,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 00:13:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 00:13:05,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:05,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:13:06,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 00:13:12,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:13:14,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:13:15,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1456720.0, ans=0.1 2023-10-04 00:13:16,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:17,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:13:19,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 00:13:20,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:23,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 00:13:23,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:13:24,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 00:13:26,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1456786.6666666667, ans=0.125 2023-10-04 00:13:28,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 00:13:32,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:13:32,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:13:35,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:13:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:13:39,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 00:13:43,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:13:43,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1456853.3333333333, ans=0.05 2023-10-04 00:13:45,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:13:45,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 00:13:49,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:13:50,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:13:52,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:13:55,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.00 vs. limit=22.5 2023-10-04 00:13:57,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:13:57,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 00:14:01,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 00:14:01,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 00:14:04,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:14:05,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:06,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:14:06,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:06,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 00:14:10,010 INFO [train.py:1046] (1/4) Epoch 42, batch 750, loss[loss=0.1615, simple_loss=0.2377, pruned_loss=0.04266, over 23817.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.235, pruned_loss=0.03801, over 4592286.17 frames. ], batch size: 179, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:14:11,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 00:14:11,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 00:14:11,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 00:14:11,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 00:14:11,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. 
Duration: 0.64 2023-10-04 00:14:11,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1456986.6666666667, ans=0.125 2023-10-04 00:14:13,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:14:13,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 00:14:14,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:16,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:14:16,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:18,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:14:19,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:14:20,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:20,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1456986.6666666667, ans=0.125 2023-10-04 00:14:20,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1456986.6666666667, ans=0.0 2023-10-04 00:14:23,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 00:14:23,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:14:25,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:14:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:27,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:14:28,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 00:14:30,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1457053.3333333333, ans=0.125 2023-10-04 00:14:31,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:14:31,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:14:32,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:14:32,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1457053.3333333333, ans=0.125 2023-10-04 00:14:34,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 00:14:34,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 00:14:34,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:14:35,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1457053.3333333333, ans=0.125 2023-10-04 00:14:36,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 00:14:36,951 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 00:14:38,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 00:14:38,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:14:38,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:14:40,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:14:43,759 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:14:45,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.73 vs. limit=12.0 2023-10-04 00:14:46,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 00:14:46,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:14:46,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1457120.0, ans=0.0 2023-10-04 00:14:47,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:14:48,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:50,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:50,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 00:14:52,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:14:52,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 00:14:53,510 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 1.922e+02 2.060e+02 2.309e+02 3.255e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-04 00:14:53,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:14:53,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1457186.6666666667, ans=0.2 2023-10-04 00:14:55,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 00:14:55,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 00:14:56,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:02,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:02,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:15:03,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:06,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:15:09,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 00:15:09,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:15:11,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:13,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:13,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:16,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:15:16,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:15:24,721 INFO [train.py:1046] (1/4) Epoch 42, batch 800, loss[loss=0.1656, simple_loss=0.2402, pruned_loss=0.04547, over 23636.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2364, pruned_loss=0.03868, over 4609766.14 frames. ], batch size: 232, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:15:26,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:26,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:26,436 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:15:29,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:15:29,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:30,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:31,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:33,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:36,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:15:36,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:15:36,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1457320.0, ans=0.125 2023-10-04 00:15:36,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1457320.0, ans=0.125 2023-10-04 00:15:40,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 00:15:40,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:42,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:42,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:15:42,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:15:44,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 00:15:44,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:44,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 00:15:47,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:15:49,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:51,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:52,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:15:56,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:56,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:00,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:16:01,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:16:01,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 00:16:03,679 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 00:16:04,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 00:16:04,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:16:04,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:16:06,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:06,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:16:11,957 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 00:16:13,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 00:16:15,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:16:16,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:16:17,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1457520.0, ans=0.09899494936611666 2023-10-04 00:16:20,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:16:23,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:25,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 00:16:27,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:16:29,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-10-04 00:16:29,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1457586.6666666667, ans=0.025 2023-10-04 00:16:35,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:16:36,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:16:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 00:16:38,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:16:38,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:38,700 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=12.0 2023-10-04 00:16:39,410 INFO [train.py:1046] (1/4) Epoch 42, batch 850, loss[loss=0.1553, simple_loss=0.2366, pruned_loss=0.03695, over 24031.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03873, over 4631930.32 frames. ], batch size: 80, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:16:39,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 00:16:39,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:40,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:16:40,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:16:43,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:16:43,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1457653.3333333333, ans=0.125 2023-10-04 00:16:45,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:16:45,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 00:16:46,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 00:16:46,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 00:16:46,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:16:48,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:16:49,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:16:50,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:51,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 00:16:55,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:56,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:57,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 00:16:57,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1457720.0, ans=0.1 2023-10-04 00:17:00,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 00:17:01,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:17:03,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 00:17:05,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 00:17:07,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 00:17:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 00:17:10,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:17:10,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:17:10,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:17:10,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1457786.6666666667, ans=0.125 2023-10-04 00:17:12,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:14,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:14,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 00:17:16,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1457786.6666666667, ans=0.125 2023-10-04 00:17:17,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:17:18,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1457786.6666666667, ans=0.2 2023-10-04 00:17:19,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:17:20,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 00:17:21,767 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.961e+02 2.256e+02 2.503e+02 3.661e+02, threshold=4.513e+02, percent-clipped=0.0 2023-10-04 00:17:21,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:17:23,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:17:24,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:17:24,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 00:17:28,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:17:28,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:17:29,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:17:29,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:17:29,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:17:31,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:17:34,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:17:35,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:17:35,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:17:37,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:17:45,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:17:47,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:17:47,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 00:17:47,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:17:47,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:17:50,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 00:17:52,886 INFO [train.py:1046] (1/4) Epoch 42, batch 900, loss[loss=0.1612, simple_loss=0.2315, pruned_loss=0.04541, over 23826.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2378, pruned_loss=0.03901, over 4646005.08 frames. 
], batch size: 195, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:17:57,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:18:00,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:18:00,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 00:18:03,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:18:05,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 00:18:05,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 00:18:06,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:18:06,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:08,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:18:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 00:18:10,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1458053.3333333333, ans=0.0 2023-10-04 00:18:19,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:19,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:18:19,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:18:20,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:23,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1458120.0, ans=10.0 2023-10-04 00:18:25,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 00:18:28,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:18:32,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:18:34,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:18:36,041 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 00:18:36,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-10-04 00:18:42,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:18:42,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:18:44,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:18:49,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:49,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:18:51,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 00:18:51,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:54,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 00:18:54,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1458253.3333333333, ans=0.1 2023-10-04 00:18:55,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:18:55,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:57,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 00:18:57,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:00,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 00:19:00,177 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 00:19:03,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:19:03,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 00:19:06,348 INFO [train.py:1046] (1/4) Epoch 42, batch 950, loss[loss=0.1558, simple_loss=0.2371, pruned_loss=0.03723, over 23324.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2384, pruned_loss=0.03929, over 4654690.89 frames. ], batch size: 119, lr: 2.43e-03, grad_scale: 4.0 2023-10-04 00:19:06,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:19:09,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 00:19:14,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:15,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1458320.0, ans=0.125 2023-10-04 00:19:16,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:19:16,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:16,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:19:19,662 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 00:19:23,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:24,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:19:24,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:24,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:19:24,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 00:19:25,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1458386.6666666667, ans=0.1 2023-10-04 00:19:26,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:19:28,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:29,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-10-04 00:19:30,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:19:36,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:36,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:19:37,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:19:37,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 00:19:37,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1458453.3333333333, ans=0.125 2023-10-04 00:19:40,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 00:19:41,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:19:43,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:19:47,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:19:47,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:50,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. 
Duration: 0.47 2023-10-04 00:19:51,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.895e+02 2.094e+02 2.430e+02 3.415e+02, threshold=4.187e+02, percent-clipped=0.0 2023-10-04 00:19:52,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 00:19:52,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:19:52,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:19:54,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:54,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:19:57,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 00:19:59,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:20:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:02,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:20:02,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 00:20:02,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:20:02,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:20:03,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 00:20:08,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:20:10,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:20:13,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:20:15,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 00:20:15,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 00:20:17,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:20:18,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1458653.3333333333, ans=0.1 2023-10-04 00:20:19,153 INFO [train.py:1046] (1/4) Epoch 42, batch 1000, loss[loss=0.1547, simple_loss=0.2445, pruned_loss=0.03246, over 24568.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03917, over 4662638.68 frames. ], batch size: 71, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:20:22,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-04 00:20:23,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:20:28,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:20:29,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 00:20:29,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 00:20:34,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:20:34,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:20:35,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:35,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1458720.0, ans=0.125 2023-10-04 00:20:38,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 00:20:39,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 00:20:42,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 00:20:42,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:20:44,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 00:20:45,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 00:20:45,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 00:20:46,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:20:48,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:20:55,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1458786.6666666667, ans=0.1 2023-10-04 00:20:56,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:57,291 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=10.95 vs. limit=22.5 2023-10-04 00:20:58,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:21:00,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:00,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:21:00,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 00:21:00,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:21:02,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:21:03,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:21:03,342 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 00:21:06,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 00:21:08,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 00:21:09,782 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.93 vs. limit=15.0 2023-10-04 00:21:10,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 00:21:11,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:21:13,773 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=22.5 2023-10-04 00:21:17,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 00:21:17,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:21:17,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:17,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1458920.0, ans=0.0 2023-10-04 00:21:19,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:21:19,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1458920.0, ans=0.125 2023-10-04 00:21:20,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1458920.0, ans=0.07 2023-10-04 00:21:21,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 00:21:22,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:21:22,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 00:21:22,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1458920.0, ans=0.1 2023-10-04 00:21:23,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 00:21:24,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:21:24,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:21:26,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:21:28,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:21:30,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:21:32,048 INFO [train.py:1046] (1/4) Epoch 42, batch 1050, loss[loss=0.1595, simple_loss=0.2285, pruned_loss=0.04528, over 23511.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2349, pruned_loss=0.03868, over 4667071.26 frames. ], batch size: 285, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:21:35,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:21:35,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:21:38,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:21:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:41,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:21:42,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 00:21:44,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:21:46,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:21:48,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:21:48,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:21:49,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:21:49,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 00:21:51,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:21:51,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 00:21:52,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:21:53,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 00:21:53,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 00:21:55,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1459053.3333333333, ans=0.125 2023-10-04 00:21:58,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:58,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:22:00,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:22:01,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 00:22:01,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 00:22:01,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:22:05,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 00:22:08,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1459120.0, ans=0.0 2023-10-04 00:22:09,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 00:22:09,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-04 00:22:13,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1459120.0, ans=0.125 2023-10-04 00:22:14,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1459120.0, ans=0.0 2023-10-04 00:22:15,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:22:15,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:22:15,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:22:18,167 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.961e+02 2.126e+02 2.339e+02 3.298e+02, threshold=4.252e+02, percent-clipped=0.0 2023-10-04 00:22:19,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:22:21,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.46 vs. limit=22.5 2023-10-04 00:22:22,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 00:22:23,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 00:22:24,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 00:22:25,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 00:22:25,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:22:26,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 00:22:26,566 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.68 vs. limit=22.5 2023-10-04 00:22:28,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:22:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:22:30,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:22:31,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:22:32,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:36,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1459253.3333333333, ans=0.125 2023-10-04 00:22:37,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:37,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 00:22:38,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 00:22:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 00:22:40,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 00:22:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:22:43,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1459253.3333333333, ans=0.0 2023-10-04 00:22:44,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:22:46,935 INFO [train.py:1046] (1/4) Epoch 42, batch 1100, loss[loss=0.162, simple_loss=0.2486, pruned_loss=0.03765, over 24470.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.234, pruned_loss=0.03828, over 4663627.62 frames. ], batch size: 77, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:22:48,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:22:52,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:22:52,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:22:52,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:22:54,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. 
Duration: 0.76 2023-10-04 00:22:54,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1459320.0, ans=0.1 2023-10-04 00:22:55,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:22:59,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:23:00,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:23:02,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1459386.6666666667, ans=0.1 2023-10-04 00:23:03,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:23:03,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 00:23:03,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:23:05,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:23:05,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:23:08,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:23:11,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 00:23:15,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:23:18,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 00:23:18,131 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 00:23:18,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1459453.3333333333, ans=0.0 2023-10-04 00:23:19,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:19,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:21,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:23:21,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:23:22,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 00:23:24,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:23:24,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:23:24,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:23:24,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:24,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 00:23:26,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1459453.3333333333, ans=0.125 2023-10-04 00:23:31,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:23:33,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 00:23:34,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:23:40,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:23:43,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 00:23:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 00:23:45,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:47,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:23:47,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:23:49,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 00:23:49,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1459586.6666666667, ans=0.2 2023-10-04 00:23:50,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:23:50,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:23:50,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1459586.6666666667, ans=0.0 2023-10-04 00:23:52,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 00:23:52,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:23:52,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 00:23:53,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:23:53,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:23:55,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:24:00,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:24:01,477 INFO [train.py:1046] (1/4) Epoch 42, batch 1150, loss[loss=0.1553, simple_loss=0.2372, pruned_loss=0.03666, over 23610.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2353, pruned_loss=0.03854, over 4665850.77 frames. ], batch size: 256, lr: 2.43e-03, grad_scale: 4.0 2023-10-04 00:24:01,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:24:03,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:24:03,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:24:04,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 00:24:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:24:07,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 00:24:08,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-10-04 00:24:09,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:09,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:24:15,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-10-04 00:24:16,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:24:20,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:20,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:21,520 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.81 vs. limit=15.0 2023-10-04 00:24:22,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 00:24:22,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:24:22,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:24:25,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 00:24:25,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:24:27,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:24:36,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:24:41,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.19 vs. limit=10.0 2023-10-04 00:24:42,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:42,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 00:24:42,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:43,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:46,362 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 00:24:48,977 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.006e+02 2.296e+02 2.643e+02 4.791e+02, threshold=4.591e+02, percent-clipped=2.0 2023-10-04 00:24:49,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:55,676 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 00:24:59,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:01,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:25:01,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 00:25:02,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:25:06,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:25:11,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:25:13,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:25:14,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:14,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:14,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:25:15,932 INFO [train.py:1046] (1/4) Epoch 42, batch 1200, loss[loss=0.1794, simple_loss=0.264, pruned_loss=0.04742, over 23697.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2363, pruned_loss=0.03892, over 4674388.14 frames. ], batch size: 85, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:25:17,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:25:19,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:25:20,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:25:20,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:25:23,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 00:25:24,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 00:25:26,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:25:30,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:25:32,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:35,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:25:35,521 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 00:25:36,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:25:43,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:25:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. 
Duration: 0.94 2023-10-04 00:25:45,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:25:47,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 00:25:51,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 00:25:51,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:52,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1460120.0, ans=0.125 2023-10-04 00:25:53,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:25:54,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:25:56,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:25:58,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:58,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:25:58,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:25:58,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. 
Duration: 0.88 2023-10-04 00:25:58,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1460120.0, ans=0.05 2023-10-04 00:26:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:26:01,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:26:01,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:26:02,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:26:02,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:26:08,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:26:09,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:26:11,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 00:26:15,718 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 00:26:16,360 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.62 vs. limit=15.0 2023-10-04 00:26:18,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:26:21,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:26:22,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:26:23,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.95 vs. limit=15.0 2023-10-04 00:26:24,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:26:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 00:26:29,721 INFO [train.py:1046] (1/4) Epoch 42, batch 1250, loss[loss=0.1607, simple_loss=0.2512, pruned_loss=0.03516, over 24093.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2367, pruned_loss=0.03883, over 4690032.34 frames. ], batch size: 80, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:26:31,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:26:33,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:26:35,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 00:26:36,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:26:36,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 00:26:36,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1460320.0, ans=0.0 2023-10-04 00:26:40,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:26:42,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:26:44,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:26:44,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:26:46,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:26:48,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 00:26:48,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:26:48,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:26:49,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.18 vs. limit=15.0 2023-10-04 00:26:51,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:26:51,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:26:55,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:26:55,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:27:01,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 00:27:01,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:27:04,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:27:06,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 00:27:06,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:27:06,597 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 00:27:06,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:06,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:09,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:27:12,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:27:14,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:27:15,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 00:27:15,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1460520.0, ans=0.0 2023-10-04 00:27:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 00:27:17,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 00:27:18,735 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.951e+02 2.115e+02 2.289e+02 3.132e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-04 00:27:21,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:27:23,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 00:27:23,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:25,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 00:27:25,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:27:27,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. 
Duration: 0.98 2023-10-04 00:27:27,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:27:28,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:27:28,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:27:28,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:27:30,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 00:27:32,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:27:33,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:27:34,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:27:37,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:27:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:27:42,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 00:27:45,170 INFO [train.py:1046] (1/4) Epoch 42, batch 1300, loss[loss=0.1436, simple_loss=0.2134, pruned_loss=0.0369, over 23596.00 frames. 
], tot_loss[loss=0.1568, simple_loss=0.2367, pruned_loss=0.03848, over 4704204.27 frames. ], batch size: 256, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:27:46,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:27:46,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1460653.3333333333, ans=0.125 2023-10-04 00:27:47,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:27:48,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:27:51,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:52,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:27:54,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 00:27:57,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:27:57,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1460653.3333333333, ans=0.0 2023-10-04 00:27:58,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:27:59,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. 
Duration: 0.88 2023-10-04 00:28:04,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:28:08,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:08,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1460720.0, ans=0.125 2023-10-04 00:28:10,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:28:11,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:28:11,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:12,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:28:12,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:28:14,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-10-04 00:28:17,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1460786.6666666667, ans=0.1 2023-10-04 00:28:18,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1460786.6666666667, ans=0.0 2023-10-04 00:28:20,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:28:21,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:28:22,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 00:28:24,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:28:24,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:28:27,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:28:28,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 00:28:28,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:28:30,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-10-04 00:28:31,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:28:35,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:28:35,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:28:36,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1460853.3333333333, ans=0.125 2023-10-04 00:28:37,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 00:28:39,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 00:28:42,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 00:28:45,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:28:48,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 00:28:48,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1460920.0, ans=0.0 2023-10-04 00:28:50,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1460920.0, ans=0.125 2023-10-04 00:28:51,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 00:28:56,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 00:28:58,616 INFO [train.py:1046] (1/4) Epoch 42, batch 1350, loss[loss=0.1564, simple_loss=0.2415, pruned_loss=0.03563, over 24495.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2351, pruned_loss=0.03794, over 4700817.34 frames. ], batch size: 66, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:29:00,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:02,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:03,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1460986.6666666667, ans=0.1 2023-10-04 00:29:05,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:29:05,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:08,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:29:09,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:29:15,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:29:17,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. 
Duration: 0.79 2023-10-04 00:29:17,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:29:19,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:29:20,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 00:29:20,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:29:22,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:29:22,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 00:29:23,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 00:29:26,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 00:29:27,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:27,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 00:29:33,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1461120.0, ans=0.125 2023-10-04 00:29:39,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:29:46,270 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.991e+02 2.237e+02 2.537e+02 4.042e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-04 00:29:47,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:49,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:29:49,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 00:29:51,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:29:51,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 00:29:51,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:29:53,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:54,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:29:56,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 00:29:57,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 00:30:01,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1461253.3333333333, ans=0.125 2023-10-04 00:30:02,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1461253.3333333333, ans=0.125 2023-10-04 00:30:03,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 00:30:04,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 00:30:06,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1461253.3333333333, ans=0.125 2023-10-04 00:30:08,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1461253.3333333333, ans=0.125 2023-10-04 00:30:09,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 00:30:10,187 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.62 vs. limit=15.0 2023-10-04 00:30:11,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:30:12,372 INFO [train.py:1046] (1/4) Epoch 42, batch 1400, loss[loss=0.1696, simple_loss=0.253, pruned_loss=0.04312, over 24329.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2347, pruned_loss=0.03778, over 4715072.62 frames. 
], batch size: 77, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:30:16,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:30:16,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:30:19,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 00:30:20,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 00:30:29,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:30:30,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:30:33,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:30:33,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:30:38,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:30:38,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 00:30:46,283 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:30:47,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:30:47,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:30:50,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1461453.3333333333, ans=0.1 2023-10-04 00:30:51,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 00:30:52,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:30:52,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:30:54,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:30:54,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:30:55,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:30:55,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:30:57,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:30:57,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 00:30:58,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 00:31:03,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:07,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:31:09,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=1461520.0, ans=0.02 2023-10-04 00:31:14,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 00:31:15,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:31:18,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:31:18,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1461586.6666666667, ans=0.125 2023-10-04 00:31:19,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 00:31:21,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:23,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:31:26,682 INFO [train.py:1046] (1/4) Epoch 42, batch 1450, loss[loss=0.1711, simple_loss=0.2528, pruned_loss=0.04474, over 23737.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2344, pruned_loss=0.03787, over 4709576.86 frames. 
], batch size: 85, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:31:26,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:31:28,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:31:28,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:28,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 00:31:33,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:34,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:31:35,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:31:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 00:31:37,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:31:37,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 00:31:39,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:31:40,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:40,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 00:31:41,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:31:41,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:31:43,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 00:31:43,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:45,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:31:46,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:48,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:53,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:31:53,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:31:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:31:55,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:58,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:31:58,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:58,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:02,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 00:32:03,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:32:09,644 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 00:32:11,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:32:13,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:32:14,444 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.045e+02 2.411e+02 2.943e+02 4.436e+02, threshold=4.821e+02, percent-clipped=0.0 2023-10-04 00:32:14,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:32:14,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 00:32:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:20,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 00:32:22,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 00:32:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:26,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:32:26,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:32:27,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 00:32:30,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 00:32:30,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 00:32:30,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:31,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 00:32:40,508 INFO [train.py:1046] (1/4) Epoch 42, batch 1500, loss[loss=0.1395, simple_loss=0.2189, pruned_loss=0.03004, over 24443.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2348, pruned_loss=0.03804, over 4708123.70 frames. ], batch size: 58, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:32:43,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 00:32:44,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:32:44,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:32:46,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:46,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:32:48,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:32:48,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 00:32:48,673 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.64 vs. limit=15.0 2023-10-04 00:32:49,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:32:49,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 00:32:49,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:32:51,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:32:54,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:32:55,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:32:57,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1462053.3333333333, ans=0.0 2023-10-04 00:33:02,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:02,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 00:33:02,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:33:02,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:33:03,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:33:06,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 00:33:12,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. 
Duration: 0.76 2023-10-04 00:33:14,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:33:14,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 00:33:16,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:33:18,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:33:18,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:33:18,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:33:22,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 00:33:22,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:33:22,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:33:23,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 00:33:23,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:33:29,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 00:33:29,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 00:33:34,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:33:36,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:33:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 00:33:40,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:40,453 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 00:33:41,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:33:43,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:33:43,697 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 00:33:45,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:33:47,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 00:33:48,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:33:52,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:52,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:52,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:53,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:54,246 INFO [train.py:1046] (1/4) Epoch 42, batch 1550, loss[loss=0.1578, simple_loss=0.2385, pruned_loss=0.03856, over 23659.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2356, pruned_loss=0.0383, over 4704653.89 frames. ], batch size: 149, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:33:54,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:33:55,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 00:33:55,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 00:33:57,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:33:57,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 00:33:59,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-10-04 00:34:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:34:02,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:02,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:34:02,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:34:02,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:04,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:07,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 00:34:07,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:07,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:34:07,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-10-04 00:34:08,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 00:34:08,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1462386.6666666667, ans=0.125 2023-10-04 00:34:10,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:34:10,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 00:34:11,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:34:11,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 00:34:13,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 00:34:13,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 00:34:14,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:14,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:20,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:34:21,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 00:34:21,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-04 00:34:24,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1462453.3333333333, ans=0.0 2023-10-04 00:34:25,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1462453.3333333333, ans=0.1 2023-10-04 00:34:30,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:34,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:34:34,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:34:34,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:34:35,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 00:34:41,000 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.990e+02 2.197e+02 2.410e+02 4.079e+02, threshold=4.394e+02, percent-clipped=0.0 2023-10-04 00:34:42,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:34:42,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:46,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:34:47,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:34:48,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:48,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 00:34:50,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:34:52,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:34:52,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:53,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 00:34:53,511 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 00:34:56,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:00,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1462586.6666666667, ans=0.125 2023-10-04 00:35:01,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 00:35:05,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:35:06,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:35:07,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 00:35:08,286 INFO [train.py:1046] (1/4) Epoch 42, batch 1600, loss[loss=0.1765, simple_loss=0.2643, pruned_loss=0.04435, over 23964.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.236, pruned_loss=0.03827, over 4712003.69 frames. ], batch size: 86, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:35:08,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:35:09,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:35:09,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:35:09,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:35:11,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:35:15,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:15,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 00:35:17,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. 
Duration: 0.92 2023-10-04 00:35:17,866 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:35:19,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 00:35:22,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:35:23,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 00:35:23,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:35:27,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:35:30,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1462720.0, ans=0.125 2023-10-04 00:35:31,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:35:31,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1462720.0, ans=0.07 2023-10-04 00:35:35,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 00:35:38,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:35:38,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. 
Duration: 0.83 2023-10-04 00:35:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:39,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 00:35:45,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 00:35:47,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1462786.6666666667, ans=0.125 2023-10-04 00:35:51,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:51,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 00:35:53,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:53,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:35:53,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:35:55,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-04 00:35:55,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1462853.3333333333, ans=0.125 2023-10-04 00:36:01,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 00:36:02,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:36:02,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:04,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:04,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:36:07,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:36:07,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:36:07,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1462920.0, ans=0.125 2023-10-04 00:36:07,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1462920.0, ans=0.125 2023-10-04 00:36:08,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 00:36:14,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:15,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:36:17,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 00:36:17,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:36:18,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 00:36:21,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1462986.6666666667, ans=0.125 2023-10-04 00:36:22,921 INFO [train.py:1046] (1/4) Epoch 42, batch 1650, loss[loss=0.153, simple_loss=0.2472, pruned_loss=0.02939, over 24653.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2373, pruned_loss=0.03866, over 4704919.22 frames. ], batch size: 68, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:36:23,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:36:24,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:36:25,725 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.22 vs. limit=22.5 2023-10-04 00:36:26,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:36:26,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 00:36:26,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 00:36:26,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 00:36:26,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 00:36:30,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:32,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:36:34,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:36:34,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:36:35,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:36:36,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 00:36:39,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:36:39,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:36:39,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:36:39,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:36:41,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 00:36:41,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 00:36:46,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:36:48,178 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:36:48,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1463053.3333333333, ans=0.125 2023-10-04 00:36:49,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:36:57,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 00:36:58,092 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.49 vs. limit=10.0 2023-10-04 00:36:58,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:00,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-10-04 00:37:01,047 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:37:04,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:06,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:37:06,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:37:06,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:08,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:37:08,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:10,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-10-04 00:37:11,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:11,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.65 vs. 
limit=15.0 2023-10-04 00:37:12,339 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.974e+02 2.211e+02 2.480e+02 3.454e+02, threshold=4.423e+02, percent-clipped=0.0 2023-10-04 00:37:12,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:37:12,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:37:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:37:15,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:37:16,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:37:17,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 00:37:18,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:37:20,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 00:37:20,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1463186.6666666667, ans=0.125 2023-10-04 00:37:21,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 00:37:21,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. 
Duration: 0.7 2023-10-04 00:37:21,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:37:21,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:37:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:24,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:24,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 00:37:28,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:30,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:37:30,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:32,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 00:37:38,123 INFO [train.py:1046] (1/4) Epoch 42, batch 1700, loss[loss=0.1516, simple_loss=0.2075, pruned_loss=0.04786, over 19601.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2362, pruned_loss=0.03838, over 4706790.01 frames. ], batch size: 388, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:37:38,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 00:37:38,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:37:38,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 00:37:39,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:37:39,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:37:39,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:40,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.51 vs. limit=22.5 2023-10-04 00:37:43,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:37:43,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:37:43,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 00:37:45,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 00:37:46,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1463320.0, ans=0.05 2023-10-04 00:37:51,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:54,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:38:00,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:38:01,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:38:01,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:38:01,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:38:04,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 00:38:06,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:38:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:07,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:38:07,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 00:38:10,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 00:38:10,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 00:38:12,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:13,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 00:38:13,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1463453.3333333333, ans=0.1 2023-10-04 00:38:14,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:38:21,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:21,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:22,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:38:25,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:38:25,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 00:38:25,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 00:38:28,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:28,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 00:38:29,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:38:29,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:38:29,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:29,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:38:34,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:38:34,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:38:35,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:37,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:38:37,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:41,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:38:42,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 00:38:44,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:45,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:38:47,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 00:38:50,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1463653.3333333333, ans=0.125 2023-10-04 00:38:52,290 INFO [train.py:1046] (1/4) Epoch 42, batch 1750, loss[loss=0.1428, simple_loss=0.2198, pruned_loss=0.03289, over 24426.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.235, pruned_loss=0.03799, over 4702976.59 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:38:53,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.24 vs. limit=15.0 2023-10-04 00:38:55,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:56,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:38:56,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:38:58,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-04 00:38:59,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:39:01,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.86 vs. limit=15.0 2023-10-04 00:39:02,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:39:02,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:06,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 00:39:08,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 00:39:09,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:39:11,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:39:12,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1463720.0, ans=0.0 2023-10-04 00:39:13,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:39:16,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. 
Duration: 0.43 2023-10-04 00:39:17,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:39:18,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 00:39:26,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1463786.6666666667, ans=0.1 2023-10-04 00:39:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:39:29,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:39:30,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:39:32,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:33,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:39:35,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:39:37,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:39,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 00:39:40,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1463853.3333333333, ans=0.1 2023-10-04 00:39:41,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:39:42,565 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.890e+02 2.169e+02 2.400e+02 4.108e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-04 00:39:42,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 00:39:43,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:39:45,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 00:39:46,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:39:48,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:49,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:39:53,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:39:53,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. 
Duration: 0.436375 2023-10-04 00:39:53,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:54,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:39:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:59,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:39:59,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:39:59,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1463920.0, ans=0.0 2023-10-04 00:40:00,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 00:40:00,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:40:02,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:40:04,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:04,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:40:04,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 00:40:04,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:40:06,913 INFO [train.py:1046] (1/4) Epoch 42, batch 1800, loss[loss=0.1477, simple_loss=0.2288, pruned_loss=0.03335, over 23560.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2344, pruned_loss=0.03785, over 4703566.29 frames. ], batch size: 149, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:40:07,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:40:08,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:40:10,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:40:12,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:40:14,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:40:15,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:40:17,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:40:20,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:20,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:40:21,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:40:25,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:40:25,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 00:40:25,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:28,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:31,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 00:40:33,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 00:40:33,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 00:40:33,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:40:36,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:36,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:40:36,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 00:40:40,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1464120.0, ans=0.125 2023-10-04 00:40:42,902 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 00:40:43,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:40:45,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:48,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 00:40:50,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 00:40:50,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:40:51,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:40:53,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:40:53,286 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:40:55,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. 
Duration: 0.67 2023-10-04 00:40:56,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1464186.6666666667, ans=15.0 2023-10-04 00:41:04,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:41:05,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 00:41:06,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:41:06,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:41:07,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1464253.3333333333, ans=0.2 2023-10-04 00:41:08,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:41:08,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 00:41:11,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:41:11,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:41:12,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 00:41:12,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:41:13,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.76 vs. limit=10.0 2023-10-04 00:41:15,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:41:15,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:41:15,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:41:17,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:41:19,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:41:21,133 INFO [train.py:1046] (1/4) Epoch 42, batch 1850, loss[loss=0.1764, simple_loss=0.2559, pruned_loss=0.04839, over 23375.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2351, pruned_loss=0.0382, over 4687133.73 frames. ], batch size: 93, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:41:21,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:41:21,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:41:22,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1464320.0, ans=0.125 2023-10-04 00:41:23,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 00:41:24,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:41:24,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1464320.0, ans=0.1 2023-10-04 00:41:27,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1464320.0, ans=6.0 2023-10-04 00:41:30,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:41:31,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 00:41:36,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1464386.6666666667, ans=0.1 2023-10-04 00:41:38,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 00:41:40,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 00:41:43,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:41:43,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 00:41:43,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-04 00:41:46,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1464386.6666666667, ans=0.125 2023-10-04 00:41:54,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:41:55,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 00:41:58,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:41:58,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:41:58,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1464453.3333333333, ans=0.125 2023-10-04 00:42:02,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 00:42:02,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.96 vs. limit=15.0 2023-10-04 00:42:03,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:03,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:42:05,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 00:42:07,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:42:10,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:42:12,390 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.933e+02 2.158e+02 2.386e+02 3.653e+02, threshold=4.316e+02, percent-clipped=0.0 2023-10-04 00:42:13,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:42:13,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:15,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:42:15,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:15,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1464520.0, ans=0.125 2023-10-04 00:42:16,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:42:18,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:42:20,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-10-04 00:42:22,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:42:25,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:42:26,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:42:26,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 00:42:26,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 00:42:28,233 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 00:42:29,599 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 00:42:31,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:42:31,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:42:31,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:42:32,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:32,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. 
Duration: 0.896 2023-10-04 00:42:32,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:42:32,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:32,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1464586.6666666667, ans=0.125 2023-10-04 00:42:33,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:42:35,116 INFO [train.py:1046] (1/4) Epoch 42, batch 1900, loss[loss=0.1683, simple_loss=0.2424, pruned_loss=0.04712, over 23574.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2352, pruned_loss=0.03804, over 4692868.20 frames. ], batch size: 256, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:42:36,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:42:36,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:42:36,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 00:42:37,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:37,995 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 00:42:39,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 00:42:39,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:42,043 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.79 vs. limit=10.0 2023-10-04 00:42:45,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:48,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:42:48,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 00:42:48,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 00:42:50,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:42:51,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:42:51,559 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 00:42:51,602 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 00:42:56,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 00:42:56,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 00:42:56,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1464720.0, ans=0.0 2023-10-04 00:42:58,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 00:42:59,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1464720.0, ans=0.1 2023-10-04 00:43:00,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 00:43:05,511 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.65 vs. limit=15.0 2023-10-04 00:43:08,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1464786.6666666667, ans=0.125 2023-10-04 00:43:13,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 00:43:17,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 00:43:17,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:43:17,203 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 00:43:17,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 00:43:17,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. 
Duration: 0.98 2023-10-04 00:43:18,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 00:43:18,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:43:22,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 00:43:27,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:43:28,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:43:28,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 00:43:30,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:43:34,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 00:43:35,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:43:40,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:43:40,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:43:40,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:43:41,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:43:43,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:43:43,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:43:45,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:43:47,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:43:47,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:43:49,770 INFO [train.py:1046] (1/4) Epoch 42, batch 1950, loss[loss=0.1444, simple_loss=0.2304, pruned_loss=0.02921, over 24528.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.03837, over 4697817.24 frames. ], batch size: 63, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:43:49,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:43:49,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:43:49,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:43:52,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:43:55,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:43:56,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1464986.6666666667, ans=0.1 2023-10-04 00:43:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:43:57,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:43:57,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:43:58,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 00:44:00,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:44:00,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:02,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:05,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:44:05,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:44:06,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:06,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:44:09,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:44:09,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:44:09,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:44:09,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:14,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:15,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.21 vs. limit=15.0 2023-10-04 00:44:18,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:44:18,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:18,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:44:18,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-10-04 00:44:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:44:19,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:44:19,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:21,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1465120.0, ans=0.0 2023-10-04 00:44:22,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:25,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:44:27,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.63 vs. limit=12.0 2023-10-04 00:44:28,027 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.78 vs. limit=6.0 2023-10-04 00:44:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:44:31,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:44:31,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 00:44:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 00:44:31,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1465120.0, ans=0.125 2023-10-04 00:44:31,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1465120.0, ans=0.125 2023-10-04 00:44:32,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:44:36,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:44:38,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:44:39,323 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 1.957e+02 2.205e+02 2.513e+02 3.289e+02, threshold=4.410e+02, percent-clipped=0.0 2023-10-04 00:44:39,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:44:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:49,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:50,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:44:52,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:54,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:44:55,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:56,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 00:44:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:44:58,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:59,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 00:45:01,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:45:03,803 INFO [train.py:1046] (1/4) Epoch 42, batch 2000, loss[loss=0.1579, simple_loss=0.2412, pruned_loss=0.03736, over 24434.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.237, pruned_loss=0.03826, over 4695318.84 frames. ], batch size: 77, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:45:03,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:45:05,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 00:45:05,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:45:05,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1465320.0, ans=0.125 2023-10-04 00:45:05,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1465320.0, ans=0.2 2023-10-04 00:45:06,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:45:09,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=12.0 2023-10-04 00:45:09,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:12,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 00:45:12,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:45:17,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:45:20,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 00:45:20,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:45:20,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 00:45:24,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:45:25,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 00:45:25,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:27,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:27,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:28,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 00:45:29,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:45:32,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 00:45:32,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:45:34,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:45:36,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:45:36,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:45:37,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:45:37,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:45:37,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 00:45:41,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 00:45:41,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:45:41,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:45:48,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:48,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:45:48,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:45:50,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:45:51,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:45:51,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:45:52,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:45:52,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:54,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:57,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:45:57,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 00:46:01,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:46:04,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:07,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:46:10,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:11,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:46:11,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:46:13,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:46:14,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:46:16,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:17,972 INFO [train.py:1046] (1/4) Epoch 42, batch 2050, loss[loss=0.1512, simple_loss=0.2327, pruned_loss=0.03483, over 24440.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2369, pruned_loss=0.03811, over 4692973.75 frames. ], batch size: 63, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:46:18,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:21,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:46:21,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:23,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1465653.3333333333, ans=0.0 2023-10-04 00:46:27,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:46:27,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1465653.3333333333, ans=0.09899494936611666 2023-10-04 00:46:28,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 00:46:28,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1465653.3333333333, ans=0.1 2023-10-04 00:46:29,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:30,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:46:31,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 00:46:31,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:46:34,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:46:34,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:46:34,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1465720.0, ans=0.1 2023-10-04 00:46:39,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1465720.0, ans=0.125 2023-10-04 00:46:44,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:46:44,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:46:48,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 00:46:49,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:51,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 00:46:51,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:46:52,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:46:55,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:46:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:46:57,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:46:58,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:47:00,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:47:00,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:47:03,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 00:47:03,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1465853.3333333333, ans=0.5 2023-10-04 00:47:06,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:47:06,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1465853.3333333333, ans=0.2 2023-10-04 00:47:08,812 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.968e+02 2.128e+02 2.407e+02 4.254e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-04 00:47:08,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:47:10,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:47:13,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:47:17,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:47:17,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 00:47:24,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:47:24,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 00:47:27,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:47:28,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 00:47:29,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1465920.0, ans=0.0 2023-10-04 00:47:33,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=12.0 2023-10-04 00:47:33,503 INFO [train.py:1046] (1/4) Epoch 42, batch 2100, loss[loss=0.1518, simple_loss=0.2293, pruned_loss=0.03711, over 24412.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2351, pruned_loss=0.03785, over 4692953.40 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:47:33,554 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 00:47:33,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:47:33,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:47:34,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:47:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:47:36,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. 
Duration: 0.55 2023-10-04 00:47:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 00:47:37,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:47:40,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:47:41,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:47:44,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:47:46,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:47:46,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 00:47:46,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:47:46,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 00:47:46,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 00:47:46,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1466053.3333333333, ans=0.2 2023-10-04 00:47:48,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:47:48,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:47:48,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 00:47:48,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 00:47:53,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 00:47:53,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:47:58,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:47:58,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:48:01,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:48:03,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 00:48:03,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:03,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:48:04,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. 
Duration: 0.87 2023-10-04 00:48:04,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:05,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 00:48:05,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 00:48:05,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 00:48:08,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:48:10,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:48:11,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:48:13,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:48:14,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:15,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1466186.6666666667, ans=0.0 2023-10-04 00:48:17,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:17,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. 
Duration: 0.84 2023-10-04 00:48:17,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:18,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:20,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:20,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 00:48:22,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 00:48:22,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 00:48:26,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:48:28,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:48:28,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 00:48:32,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.04 vs. limit=15.0 2023-10-04 00:48:34,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:37,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 00:48:37,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:48:37,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:48:37,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 00:48:38,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:48:39,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:39,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:48:39,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:48:41,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:42,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 00:48:44,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 00:48:44,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:48:46,747 INFO [train.py:1046] (1/4) Epoch 42, batch 2150, loss[loss=0.1377, simple_loss=0.1933, pruned_loss=0.04101, over 19338.00 frames. 
], tot_loss[loss=0.1544, simple_loss=0.2337, pruned_loss=0.03756, over 4691572.01 frames. ], batch size: 388, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:48:46,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:46,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:48:46,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:48:46,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:48:54,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:48:54,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:48:55,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:57,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:48:57,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:48:58,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:49:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:49:03,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:49:03,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:49:04,142 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.79 vs. limit=6.0 2023-10-04 00:49:06,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1466386.6666666667, ans=0.125 2023-10-04 00:49:07,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:07,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 00:49:08,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.83 vs. limit=15.0 2023-10-04 00:49:11,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:13,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:49:14,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:14,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:49:15,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:15,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:49:15,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:49:15,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:49:17,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:49:17,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 00:49:20,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:49:22,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:22,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:23,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:49:24,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:49:26,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 00:49:27,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:49:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:29,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 00:49:29,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:49:32,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:34,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:34,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:35,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:49:35,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:37,169 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.733e+02 1.971e+02 2.124e+02 2.467e+02 3.717e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-04 00:49:37,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:37,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. 
Duration: 0.71 2023-10-04 00:49:40,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 00:49:40,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:49:40,186 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 00:49:40,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:41,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:49:42,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 00:49:42,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:49:42,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 00:49:42,898 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 00:49:42,898 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 00:49:43,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1466520.0, ans=0.125 2023-10-04 00:49:43,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.05 vs. 
limit=22.5 2023-10-04 00:49:44,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 00:49:45,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:45,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:49:45,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:49:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:46,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.96 vs. limit=15.0 2023-10-04 00:49:47,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:49:48,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:48,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:50,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1466586.6666666667, ans=0.125 2023-10-04 00:49:57,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 00:49:57,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 00:50:00,807 INFO [train.py:1046] (1/4) Epoch 42, batch 2200, loss[loss=0.1744, simple_loss=0.2377, pruned_loss=0.05551, over 19314.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2344, pruned_loss=0.03783, over 4694608.22 frames. ], batch size: 388, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:50:02,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:50:07,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:07,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:50:09,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:09,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:50:10,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:50:10,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:50:10,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 00:50:17,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-10-04 00:50:20,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:50:20,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1466720.0, ans=0.05 2023-10-04 00:50:24,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 00:50:27,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:29,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:50:29,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:50:31,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1466786.6666666667, ans=0.0 2023-10-04 00:50:33,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:50:33,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 00:50:37,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:50:38,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:40,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. 
Duration: 0.4 2023-10-04 00:50:41,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:50:44,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:50:45,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:50:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:48,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 00:50:50,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 00:50:52,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:52,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:50:52,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:56,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:50:56,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 00:50:56,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:56,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:59,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:51:01,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:51:02,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:51:07,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:51:07,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:51:09,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:51:11,252 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 00:51:11,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:51:12,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 00:51:12,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. 
Duration: 0.67775 2023-10-04 00:51:12,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 00:51:14,417 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:51:15,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:51:15,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:51:16,806 INFO [train.py:1046] (1/4) Epoch 42, batch 2250, loss[loss=0.1639, simple_loss=0.2375, pruned_loss=0.04515, over 23877.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.235, pruned_loss=0.03824, over 4689183.39 frames. ], batch size: 195, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:51:18,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:51:19,581 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 00:51:20,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:51:22,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1466986.6666666667, ans=0.0 2023-10-04 00:51:23,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:51:29,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 00:51:31,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:51:33,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1467053.3333333333, ans=0.0 2023-10-04 00:51:33,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.46 vs. limit=22.5 2023-10-04 00:51:34,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:34,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:51:34,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1467053.3333333333, ans=0.125 2023-10-04 00:51:35,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:51:38,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 00:51:38,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:51:38,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:51:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. 
Duration: 0.71 2023-10-04 00:51:42,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:51:42,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:43,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:51:43,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1467053.3333333333, ans=0.125 2023-10-04 00:51:48,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:51:48,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 00:51:48,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:51:50,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 00:51:51,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:54,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=1467120.0, ans=22.5 2023-10-04 00:51:54,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:52:00,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:52:02,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:52:02,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1467186.6666666667, ans=0.1 2023-10-04 00:52:03,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:03,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:52:05,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:52:07,087 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.956e+02 2.105e+02 2.368e+02 2.905e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-04 00:52:08,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:52:08,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1467186.6666666667, ans=0.0 2023-10-04 00:52:09,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=12.0 2023-10-04 00:52:11,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 00:52:13,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:52:17,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:52:17,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:52:17,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:52:23,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 00:52:25,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:52:25,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 00:52:25,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:25,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:52:28,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 00:52:30,573 INFO [train.py:1046] (1/4) Epoch 42, batch 2300, loss[loss=0.163, simple_loss=0.2526, pruned_loss=0.0367, over 24344.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2359, pruned_loss=0.03842, over 4695861.15 frames. 
], batch size: 74, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:52:33,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:52:35,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:40,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:41,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:52:43,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 00:52:44,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:45,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=15.0 2023-10-04 00:52:48,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1467386.6666666667, ans=0.125 2023-10-04 00:52:50,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:52:51,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:52:51,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 00:52:51,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:51,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 00:52:52,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:52:54,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:52:54,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:52:57,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:52:59,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:53:03,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:53:07,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:53:07,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:53:11,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:53:15,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 00:53:17,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:53:19,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:53:20,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:53:20,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 00:53:23,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 00:53:23,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:53:24,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:53:24,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:53:26,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:53:26,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1467520.0, ans=0.2 2023-10-04 00:53:27,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 00:53:27,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 00:53:27,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 00:53:27,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:53:27,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:53:28,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 00:53:35,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:53:38,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:53:42,046 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.91 vs. limit=15.0 2023-10-04 00:53:42,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:53:42,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:53:42,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:53:44,434 INFO [train.py:1046] (1/4) Epoch 42, batch 2350, loss[loss=0.1677, simple_loss=0.2491, pruned_loss=0.04321, over 23288.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03835, over 4701259.24 frames. 
], batch size: 93, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:53:44,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:53:44,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:53:44,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:53:45,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 00:53:52,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:53:52,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 00:53:57,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 00:53:59,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:54:03,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:03,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:03,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:54:03,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 00:54:04,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 00:54:08,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:54:16,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 00:54:16,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:54:19,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:54:19,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:54:20,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:54:21,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 00:54:23,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:54:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:54:23,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1467786.6666666667, ans=0.2 2023-10-04 00:54:24,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:54:24,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:54:27,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:54:31,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 00:54:31,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:54:33,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:33,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:54:34,606 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.014e+02 2.206e+02 2.508e+02 4.663e+02, threshold=4.412e+02, percent-clipped=1.0 2023-10-04 00:54:34,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1467853.3333333333, ans=0.125 2023-10-04 00:54:36,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 00:54:37,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:54:40,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 00:54:40,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 00:54:42,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1467920.0, ans=0.2 2023-10-04 00:54:43,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 00:54:47,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 00:54:48,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:54:48,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:54:48,702 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 00:54:48,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 00:54:51,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 00:54:54,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:54:57,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:54:58,382 INFO [train.py:1046] (1/4) Epoch 42, batch 2400, loss[loss=0.1649, simple_loss=0.2552, pruned_loss=0.03732, over 24041.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03826, over 4697595.71 frames. 
], batch size: 80, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:55:01,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:55:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:55:03,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 00:55:03,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 00:55:11,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 00:55:11,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:55:12,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 00:55:12,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:55:13,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1468053.3333333333, ans=0.125 2023-10-04 00:55:14,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:14,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. 
Duration: 0.53 2023-10-04 00:55:18,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:20,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 00:55:20,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1468053.3333333333, ans=0.125 2023-10-04 00:55:21,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1468053.3333333333, ans=0.1 2023-10-04 00:55:24,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:55:27,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1468120.0, ans=10.0 2023-10-04 00:55:30,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 00:55:32,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:55:34,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:39,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:55:39,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 00:55:39,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 00:55:41,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1468120.0, ans=0.125 2023-10-04 00:55:45,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1468186.6666666667, ans=0.0 2023-10-04 00:55:49,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:55:51,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:55:54,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:55:55,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:55:55,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:55:55,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:55:55,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:55:55,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:55:55,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 00:55:56,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.13 vs. limit=15.0 2023-10-04 00:56:01,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:56:01,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:56:02,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 00:56:04,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 00:56:05,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:56:05,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:56:06,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 00:56:08,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 00:56:08,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 00:56:08,360 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 00:56:08,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. 
Duration: 0.57 2023-10-04 00:56:09,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:56:11,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:11,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:56:12,383 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 00:56:12,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:13,725 INFO [train.py:1046] (1/4) Epoch 42, batch 2450, loss[loss=0.1583, simple_loss=0.2416, pruned_loss=0.03752, over 24484.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2355, pruned_loss=0.0383, over 4674731.59 frames. ], batch size: 66, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:56:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:56:16,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:56:16,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:56:18,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1468320.0, ans=0.0 2023-10-04 00:56:21,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 00:56:21,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:22,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 00:56:28,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:56:28,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:30,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:56:30,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:56:32,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:56:32,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 00:56:35,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:38,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:56:38,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:56:38,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1468386.6666666667, ans=0.2 2023-10-04 00:56:41,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:56:42,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:56:44,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:56:44,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:45,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 00:56:47,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:56:54,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:55,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:55,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:56:57,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:56:57,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 00:56:57,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:56:58,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 00:57:02,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:57:03,209 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 2.034e+02 2.290e+02 2.639e+02 4.932e+02, threshold=4.579e+02, percent-clipped=1.0 2023-10-04 00:57:03,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:57:07,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:57:07,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:57:12,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:57:12,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 00:57:14,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:57:14,833 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.72 vs. 
limit=6.0 2023-10-04 00:57:15,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:57:15,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 00:57:15,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:57:15,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:57:19,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:57:19,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1468586.6666666667, ans=0.2 2023-10-04 00:57:20,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:57:21,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:57:22,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1468586.6666666667, ans=0.125 2023-10-04 00:57:25,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 00:57:25,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1468653.3333333333, ans=0.2 2023-10-04 00:57:26,968 INFO [train.py:1046] (1/4) Epoch 42, batch 2500, loss[loss=0.1603, simple_loss=0.2574, pruned_loss=0.0316, over 24326.00 frames. 
], tot_loss[loss=0.1552, simple_loss=0.2346, pruned_loss=0.03788, over 4692710.73 frames. ], batch size: 74, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:57:27,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:57:32,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:57:42,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:57:43,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:57:45,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:57:45,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 00:57:51,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:57:51,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:57:52,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:57:52,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 00:57:53,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. 
Duration: 0.84 2023-10-04 00:57:53,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:57:55,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:57:55,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 00:57:55,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1468786.6666666667, ans=0.125 2023-10-04 00:57:56,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:57:56,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 00:57:56,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:01,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:58:02,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:58:04,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:58:04,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 00:58:05,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 00:58:07,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:58:10,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:15,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:18,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:58:21,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1468853.3333333333, ans=0.125 2023-10-04 00:58:24,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:58:25,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 00:58:27,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:58:27,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:58:28,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:58:28,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:58:30,287 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. 
Duration: 0.896 2023-10-04 00:58:30,287 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 00:58:30,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 00:58:31,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1468920.0, ans=0.125 2023-10-04 00:58:34,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:58:36,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 00:58:36,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 00:58:36,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:58:37,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 00:58:40,304 INFO [train.py:1046] (1/4) Epoch 42, batch 2550, loss[loss=0.1509, simple_loss=0.2305, pruned_loss=0.03566, over 24581.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2356, pruned_loss=0.03783, over 4710496.79 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:58:40,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. 
Duration: 0.98 2023-10-04 00:58:40,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1468986.6666666667, ans=0.1 2023-10-04 00:58:43,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:58:46,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:58:46,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:58:49,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:58:49,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 00:58:51,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:58:52,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.39 vs. limit=22.5 2023-10-04 00:58:53,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 00:58:55,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:58:56,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:58,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 00:58:58,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 00:58:58,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:58:59,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:59:00,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:59:03,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:59:03,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 00:59:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:59:04,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:04,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 00:59:16,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:59:19,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:59:19,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 00:59:19,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:59:20,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:59:24,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.83 vs. limit=15.0 2023-10-04 00:59:27,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:59:31,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:59:31,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:59:33,037 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.986e+02 2.220e+02 2.495e+02 3.870e+02, threshold=4.440e+02, percent-clipped=0.0 2023-10-04 00:59:33,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:59:33,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:59:33,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 00:59:33,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1469186.6666666667, ans=0.2 2023-10-04 00:59:36,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:59:36,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:38,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.62 vs. limit=15.0 2023-10-04 00:59:40,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1469253.3333333333, ans=0.07 2023-10-04 00:59:41,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:59:41,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 00:59:41,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:59:41,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:43,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 00:59:45,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1469253.3333333333, ans=0.0 2023-10-04 00:59:46,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:59:48,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:59:49,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1469253.3333333333, ans=0.0 2023-10-04 00:59:53,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:59:54,964 INFO [train.py:1046] (1/4) Epoch 42, batch 2600, loss[loss=0.1492, simple_loss=0.2329, pruned_loss=0.03279, over 24353.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2361, pruned_loss=0.038, over 4719036.61 frames. ], batch size: 61, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:59:56,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:59:58,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 00:59:58,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.39 vs. limit=6.0 2023-10-04 01:00:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 01:00:00,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 01:00:00,938 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 01:00:02,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 01:00:02,813 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 01:00:05,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:00:05,979 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 01:00:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 01:00:08,771 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 01:00:10,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:00:11,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 01:00:12,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 01:00:14,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:00:14,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1469386.6666666667, ans=0.125 2023-10-04 01:00:15,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-10-04 01:00:17,803 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 01:00:19,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 01:00:20,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1469386.6666666667, ans=0.1 2023-10-04 01:00:26,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:00:26,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:26,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:00:26,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 01:00:28,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:00:31,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1469453.3333333333, ans=0.2 2023-10-04 01:00:34,409 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 01:00:37,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:39,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:00:39,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 01:00:41,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:00:41,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:00:41,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 01:00:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:00:44,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:00:46,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:00:47,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1469520.0, ans=0.1 2023-10-04 01:00:48,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 01:00:48,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:00:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:00:53,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.62 vs. 
limit=12.0 2023-10-04 01:00:54,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:00:54,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:00:54,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 01:00:57,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:58,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:00:59,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:01:07,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 01:01:07,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:08,418 INFO [train.py:1046] (1/4) Epoch 42, batch 2650, loss[loss=0.1643, simple_loss=0.2371, pruned_loss=0.04572, over 23836.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2367, pruned_loss=0.03847, over 4717979.10 frames. ], batch size: 164, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:01:09,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:01:13,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. 
Duration: 0.84 2023-10-04 01:01:13,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:15,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:01:17,077 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 01:01:17,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:18,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:19,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:01:21,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:01:22,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:01:25,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 01:01:25,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:01:25,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:01:29,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. 
Duration: 0.88 2023-10-04 01:01:30,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 01:01:33,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:01:35,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 01:01:36,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:01:36,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 01:01:42,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:42,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:01:42,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:42,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:01:47,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 01:01:47,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 01:01:49,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 01:01:53,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 01:01:54,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:54,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:01:54,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:01:56,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:56,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:01:56,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.16 vs. limit=15.0 2023-10-04 01:01:57,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:58,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:01:58,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:02:00,144 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.901e+02 2.089e+02 2.276e+02 3.340e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-04 01:02:00,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:02:00,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:02:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:02:04,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:05,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:02:07,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:02:10,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:11,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:02:11,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:11,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-10-04 01:02:16,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:02:18,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:20,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:21,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:21,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:02:22,897 INFO [train.py:1046] (1/4) Epoch 42, batch 2700, loss[loss=0.1384, simple_loss=0.2147, pruned_loss=0.03106, over 24445.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2371, pruned_loss=0.03823, over 4727024.31 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:02:22,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:24,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:02:24,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 01:02:27,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:02:28,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-04 01:02:29,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:02:30,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:30,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:31,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:02:31,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:31,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:02:32,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:02:32,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 01:02:34,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:02:35,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:02:35,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:02:36,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:02:40,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:02:41,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 01:02:42,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:02:46,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:02:46,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:02:51,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1470120.0, ans=0.125 2023-10-04 01:02:53,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:02:53,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:02:53,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:02:54,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:02:56,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:02:59,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:02:59,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:02:59,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:03:01,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1470120.0, ans=0.0 2023-10-04 01:03:02,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:02,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:03:09,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:03:11,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:03:14,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:03:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:14,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1470186.6666666667, ans=0.125 2023-10-04 01:03:17,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:03:17,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1470186.6666666667, ans=0.125 2023-10-04 01:03:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:19,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:03:20,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:20,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:22,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:03:24,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:03:25,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1470253.3333333333, ans=0.0 2023-10-04 01:03:26,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:03:26,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:03:27,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. 
Duration: 0.42225 2023-10-04 01:03:29,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:31,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:03:31,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 01:03:33,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 01:03:34,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:36,329 INFO [train.py:1046] (1/4) Epoch 42, batch 2750, loss[loss=0.1618, simple_loss=0.2545, pruned_loss=0.03454, over 24660.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03856, over 4716772.85 frames. ], batch size: 73, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:03:37,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:03:37,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:40,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:40,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:03:40,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:03:43,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:03:43,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:03:43,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:03:43,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:43,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 01:03:44,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.81 vs. limit=15.0 2023-10-04 01:03:45,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:03:45,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:50,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 01:03:50,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1470386.6666666667, ans=0.1 2023-10-04 01:03:52,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:03:52,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:54,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:03:54,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:03:55,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:55,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1470386.6666666667, ans=0.125 2023-10-04 01:03:57,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:03:57,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:03:58,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:04:01,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:04:01,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:04:03,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 01:04:03,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:04:03,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:04:10,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:04:10,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.52 vs. limit=22.5 2023-10-04 01:04:13,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:04:13,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:15,733 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:04:18,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1470453.3333333333, ans=0.125 2023-10-04 01:04:20,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:04:20,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:04:21,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 01:04:27,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:04:27,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:04:27,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 01:04:28,470 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 1.995e+02 2.181e+02 2.407e+02 3.708e+02, threshold=4.362e+02, percent-clipped=0.0 2023-10-04 01:04:32,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:34,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 01:04:37,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1470586.6666666667, ans=0.09899494936611666 2023-10-04 01:04:38,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 01:04:41,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:04:41,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 01:04:42,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:04:44,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 01:04:44,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 01:04:44,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:04:48,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.15 vs. limit=15.0 2023-10-04 01:04:49,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:04:50,534 INFO [train.py:1046] (1/4) Epoch 42, batch 2800, loss[loss=0.1548, simple_loss=0.2372, pruned_loss=0.03623, over 23356.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2359, pruned_loss=0.03836, over 4717532.78 frames. ], batch size: 93, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:04:50,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:04:50,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:04:51,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 01:04:51,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:04:51,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:55,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:04:55,452 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 01:04:55,452 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 01:04:58,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:59,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:04:59,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:05:03,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:05:05,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 01:05:07,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:05:08,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 01:05:08,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:10,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:05:10,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:05:15,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:05:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:15,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:05:16,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:05:17,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1470720.0, ans=0.1 2023-10-04 01:05:17,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1470720.0, ans=0.09899494936611666 2023-10-04 01:05:23,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:05:24,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:05:26,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1470786.6666666667, ans=0.0 2023-10-04 01:05:27,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:29,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:05:30,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:33,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:05:34,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 01:05:34,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:05:34,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:05:34,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:05:39,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:05:40,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:42,685 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=22.5 2023-10-04 01:05:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:05:46,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 01:05:46,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:46,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:05:46,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:05:48,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:05:48,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:48,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 01:05:49,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:05:49,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:05:49,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1470920.0, ans=0.0 2023-10-04 01:05:50,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:05:52,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 01:05:53,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:05:53,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:05:53,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:05:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 01:06:02,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:06:02,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:06:02,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:06:04,307 INFO [train.py:1046] (1/4) Epoch 42, batch 2850, loss[loss=0.1579, simple_loss=0.235, pruned_loss=0.04041, over 23426.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2343, pruned_loss=0.03804, over 4698225.96 frames. ], batch size: 93, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:06:04,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:07,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:06:07,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:07,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:06:09,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:11,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:06:12,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:06:12,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 01:06:18,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1471053.3333333333, ans=0.0 2023-10-04 01:06:19,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 01:06:19,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:20,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 01:06:22,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:23,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 01:06:23,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1471053.3333333333, ans=0.125 2023-10-04 01:06:25,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-04 01:06:25,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:37,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:38,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:06:38,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:06:38,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:06:38,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:06:38,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:06:40,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:06:40,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1471120.0, ans=0.125 2023-10-04 01:06:41,666 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.31 vs. limit=15.0 2023-10-04 01:06:41,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 01:06:44,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 01:06:44,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:06:44,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:46,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:48,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:48,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:50,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:52,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:06:53,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:06:53,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:55,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:06:56,319 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.878e+02 2.070e+02 2.200e+02 2.793e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-04 01:06:56,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:06:58,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1471186.6666666667, ans=0.125 2023-10-04 01:07:01,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:07:03,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 01:07:03,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 01:07:05,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:07:05,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:05,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 01:07:07,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:07:07,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:08,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 01:07:08,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:07:08,434 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 01:07:08,468 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 01:07:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:07:08,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:10,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1471253.3333333333, ans=0.125 2023-10-04 01:07:15,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:07:15,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:17,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:07:19,098 INFO [train.py:1046] (1/4) Epoch 42, batch 2900, loss[loss=0.1442, simple_loss=0.2365, pruned_loss=0.02596, over 24610.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2343, pruned_loss=0.03785, over 4689087.05 frames. ], batch size: 68, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:07:19,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. 
Duration: 0.75 2023-10-04 01:07:23,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:07:23,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 01:07:24,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 01:07:26,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:07:26,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:07:28,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:07:30,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:07:32,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:07:34,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:07:35,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.75 vs. limit=22.5 2023-10-04 01:07:36,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:07:38,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-04 01:07:39,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:07:39,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:42,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 01:07:42,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 01:07:45,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:45,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 01:07:45,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:07:47,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:07:47,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:07:50,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:07:50,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:53,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 01:07:54,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:07:57,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 01:07:57,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 01:07:57,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:08:00,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:08:03,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 01:08:03,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:08:03,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1471520.0, ans=0.1 2023-10-04 01:08:08,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:08:17,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:08:17,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 01:08:17,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1471586.6666666667, ans=0.2 2023-10-04 01:08:18,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 01:08:18,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1471586.6666666667, ans=0.125 2023-10-04 01:08:22,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:22,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 01:08:22,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:08:23,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:08:31,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:08:32,947 INFO [train.py:1046] (1/4) Epoch 42, batch 2950, loss[loss=0.1658, simple_loss=0.2575, pruned_loss=0.03707, over 24524.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.235, pruned_loss=0.03781, over 4692877.59 frames. ], batch size: 71, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:08:33,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 01:08:34,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:08:34,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:35,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:08:37,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:08:37,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 01:08:39,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 01:08:39,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:08:39,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:08:44,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:08:45,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:08:46,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:08:47,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1471720.0, ans=0.125 2023-10-04 01:08:48,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:08:51,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:08:51,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:08:54,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:54,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:54,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:08:55,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 01:09:00,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 01:09:00,224 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 01:09:01,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:09:04,694 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 01:09:04,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 01:09:04,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:09:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:09:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 01:09:06,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:09:10,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 01:09:10,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:09:11,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1471786.6666666667, ans=0.0 2023-10-04 01:09:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:09:15,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:09:18,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:09:18,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:19,703 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 01:09:19,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:09:19,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 01:09:24,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:26,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:09:26,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 01:09:26,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:09:27,395 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.891e+02 2.131e+02 2.333e+02 3.581e+02, threshold=4.262e+02, percent-clipped=0.0 2023-10-04 01:09:28,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 01:09:31,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:09:34,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:09:34,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:09:35,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:35,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 01:09:37,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:09:39,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:39,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:09:39,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:09:39,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:09:40,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:09:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:41,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 01:09:43,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:45,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:09:45,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.89 vs. 
limit=15.0 2023-10-04 01:09:45,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.72 vs. limit=15.0 2023-10-04 01:09:46,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:09:47,911 INFO [train.py:1046] (1/4) Epoch 42, batch 3000, loss[loss=0.1464, simple_loss=0.2258, pruned_loss=0.03351, over 24576.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03752, over 4690683.06 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:09:47,911 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 01:09:59,535 INFO [train.py:1078] (1/4) Epoch 42, validation: loss=0.3457, simple_loss=0.2797, pruned_loss=0.2058, over 1125622.00 frames. 2023-10-04 01:09:59,536 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 01:09:59,719 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 01:10:01,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 01:10:04,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:10:05,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:10:05,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 01:10:07,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:10:08,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1471986.6666666667, ans=0.125 2023-10-04 01:10:13,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:10:14,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1472053.3333333333, ans=0.125 2023-10-04 01:10:22,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:10:24,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1472053.3333333333, ans=0.0 2023-10-04 01:10:27,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 01:10:28,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:10:31,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:10:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:10:32,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:10:35,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:10:35,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 01:10:38,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 01:10:40,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:10:40,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:10:40,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1472120.0, ans=0.125 2023-10-04 01:10:42,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:10:43,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:10:43,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:43,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:10:47,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:10:48,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:10:48,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 01:10:49,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1472186.6666666667, ans=10.0 2023-10-04 01:10:50,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:10:52,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 01:10:53,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:10:53,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:10:53,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:10:56,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:56,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:58,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:10:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 01:10:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:10:59,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-10-04 01:11:00,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:11:02,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 01:11:02,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1472253.3333333333, ans=0.0 2023-10-04 01:11:03,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1472253.3333333333, ans=0.0 2023-10-04 01:11:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:11:05,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:11:05,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 01:11:07,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 01:11:07,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:11:08,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:11:09,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:11:09,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 01:11:09,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:11,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:11:12,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 01:11:14,091 INFO [train.py:1046] (1/4) Epoch 42, batch 3050, loss[loss=0.1683, simple_loss=0.2551, pruned_loss=0.04076, over 24660.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.236, pruned_loss=0.03753, over 4699989.67 frames. ], batch size: 65, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:11:15,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:11:18,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:18,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:11:19,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.58 vs. limit=6.0 2023-10-04 01:11:23,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:26,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 01:11:31,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-10-04 01:11:31,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 01:11:31,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:11:33,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:11:37,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.41 vs. limit=15.0 2023-10-04 01:11:37,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:37,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:38,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:11:39,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:11:39,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1472386.6666666667, ans=0.0 2023-10-04 01:11:40,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:11:40,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:11:41,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:11:41,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:11:43,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:45,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:11:49,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:11:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 01:11:49,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:51,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:11:54,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:11:55,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:11:55,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:11:56,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:02,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:12:02,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:04,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-10-04 01:12:06,775 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.976e+02 2.144e+02 2.378e+02 3.256e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-04 01:12:06,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:08,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:12:08,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:12:10,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:12:10,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:12:11,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:12:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 01:12:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 01:12:14,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:14,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 01:12:16,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:17,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1472586.6666666667, ans=0.0 2023-10-04 01:12:23,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:25,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:12:27,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:12:28,411 INFO [train.py:1046] (1/4) Epoch 42, batch 3100, loss[loss=0.1478, simple_loss=0.2065, pruned_loss=0.0445, over 19397.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2355, pruned_loss=0.03741, over 4703630.10 frames. ], batch size: 388, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:12:28,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 01:12:30,192 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:12:31,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. 
Duration: 0.94 2023-10-04 01:12:32,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 01:12:34,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:12:38,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:12:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:42,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 01:12:44,863 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.83 vs. limit=15.0 2023-10-04 01:12:45,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:51,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 01:12:51,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1472720.0, ans=0.1 2023-10-04 01:12:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:12:56,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:12:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:12:57,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:12:57,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 01:13:00,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:13:00,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 01:13:00,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:13:01,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:01,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 01:13:02,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.12 vs. limit=15.0 2023-10-04 01:13:03,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1472786.6666666667, ans=0.0 2023-10-04 01:13:04,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:13:07,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1472786.6666666667, ans=0.1 2023-10-04 01:13:08,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:13:08,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 01:13:09,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 01:13:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:11,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:14,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:14,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:14,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:13:16,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:13:16,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 01:13:17,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:13:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:13:18,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:18,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:13:21,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:13:24,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 01:13:26,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:13:26,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 01:13:28,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:28,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:28,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-10-04 01:13:33,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1472920.0, ans=0.0 2023-10-04 01:13:39,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 01:13:40,611 INFO [train.py:1046] (1/4) Epoch 42, batch 3150, loss[loss=0.1384, simple_loss=0.2154, pruned_loss=0.03069, over 24688.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.234, pruned_loss=0.0372, over 4713023.69 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:13:40,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:13:42,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:43,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:13:43,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:13:45,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 01:13:46,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:13:46,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:13:48,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-10-04 01:13:49,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:50,981 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 01:13:53,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 01:13:53,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:13:53,877 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 01:13:55,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:13:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 01:13:58,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 01:13:58,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 01:13:58,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:58,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:59,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:14:01,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 01:14:02,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:14:03,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:14:04,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:14:05,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1473053.3333333333, ans=0.1 2023-10-04 01:14:06,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:14:07,497 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.85 vs. limit=15.0 2023-10-04 01:14:10,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 01:14:11,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:14:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:14:14,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:14:14,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 01:14:18,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 01:14:18,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:14:19,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:14:19,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:14:19,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:14:19,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:14:21,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:14:21,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:14:21,606 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:14:22,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 01:14:22,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 01:14:22,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:24,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:14:24,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:14:24,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 01:14:26,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:14:28,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 01:14:28,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:29,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 01:14:30,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 01:14:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:14:32,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:14:33,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. 
Duration: 0.91 2023-10-04 01:14:34,960 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.982e+02 2.210e+02 2.426e+02 4.214e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-04 01:14:36,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 01:14:36,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:14:39,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:14:41,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:41,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:14:48,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:14:48,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:49,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 01:14:53,764 INFO [train.py:1046] (1/4) Epoch 42, batch 3200, loss[loss=0.1567, simple_loss=0.2428, pruned_loss=0.03529, over 23700.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2335, pruned_loss=0.03698, over 4716835.13 frames. ], batch size: 85, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:14:55,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 01:14:55,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 01:15:00,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:15:02,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:02,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 01:15:04,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:15:08,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:15:13,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:15:18,549 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.36 vs. limit=15.0 2023-10-04 01:15:20,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:15:24,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1473453.3333333333, ans=0.0 2023-10-04 01:15:24,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1473453.3333333333, ans=0.125 2023-10-04 01:15:30,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. 
Duration: 0.88 2023-10-04 01:15:30,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:15:34,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 01:15:34,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:15:37,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:15:37,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:15:38,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:15:41,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 01:15:43,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 01:15:45,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 01:15:48,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 01:15:51,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:15:57,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:15:57,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:15:57,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:58,581 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 01:15:58,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:16:01,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:05,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 01:16:05,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 01:16:06,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 01:16:06,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 01:16:07,918 INFO [train.py:1046] (1/4) Epoch 42, batch 3250, loss[loss=0.1793, simple_loss=0.2618, pruned_loss=0.04841, over 24377.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2342, pruned_loss=0.0372, over 4724260.89 frames. ], batch size: 77, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:16:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 01:16:10,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:16:10,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 01:16:10,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:16:11,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:12,762 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 01:16:16,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:16:18,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:16:21,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1473720.0, ans=0.2 2023-10-04 01:16:26,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:16:26,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 01:16:26,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:27,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:16:27,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:16:29,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:16:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:16:34,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:34,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:16:34,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:34,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:34,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:36,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:16:37,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:16:37,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:16:41,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:16:41,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:42,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:42,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:16:42,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:16:46,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 01:16:47,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:16:47,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:16:50,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:50,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:16:55,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 01:17:02,403 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.958e+02 2.109e+02 2.405e+02 3.560e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-04 01:17:02,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:17:02,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:02,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 01:17:02,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:17:02,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:17:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:06,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 01:17:06,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 01:17:06,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:17:08,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:09,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:17:09,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 01:17:09,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1473920.0, ans=0.1 2023-10-04 01:17:10,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:17:13,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:17:13,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:17:15,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 01:17:15,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:18,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:17:18,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 01:17:20,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1473986.6666666667, ans=0.125 2023-10-04 01:17:21,307 INFO [train.py:1046] (1/4) Epoch 42, batch 3300, loss[loss=0.1571, simple_loss=0.245, pruned_loss=0.03466, over 24661.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2352, pruned_loss=0.03725, over 4715718.96 frames. 
], batch size: 68, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:17:22,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:17:22,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 01:17:22,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 01:17:24,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 01:17:24,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:27,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:17:28,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:17:30,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:32,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:17:32,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:17:36,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:17:37,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:17:42,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 01:17:42,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1474053.3333333333, ans=0.125 2023-10-04 01:17:43,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:17:43,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:45,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:45,187 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 01:17:46,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:17:46,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:17:48,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:17:48,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:17:48,075 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. 
Duration: 0.896 2023-10-04 01:17:49,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1474120.0, ans=0.125 2023-10-04 01:17:50,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:50,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:17:53,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:53,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 01:17:55,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 01:17:55,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:56,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:17:57,940 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 01:17:59,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 01:17:59,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1474120.0, ans=0.0 2023-10-04 01:18:00,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 01:18:02,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 01:18:04,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1474186.6666666667, ans=0.1 2023-10-04 01:18:05,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:18:06,610 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-10-04 01:18:08,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:18:08,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:18:13,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:13,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:18:13,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:18:13,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 01:18:13,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1474186.6666666667, ans=0.125 2023-10-04 01:18:16,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:18:16,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:18:17,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:18:19,197 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 01:18:20,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 01:18:23,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:18:23,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:18:23,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:24,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:18:24,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:26,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 01:18:26,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:26,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:18:27,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:18:29,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:18:30,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 01:18:32,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:32,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:34,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1474320.0, ans=0.07 2023-10-04 01:18:35,187 INFO [train.py:1046] (1/4) Epoch 42, batch 3350, loss[loss=0.157, simple_loss=0.2357, pruned_loss=0.03915, over 23999.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.236, pruned_loss=0.03751, over 4696243.09 frames. ], batch size: 196, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:18:36,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:18:36,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 01:18:38,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:39,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:39,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:41,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:18:42,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:44,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:18:47,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:47,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:18:50,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:50,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:18:51,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 01:18:52,993 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. 
Duration: 0.896 2023-10-04 01:18:54,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:56,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 01:18:57,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 01:18:58,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:18:59,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:19:01,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:01,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 01:19:01,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:01,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:19:03,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:04,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:04,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:19:05,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:19:09,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:10,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:10,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:14,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:19:16,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:19,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:19,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:21,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:21,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1474520.0, ans=0.2 2023-10-04 01:19:23,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 01:19:23,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-04 01:19:23,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 01:19:25,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:19:25,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 01:19:26,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:28,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:29,256 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.949e+02 2.124e+02 2.464e+02 3.729e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-04 01:19:35,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:35,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 01:19:36,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:19:38,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:19:38,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:19:44,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:19:44,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1474586.6666666667, ans=0.0 2023-10-04 01:19:45,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 01:19:45,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:19:47,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:19:49,097 INFO [train.py:1046] (1/4) Epoch 42, batch 3400, loss[loss=0.1533, simple_loss=0.2351, pruned_loss=0.03571, over 23346.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03833, over 4696607.25 frames. ], batch size: 119, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:19:49,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:50,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 01:19:50,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:50,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 01:19:52,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:19:53,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:19:55,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:19:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:19:56,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 01:20:00,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 01:20:00,751 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 01:20:02,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:06,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:20:06,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:20:06,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:06,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:20:11,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:20:14,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. 
Duration: 0.93 2023-10-04 01:20:17,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:20:18,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:20,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:20:22,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 01:20:26,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:20:28,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1474786.6666666667, ans=0.125 2023-10-04 01:20:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 01:20:36,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:38,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:38,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 01:20:39,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:20:39,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:20:39,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:20:40,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:20:44,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:47,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:20:47,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:20:50,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:20:53,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 01:21:00,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:21:03,414 INFO [train.py:1046] (1/4) Epoch 42, batch 3450, loss[loss=0.1324, simple_loss=0.2166, pruned_loss=0.02417, over 24450.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2374, pruned_loss=0.03827, over 4693719.29 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:21:03,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 01:21:08,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. 
Duration: 0.73 2023-10-04 01:21:08,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:21:09,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:21:09,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 01:21:10,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:21:13,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:21:17,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1475053.3333333333, ans=0.09899494936611666 2023-10-04 01:21:17,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=15.0 2023-10-04 01:21:20,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:21:20,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:21:22,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:21:22,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:21:22,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1475053.3333333333, ans=0.2 2023-10-04 01:21:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:29,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 01:21:33,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 01:21:33,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:21:35,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:21:36,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:21:40,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 01:21:41,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.11 vs. limit=22.5 2023-10-04 01:21:42,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:21:46,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:21:46,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:21:49,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:21:51,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:21:51,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 01:21:51,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:21:53,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:56,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:21:58,676 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.992e+02 2.174e+02 2.512e+02 3.685e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 01:21:58,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 01:22:00,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:22:06,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:22:07,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:22:09,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:13,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:13,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:22:15,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:22:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:22:17,646 INFO [train.py:1046] (1/4) Epoch 42, batch 3500, loss[loss=0.143, simple_loss=0.2034, pruned_loss=0.04133, over 22804.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2363, pruned_loss=0.03807, over 4690553.31 frames. ], batch size: 322, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:22:21,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:22,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:22:23,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1475320.0, ans=0.0 2023-10-04 01:22:24,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 01:22:26,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 01:22:27,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1475320.0, ans=0.125 2023-10-04 01:22:28,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:22:29,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:29,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 01:22:34,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:22:35,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:22:37,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:22:37,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:22:37,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:22:38,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:38,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:22:38,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. 
Duration: 0.86 2023-10-04 01:22:39,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:41,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:22:44,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:22:48,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:49,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 01:22:49,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:22:52,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:22:53,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:22:55,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:56,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:22:56,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:22:59,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. 
Duration: 0.75 2023-10-04 01:23:00,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 01:23:00,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 01:23:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:23:02,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:02,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:23:03,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:23:04,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1475520.0, ans=0.1 2023-10-04 01:23:05,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:23:06,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:23:11,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:23:12,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 01:23:12,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. 
Duration: 0.88 2023-10-04 01:23:12,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:23:14,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:23:15,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:23:17,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:17,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1475586.6666666667, ans=0.1 2023-10-04 01:23:19,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 01:23:21,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:23:23,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:23:24,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 01:23:26,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 01:23:27,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:29,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:23:29,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:23:29,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:31,969 INFO [train.py:1046] (1/4) Epoch 42, batch 3550, loss[loss=0.1561, simple_loss=0.2242, pruned_loss=0.04402, over 22829.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2345, pruned_loss=0.0377, over 4692939.75 frames. ], batch size: 322, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:23:34,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:23:38,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.45 vs. limit=22.5 2023-10-04 01:23:39,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:41,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 01:23:44,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:23:45,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:23:45,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:23:47,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:23:47,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:23:49,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1475720.0, ans=0.125 2023-10-04 01:23:51,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1475720.0, ans=0.125 2023-10-04 01:23:52,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:23:52,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:23:53,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:23:55,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:23:59,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1475720.0, ans=0.1 2023-10-04 01:24:02,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:24:02,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 01:24:04,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:24:04,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:24:04,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:24:04,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 01:24:05,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:07,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:08,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:24:13,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:15,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:24:15,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:16,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 01:24:17,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 01:24:19,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 01:24:20,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-10-04 01:24:21,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:24:22,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:24:22,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:24:25,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 01:24:27,020 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.930e+02 2.101e+02 2.469e+02 3.261e+02, threshold=4.203e+02, percent-clipped=0.0 2023-10-04 01:24:27,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:24:32,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:24:32,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 01:24:32,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:24:38,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:40,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 01:24:45,550 INFO [train.py:1046] (1/4) Epoch 42, batch 3600, loss[loss=0.1453, simple_loss=0.2313, pruned_loss=0.02969, over 24637.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2347, pruned_loss=0.03731, over 4698832.86 frames. ], batch size: 65, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:24:47,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 01:24:47,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:24:48,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:24:50,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:50,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:52,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:24:55,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:24:58,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:24:59,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:25:00,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:25:00,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:00,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 01:25:03,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:25:05,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:09,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:25:12,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:25:13,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:25:13,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:25:13,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 01:25:13,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 01:25:18,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:19,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:25:19,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:20,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:25:22,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:25:22,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 01:25:22,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1476120.0, ans=0.1 2023-10-04 01:25:30,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:25:30,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:25:30,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1476186.6666666667, ans=0.125 2023-10-04 01:25:31,010 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:25:32,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. 
Duration: 0.95 2023-10-04 01:25:36,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:25:42,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:44,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:47,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:25:47,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:25:47,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 01:25:49,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 01:25:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 01:25:50,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:25:50,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:25:54,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 01:25:54,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:25:54,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:25:54,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:25:55,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 01:25:57,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 01:26:00,058 INFO [train.py:1046] (1/4) Epoch 42, batch 3650, loss[loss=0.164, simple_loss=0.2481, pruned_loss=0.03995, over 24068.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2347, pruned_loss=0.03723, over 4705981.88 frames. ], batch size: 80, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:26:00,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:26:01,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 01:26:04,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 01:26:04,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1476320.0, ans=0.015 2023-10-04 01:26:05,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 01:26:07,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1476320.0, ans=0.1 2023-10-04 01:26:09,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 01:26:10,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 01:26:10,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1476320.0, ans=0.1 2023-10-04 01:26:13,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:26:13,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:26:13,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:26:14,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1476386.6666666667, ans=0.1 2023-10-04 01:26:16,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:26:17,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:26:17,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 01:26:17,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 01:26:19,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:26:19,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 01:26:19,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:26:20,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:26:20,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:22,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:26:25,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 01:26:27,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 01:26:28,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:26:30,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 01:26:31,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:26:31,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 01:26:35,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:26:38,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:26:39,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:26:40,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:26:44,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:26:46,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:26:48,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:26:48,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:26:49,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:26:50,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:52,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:26:55,667 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.914e+02 2.089e+02 2.349e+02 3.091e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-04 01:26:58,999 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 01:27:01,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:27:01,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:01,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:27:03,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:27:05,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.22 vs. limit=22.5 2023-10-04 01:27:06,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:08,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 01:27:08,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:10,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-04 01:27:12,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:27:13,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:27:14,724 INFO [train.py:1046] (1/4) Epoch 42, batch 3700, loss[loss=0.1563, simple_loss=0.2354, pruned_loss=0.03864, over 23443.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2363, pruned_loss=0.03771, over 4708947.34 frames. ], batch size: 119, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:27:17,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:17,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 01:27:17,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:17,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:27:17,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1476653.3333333333, ans=0.2 2023-10-04 01:27:18,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:27:22,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 01:27:22,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1476653.3333333333, ans=0.125 2023-10-04 01:27:25,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:27:25,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:26,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:27:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:26,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:27:30,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:31,738 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 01:27:37,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:27:37,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:27:40,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:27:40,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. 
Duration: 0.96 2023-10-04 01:27:40,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:27:43,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:43,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 01:27:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:48,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:27:49,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:49,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:27:52,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:27:55,555 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.97 vs. limit=6.0 2023-10-04 01:27:56,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:27:56,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. 
Duration: 0.81 2023-10-04 01:27:58,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:58,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 01:28:05,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:28:05,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:28:06,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1476853.3333333333, ans=10.0 2023-10-04 01:28:07,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.16 vs. limit=15.0 2023-10-04 01:28:08,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:08,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 01:28:09,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:28:09,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:28:09,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 01:28:11,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:11,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1476853.3333333333, ans=0.125 2023-10-04 01:28:15,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:28:15,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 01:28:17,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 01:28:18,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:28:18,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:18,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1476920.0, ans=0.125 2023-10-04 01:28:20,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:28:20,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:28:24,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:28:24,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 01:28:26,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:28:27,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 01:28:28,709 INFO [train.py:1046] (1/4) Epoch 42, batch 3750, loss[loss=0.1508, simple_loss=0.2365, pruned_loss=0.03252, over 24506.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2368, pruned_loss=0.03741, over 4730243.99 frames. ], batch size: 66, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:28:28,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 01:28:33,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:28:33,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 01:28:35,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:28:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:37,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:39,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:28:40,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:28:43,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:28:46,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:28:49,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:52,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:28:52,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 01:28:54,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:28:54,984 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.82 vs. limit=6.0 2023-10-04 01:28:55,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=6.0 2023-10-04 01:28:55,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:28:55,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:28:59,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.90 vs. 
limit=6.0 2023-10-04 01:28:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 01:29:03,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1477120.0, ans=0.1 2023-10-04 01:29:04,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 01:29:04,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:29:06,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:29:07,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:29:10,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:12,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 01:29:16,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 01:29:17,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:22,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 01:29:22,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1477186.6666666667, ans=0.0 2023-10-04 01:29:24,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:29:25,560 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.079e+02 2.348e+02 2.758e+02 4.520e+02, threshold=4.696e+02, percent-clipped=4.0 2023-10-04 01:29:26,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:29:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:29:33,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:29:33,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1477253.3333333333, ans=0.125 2023-10-04 01:29:34,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:29:36,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:29:37,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:29:43,151 INFO [train.py:1046] (1/4) Epoch 42, batch 3800, loss[loss=0.1412, simple_loss=0.2186, pruned_loss=0.03191, over 24299.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2367, pruned_loss=0.03734, over 4730919.21 frames. 
], batch size: 56, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:29:46,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:29:50,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:29:51,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:29:51,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 01:29:53,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:56,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:29:56,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:29:59,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 01:29:59,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:01,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:30:03,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:30:04,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:30:04,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:06,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 01:30:09,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 01:30:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:30:10,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:30:13,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:30:14,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:30:16,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:30:16,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:20,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:20,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:30:24,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:30:24,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 01:30:24,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1477453.3333333333, ans=0.1 2023-10-04 01:30:26,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:30:32,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:30:36,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:30:39,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 01:30:41,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 01:30:42,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:30:44,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:30:44,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1477586.6666666667, ans=0.0 2023-10-04 01:30:45,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:30:47,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 01:30:51,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 01:30:51,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 01:30:51,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:51,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:30:57,788 INFO [train.py:1046] (1/4) Epoch 42, batch 3850, loss[loss=0.159, simple_loss=0.2497, pruned_loss=0.03416, over 24709.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03726, over 4708770.48 frames. ], batch size: 73, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:30:57,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:30:59,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:31:03,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:31:03,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 01:31:05,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 01:31:05,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:31:09,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:31:10,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:31:14,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:31:14,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 01:31:21,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:22,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:31:24,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:31:24,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:31:26,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:27,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:31:29,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:31:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:31:29,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:31:32,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:31:33,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:33,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:31:35,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 01:31:35,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 01:31:35,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1477786.6666666667, ans=0.2 2023-10-04 01:31:36,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:31:36,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:39,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:39,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:31:40,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 01:31:43,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 01:31:44,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:46,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 01:31:46,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:31:52,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:52,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:53,982 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.866e+02 2.015e+02 2.383e+02 4.192e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-04 01:31:56,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:56,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 01:32:00,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 01:32:03,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:32:03,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:06,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:32:06,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:32:06,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:07,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:07,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:32:07,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 01:32:08,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:32:10,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 01:32:10,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:10,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:12,085 INFO [train.py:1046] (1/4) Epoch 42, batch 3900, loss[loss=0.1478, simple_loss=0.2288, pruned_loss=0.03344, over 24309.00 frames. 
], tot_loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03709, over 4707094.10 frames. ], batch size: 61, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:32:12,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:32:13,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:13,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:32:14,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:14,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:32:14,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:32:14,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 01:32:16,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:32:20,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:32:20,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 01:32:20,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1477986.6666666667, ans=0.125 2023-10-04 01:32:22,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:32:23,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:32:24,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:25,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1478053.3333333333, ans=0.1 2023-10-04 01:32:26,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:32:29,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 01:32:29,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:32:31,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 01:32:33,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 01:32:34,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. 
Duration: 0.78 2023-10-04 01:32:38,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:32:40,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:32:40,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:32:41,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:32:44,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:32:46,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:32:49,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:32:49,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:32:49,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:32:49,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1478120.0, ans=0.0 2023-10-04 01:32:56,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:32:56,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 01:32:58,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1478186.6666666667, ans=0.2 2023-10-04 01:33:01,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1478186.6666666667, ans=0.2 2023-10-04 01:33:02,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:33:03,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:33:13,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:33:16,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:33:16,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 01:33:16,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 01:33:16,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:33:18,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 01:33:20,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:33:20,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. 
Duration: 0.99 2023-10-04 01:33:27,076 INFO [train.py:1046] (1/4) Epoch 42, batch 3950, loss[loss=0.1517, simple_loss=0.2349, pruned_loss=0.03422, over 24669.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2339, pruned_loss=0.0368, over 4694883.72 frames. ], batch size: 65, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:33:27,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:33:29,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 01:33:29,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:33:32,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:33:34,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:33:40,027 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 01:33:41,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:33:41,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 01:33:42,695 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 01:33:42,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:33:45,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:33:45,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:33:45,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:33:48,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 01:33:51,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:33:51,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:33:52,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:33:52,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:33:54,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:33:54,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1478453.3333333333, ans=0.2 2023-10-04 01:34:04,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:34:04,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 01:34:10,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 01:34:14,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 01:34:14,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 01:34:16,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:34:17,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:34:19,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1478520.0, ans=0.125 2023-10-04 01:34:21,630 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.959e+02 2.268e+02 2.678e+02 3.550e+02, threshold=4.535e+02, percent-clipped=0.0 2023-10-04 01:34:24,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:34:24,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:34:24,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:34:26,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:34:26,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-10-04 01:34:28,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1478586.6666666667, ans=0.1 2023-10-04 01:34:30,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:34:33,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:34:35,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 01:34:40,807 INFO [train.py:1046] (1/4) Epoch 42, batch 4000, loss[loss=0.1427, simple_loss=0.227, pruned_loss=0.02916, over 24631.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2344, pruned_loss=0.03676, over 4713800.23 frames. ], batch size: 60, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:34:45,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:50,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:57,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:34:57,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:34:57,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:57,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. 
Duration: 0.93 2023-10-04 01:34:59,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:34:59,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 01:34:59,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:34:59,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 01:35:01,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:04,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:35:04,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:35:04,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:35:06,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:35:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:35:08,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:35:08,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. 
Duration: 0.896 2023-10-04 01:35:10,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:35:10,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:13,643 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 01:35:13,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:35:14,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:35:20,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 01:35:20,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:35:23,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:35:25,227 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 01:35:26,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:35:26,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 01:35:27,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:35:27,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:29,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:35:32,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:35:32,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:35:32,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1478853.3333333333, ans=0.1 2023-10-04 01:35:32,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1478853.3333333333, ans=0.125 2023-10-04 01:35:34,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:35:35,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 01:35:35,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:38,406 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 01:35:40,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.64 vs. limit=15.0 2023-10-04 01:35:44,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 01:35:46,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 01:35:48,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:35:48,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:49,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.77 vs. limit=15.0 2023-10-04 01:35:50,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:35:51,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:35:54,038 INFO [train.py:1046] (1/4) Epoch 42, batch 4050, loss[loss=0.1596, simple_loss=0.2485, pruned_loss=0.03533, over 24579.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.236, pruned_loss=0.03733, over 4719299.00 frames. ], batch size: 71, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:35:54,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.27 vs. limit=15.0 2023-10-04 01:35:55,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:58,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:35:58,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. 
Duration: 0.72 2023-10-04 01:36:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:36:01,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:02,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:36:04,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:36:05,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:36:08,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:36:10,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:36:10,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:36:11,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:36:12,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1479053.3333333333, ans=0.0 2023-10-04 01:36:13,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:36:16,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:36:19,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:36:22,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 01:36:23,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 01:36:23,808 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 01:36:24,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1479120.0, ans=0.2 2023-10-04 01:36:26,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:36:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 01:36:33,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:36:36,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:39,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:36:39,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:36:39,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:41,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:36:46,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 01:36:46,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:36:48,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:36:48,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1479186.6666666667, ans=0.125 2023-10-04 01:36:49,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 01:36:50,640 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.894e+02 2.127e+02 2.287e+02 3.562e+02, threshold=4.254e+02, percent-clipped=0.0 2023-10-04 01:36:53,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:36:53,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1479253.3333333333, ans=0.125 2023-10-04 01:36:54,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.83 vs. 
limit=15.0 2023-10-04 01:37:00,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 01:37:01,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1479253.3333333333, ans=0.125 2023-10-04 01:37:02,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:37:02,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:37:04,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 01:37:04,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 01:37:04,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:05,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:37:07,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:07,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:37:08,900 INFO [train.py:1046] (1/4) Epoch 42, batch 4100, loss[loss=0.1443, simple_loss=0.2271, pruned_loss=0.03076, over 24659.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2369, pruned_loss=0.03781, over 4722086.13 frames. 
], batch size: 65, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:37:16,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 01:37:16,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 01:37:17,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 01:37:19,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 01:37:19,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:19,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:20,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:20,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:37:21,730 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 01:37:24,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:37:27,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:37:27,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:37:27,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:37:28,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1479386.6666666667, ans=0.0 2023-10-04 01:37:31,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:37:33,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:37:33,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:37:34,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 01:37:34,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:34,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:37:35,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:37:35,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:37:38,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-10-04 01:37:38,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1479453.3333333333, ans=0.125 2023-10-04 01:37:39,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:37:40,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 01:37:41,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:37:45,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:37:45,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 01:37:46,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:37:47,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1479453.3333333333, ans=0.0 2023-10-04 01:37:48,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:37:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:37:48,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1479453.3333333333, ans=0.125 2023-10-04 01:37:49,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. 
Duration: 0.96 2023-10-04 01:37:50,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1479453.3333333333, ans=0.0 2023-10-04 01:37:51,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:37:51,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:37:53,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 01:37:54,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:54,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:37:56,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:38:01,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:04,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:38:04,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1479520.0, ans=0.125 2023-10-04 01:38:06,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:38:13,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:38:13,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:38:18,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:38:19,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:38:22,316 INFO [train.py:1046] (1/4) Epoch 42, batch 4150, loss[loss=0.1494, simple_loss=0.228, pruned_loss=0.03536, over 24285.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2367, pruned_loss=0.03763, over 4722645.08 frames. ], batch size: 56, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:38:23,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:38:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:38:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:38:25,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:38:26,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.18 vs. limit=15.0 2023-10-04 01:38:29,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 01:38:29,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 01:38:30,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 01:38:30,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 01:38:30,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 01:38:33,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:33,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1479653.3333333333, ans=0.0 2023-10-04 01:38:36,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:38:36,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:41,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:38:42,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:38:42,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:38:45,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:38:45,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 01:38:47,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:38:50,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:53,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:38:54,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 01:38:56,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 01:38:56,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:38:57,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 01:38:57,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:38:57,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:39:01,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:01,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:39:04,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. 
Duration: 0.99 2023-10-04 01:39:08,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:39:10,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:39:10,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 01:39:11,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:39:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 01:39:15,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:39:17,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:39:18,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:19,992 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.917e+02 2.149e+02 2.587e+02 4.183e+02, threshold=4.298e+02, percent-clipped=0.0 2023-10-04 01:39:20,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 01:39:20,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:20,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. 
Duration: 0.463625 2023-10-04 01:39:22,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:39:24,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 01:39:25,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:25,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:39:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:39:26,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 01:39:27,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:39:27,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:39:27,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1479920.0, ans=0.0 2023-10-04 01:39:28,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:39:29,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:29,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. 
Duration: 0.72 2023-10-04 01:39:31,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:39:35,914 INFO [train.py:1046] (1/4) Epoch 42, batch 4200, loss[loss=0.1428, simple_loss=0.2203, pruned_loss=0.03261, over 24466.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2357, pruned_loss=0.03734, over 4729781.17 frames. ], batch size: 58, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:39:37,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:39:37,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 01:39:40,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:39:42,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:39:42,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:39:44,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:39:44,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:39:45,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 01:39:48,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. 
Duration: 0.84 2023-10-04 01:39:50,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:51,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:39:53,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:39:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:39:58,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:39:58,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:59,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 01:39:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:40:01,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:40:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:40:01,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:40:04,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 01:40:06,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 01:40:06,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:40:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:40:11,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:40:14,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:40:15,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:40:16,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:40:16,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 01:40:16,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:40:18,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:40:20,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1480186.6666666667, ans=0.125 2023-10-04 01:40:24,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 01:40:25,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:40:30,210 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=12.0 2023-10-04 01:40:32,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:40:33,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 01:40:37,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:40:42,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:40:43,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:40:44,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 01:40:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:40:50,720 INFO [train.py:1046] (1/4) Epoch 42, batch 4250, loss[loss=0.1274, simple_loss=0.1844, pruned_loss=0.03518, over 19266.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.03721, over 4731613.31 frames. ], batch size: 388, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:40:52,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 01:40:52,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:40:56,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:40:59,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:40:59,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 01:40:59,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:41:03,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:09,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:41:13,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:13,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:14,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:41:14,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:41:17,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:41:17,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:19,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:22,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:41:23,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:41:25,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 01:41:27,339 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.40 vs. limit=15.0 2023-10-04 01:41:29,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 01:41:29,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:29,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:41:29,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:30,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:41:30,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:41:31,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:34,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:41:36,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:41:39,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:41:41,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:41:43,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 01:41:43,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:41:43,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 01:41:44,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:41:45,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:41:48,628 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.009e+02 2.213e+02 2.557e+02 3.155e+02, threshold=4.427e+02, percent-clipped=0.0 2023-10-04 01:41:48,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:41:48,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:41:51,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 01:41:52,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:41:52,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:41:57,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:58,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.92 vs. limit=15.0 2023-10-04 01:41:58,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:42:00,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:42:01,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:42:02,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1480586.6666666667, ans=0.0 2023-10-04 01:42:03,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:42:04,513 INFO [train.py:1046] (1/4) Epoch 42, batch 4300, loss[loss=0.1374, simple_loss=0.2203, pruned_loss=0.02727, over 24604.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2343, pruned_loss=0.03705, over 4724258.60 frames. ], batch size: 60, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:42:04,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:42:05,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:42:05,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 01:42:07,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:42:09,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1480653.3333333333, ans=0.04949747468305833 2023-10-04 01:42:13,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:42:13,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:42:17,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:42:21,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1480720.0, ans=0.0 2023-10-04 01:42:24,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:42:24,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 01:42:24,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:42:26,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:42:26,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:42:26,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 01:42:29,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:42:31,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:42:34,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 01:42:34,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:42:34,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 01:42:37,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:42:39,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 01:42:42,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:42:42,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:42:43,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:42:44,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:42:46,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:42:46,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 01:42:46,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 01:42:49,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:42:52,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:42:52,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:42:52,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:42:54,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 01:42:54,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 01:42:54,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 01:42:54,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 01:42:54,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:42:54,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 01:42:55,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 01:42:59,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:43:00,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 01:43:02,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:43:03,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.85 vs. limit=10.0 2023-10-04 01:43:03,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:03,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:43:03,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1480920.0, ans=0.125 2023-10-04 01:43:06,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 01:43:06,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:43:06,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:07,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:43:07,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:43:07,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:43:08,595 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.66 vs. limit=15.0 2023-10-04 01:43:11,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:43:14,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:16,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:43:16,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:43:19,050 INFO [train.py:1046] (1/4) Epoch 42, batch 4350, loss[loss=0.1269, simple_loss=0.2091, pruned_loss=0.02235, over 21815.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2349, pruned_loss=0.03726, over 4719951.95 frames. ], batch size: 48, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:43:21,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 01:43:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:43:26,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:43:27,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:29,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:43:29,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:43:32,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:43:36,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:40,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.69 vs. 
limit=12.0 2023-10-04 01:43:41,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:43:41,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:43:41,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1481053.3333333333, ans=0.125 2023-10-04 01:43:45,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:43:47,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:43:48,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:43:52,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 01:43:52,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:43:52,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1481120.0, ans=0.0 2023-10-04 01:43:53,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:53,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1481120.0, ans=0.0 2023-10-04 01:43:59,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:44:01,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1481120.0, ans=0.1 2023-10-04 01:44:02,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 01:44:04,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:05,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:44:09,751 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 01:44:11,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:44:14,179 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 01:44:15,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 01:44:15,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:44:15,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:16,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 01:44:17,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:18,232 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.906e+02 2.107e+02 2.374e+02 3.775e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-04 01:44:18,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:44:18,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:44:21,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 01:44:21,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:21,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:22,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:23,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1481253.3333333333, ans=0.125 2023-10-04 01:44:24,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 01:44:24,398 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 01:44:24,405 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-10-04 01:44:24,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 01:44:27,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:44:27,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1481253.3333333333, ans=0.2 2023-10-04 01:44:29,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:44:29,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:44:29,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:44:30,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 01:44:31,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 01:44:31,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:33,260 INFO [train.py:1046] (1/4) Epoch 42, batch 4400, loss[loss=0.171, simple_loss=0.2584, pruned_loss=0.04179, over 24347.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03762, over 4719711.14 frames. ], batch size: 77, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:44:36,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 01:44:36,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:36,875 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.19 vs. limit=15.0 2023-10-04 01:44:38,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:40,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 01:44:40,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 01:44:41,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 01:44:41,054 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 01:44:43,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:44:43,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:44:44,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 01:44:47,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:48,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 01:44:48,421 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 01:44:50,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1481386.6666666667, ans=0.1 2023-10-04 01:44:51,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:44:51,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 01:44:51,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 01:44:54,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 01:44:54,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 01:44:55,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 01:44:55,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:55,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:57,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:59,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:45:00,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 01:45:00,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 01:45:01,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:45:04,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:45:05,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:45:07,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:07,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:45:07,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 01:45:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 01:45:13,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:19,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:45:20,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. 
Duration: 0.78 2023-10-04 01:45:23,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:45:26,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:45:26,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1481520.0, ans=0.0 2023-10-04 01:45:28,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:45:29,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 01:45:29,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:45:29,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:45:29,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:45:31,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:45:34,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 01:45:36,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 01:45:38,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. 
Duration: 0.88 2023-10-04 01:45:38,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:45:38,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 01:45:38,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:45:41,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:45:44,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 01:45:47,216 INFO [train.py:1046] (1/4) Epoch 42, batch 4450, loss[loss=0.1628, simple_loss=0.2434, pruned_loss=0.04109, over 23934.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2358, pruned_loss=0.03748, over 4725958.96 frames. ], batch size: 86, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:45:47,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:45:50,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:50,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:45:54,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.33 vs. limit=15.0 2023-10-04 01:45:57,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:45:57,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:45:59,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:02,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:46:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:46:06,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:46:06,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 01:46:06,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:46:06,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1481720.0, ans=0.1 2023-10-04 01:46:08,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:08,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:46:08,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:46:11,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 01:46:11,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1481720.0, ans=0.125 2023-10-04 01:46:15,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:15,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:17,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:46:17,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:46:17,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1481786.6666666667, ans=0.1 2023-10-04 01:46:19,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:46:20,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1481786.6666666667, ans=0.125 2023-10-04 01:46:23,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:46:23,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 01:46:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. 
Duration: 0.86 2023-10-04 01:46:24,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:46:29,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:46:29,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 01:46:31,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:46:32,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1481853.3333333333, ans=0.0 2023-10-04 01:46:36,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:36,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 01:46:36,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:36,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:46:36,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:46:36,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:46:38,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:46:41,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:46:42,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 01:46:44,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:46:44,357 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:46:45,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:46:46,653 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.039e+02 2.233e+02 2.622e+02 3.908e+02, threshold=4.466e+02, percent-clipped=0.0 2023-10-04 01:46:46,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:46:50,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:50,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:46:54,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:46:56,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 01:46:58,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 01:47:01,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1481986.6666666667, ans=0.2 2023-10-04 01:47:02,288 INFO [train.py:1046] (1/4) Epoch 42, batch 4500, loss[loss=0.1575, simple_loss=0.2487, pruned_loss=0.0332, over 24655.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2371, pruned_loss=0.03766, over 4729398.19 frames. ], batch size: 68, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:47:02,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:47:02,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1481986.6666666667, ans=0.125 2023-10-04 01:47:03,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 01:47:03,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 01:47:05,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:47:09,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:47:11,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:47:12,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:47:12,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:47:12,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:13,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:25,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:47:26,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:47:29,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:47:31,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:47:31,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:47:38,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:47:42,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:47:46,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:47:49,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:47:49,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. 
Duration: 0.98 2023-10-04 01:47:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:47:50,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:47:53,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:47:53,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:47:55,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:56,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 01:47:56,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:47:56,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:00,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:48:00,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:48:01,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:04,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 01:48:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:48:07,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 01:48:08,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 01:48:08,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 01:48:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 01:48:16,357 INFO [train.py:1046] (1/4) Epoch 42, batch 4550, loss[loss=0.151, simple_loss=0.2257, pruned_loss=0.03814, over 18974.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03744, over 4721566.57 frames. ], batch size: 41, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:48:16,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 01:48:16,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:48:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:48:20,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:48:22,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:48:23,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.92 vs. limit=22.5 2023-10-04 01:48:27,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:48:31,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:48:31,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:48:31,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:48:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:34,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:48:34,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:48:38,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:48:41,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 01:48:41,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-10-04 01:48:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:48:44,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 01:48:47,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 01:48:47,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:48:50,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 01:48:52,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:48:55,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:55,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:56,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:48:56,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1482453.3333333333, ans=0.0 2023-10-04 01:48:57,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 01:49:01,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:49:04,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:05,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:49:05,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1482520.0, ans=0.0 2023-10-04 01:49:06,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:49:07,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 01:49:08,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 01:49:08,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:49:09,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 01:49:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 01:49:12,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:49:12,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1482520.0, ans=0.125 2023-10-04 01:49:14,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:49:14,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:49:15,353 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 1.972e+02 2.159e+02 2.425e+02 3.623e+02, threshold=4.318e+02, percent-clipped=0.0 2023-10-04 01:49:15,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:15,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:49:17,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:49:17,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1482586.6666666667, ans=0.2 2023-10-04 01:49:18,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 01:49:20,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:49:20,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 01:49:20,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 01:49:20,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 01:49:20,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 01:49:23,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:49:23,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:49:26,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:49:26,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:26,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:49:27,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:49:30,393 INFO [train.py:1046] (1/4) Epoch 42, batch 4600, loss[loss=0.1546, simple_loss=0.2459, pruned_loss=0.03168, over 24654.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2352, pruned_loss=0.03737, over 4724736.22 frames. ], batch size: 68, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:49:30,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 01:49:30,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1482653.3333333333, ans=0.0 2023-10-04 01:49:30,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1482653.3333333333, ans=0.1 2023-10-04 01:49:33,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:34,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:49:36,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:49:36,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:49:37,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:49:38,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 01:49:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:49:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:49:44,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:49:47,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:49:49,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1482720.0, ans=0.0 2023-10-04 01:49:53,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 01:49:54,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:56,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:58,705 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.29 vs. limit=12.0 2023-10-04 01:50:00,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:50:00,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:50:02,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1482786.6666666667, ans=0.0 2023-10-04 01:50:05,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 01:50:05,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:50:05,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:50:12,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:50:12,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:50:14,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:50:18,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 01:50:20,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:50:24,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:25,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:50:29,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:29,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 01:50:29,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1482920.0, ans=0.125 2023-10-04 01:50:30,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:30,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 01:50:30,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:50:30,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:33,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:34,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:50:35,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:35,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 01:50:36,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 01:50:37,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 01:50:37,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:37,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:50:39,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:40,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:45,196 INFO [train.py:1046] (1/4) Epoch 42, batch 4650, loss[loss=0.1583, simple_loss=0.236, pruned_loss=0.04026, over 23748.00 frames. 
], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03718, over 4719482.26 frames. ], batch size: 135, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:50:51,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:50:51,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1482986.6666666667, ans=0.125 2023-10-04 01:50:52,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:50:52,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:52,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:50:52,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:52,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:50:54,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:56,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 01:51:01,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:51:02,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. 
Duration: 0.88 2023-10-04 01:51:02,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:51:03,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 01:51:03,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:51:03,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 01:51:03,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 01:51:03,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:04,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:51:04,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.49 vs. limit=12.0 2023-10-04 01:51:09,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:51:11,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:11,295 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 01:51:14,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:51:15,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 01:51:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:17,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:51:18,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 01:51:20,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:51:22,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:51:26,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:51:27,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1483186.6666666667, ans=0.125 2023-10-04 01:51:30,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:32,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:32,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:33,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 01:51:33,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-10-04 01:51:34,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1483186.6666666667, ans=0.05 2023-10-04 01:51:37,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 01:51:37,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 01:51:37,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 01:51:37,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 01:51:40,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:51:45,017 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.878e+02 2.086e+02 2.466e+02 3.529e+02, threshold=4.172e+02, percent-clipped=0.0 2023-10-04 01:51:47,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:51:47,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:51:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. 
Duration: 0.92 2023-10-04 01:51:49,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:51:51,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:51:51,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:51:52,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:51:55,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:51:55,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:51:56,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:57,876 INFO [train.py:1046] (1/4) Epoch 42, batch 4700, loss[loss=0.14, simple_loss=0.2265, pruned_loss=0.02673, over 24440.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2349, pruned_loss=0.03738, over 4712633.09 frames. ], batch size: 63, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:51:59,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:52:01,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:52:01,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 01:52:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 01:52:02,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:52:04,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 01:52:06,609 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.36 vs. limit=15.0 2023-10-04 01:52:11,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:13,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:52:14,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:52:14,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:52:14,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:52:18,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 01:52:20,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 01:52:20,999 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.90 vs. 
limit=12.0 2023-10-04 01:52:22,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:23,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:52:23,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:52:25,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:30,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:52:31,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1483453.3333333333, ans=0.125 2023-10-04 01:52:32,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:52:32,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:52:36,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.59 vs. limit=15.0 2023-10-04 01:52:39,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 01:52:39,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 01:52:42,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:52:46,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 01:52:48,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:52:52,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:52:54,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 01:52:56,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:52:56,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:52:58,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:58,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:52:58,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 01:53:00,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 01:53:01,217 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.15 vs. 
limit=15.0 2023-10-04 01:53:01,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:53:01,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1483586.6666666667, ans=0.125 2023-10-04 01:53:04,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:04,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:04,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 01:53:05,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:08,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 01:53:10,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1483653.3333333333, ans=0.1 2023-10-04 01:53:11,542 INFO [train.py:1046] (1/4) Epoch 42, batch 4750, loss[loss=0.1342, simple_loss=0.2139, pruned_loss=0.02727, over 24320.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2357, pruned_loss=0.03795, over 4701192.45 frames. ], batch size: 56, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:53:11,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:53:12,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:53:13,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1483653.3333333333, ans=0.0 2023-10-04 01:53:16,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1483653.3333333333, ans=0.125 2023-10-04 01:53:17,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:17,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:53:20,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 01:53:20,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:53:22,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.72 vs. limit=6.0 2023-10-04 01:53:23,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 01:53:26,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:53:26,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:53:28,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:53:33,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 01:53:37,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:53:39,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 01:53:40,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:53:43,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:53:43,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:53:43,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:44,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 01:53:44,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 01:53:48,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1483786.6666666667, ans=0.125 2023-10-04 01:53:50,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 01:53:52,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:53:54,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:53:56,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:53:56,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 01:53:56,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:01,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:54:04,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:54:05,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 01:54:06,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 01:54:06,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:54:06,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:54:07,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:08,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-04 01:54:08,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 01:54:09,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 01:54:11,197 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.906e+02 2.067e+02 2.489e+02 4.239e+02, threshold=4.133e+02, percent-clipped=1.0 2023-10-04 01:54:11,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:12,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:54:12,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 01:54:14,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:54:15,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:15,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:54:15,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:17,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:54:21,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 01:54:21,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 01:54:23,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 01:54:23,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 01:54:24,968 INFO [train.py:1046] (1/4) Epoch 42, batch 4800, loss[loss=0.153, simple_loss=0.2371, pruned_loss=0.03446, over 24478.00 frames. ], tot_loss[loss=0.156, simple_loss=0.236, pruned_loss=0.03798, over 4690183.80 frames. ], batch size: 69, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:54:28,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:54:29,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:54:31,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 01:54:35,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:35,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:35,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1483986.6666666667, ans=0.125 2023-10-04 01:54:40,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 01:54:42,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:42,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:42,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 01:54:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:54:43,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:54:45,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:54:48,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:54:50,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:50,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:54:51,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:51,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 01:54:51,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 01:54:53,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:59,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:55:00,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:55:00,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:55:00,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:55:03,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:03,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 01:55:03,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 01:55:06,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:06,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:55:06,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 01:55:06,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:55:06,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:55:06,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1484120.0, ans=0.0 2023-10-04 01:55:10,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:55:10,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:55:13,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:55:16,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:17,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:17,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1484186.6666666667, ans=0.2 2023-10-04 01:55:20,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 01:55:20,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:55:21,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 01:55:21,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:55:21,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:21,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1484186.6666666667, ans=0.0 2023-10-04 01:55:27,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:55:27,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:55:27,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:28,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:55:28,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:55:30,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:55:33,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:33,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:34,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:55:34,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 01:55:34,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1484253.3333333333, ans=0.125 2023-10-04 01:55:37,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 01:55:37,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:55:37,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:55:38,762 INFO [train.py:1046] (1/4) Epoch 42, batch 4850, loss[loss=0.1663, simple_loss=0.2561, pruned_loss=0.03826, over 24671.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2363, pruned_loss=0.03792, over 4696453.68 frames. ], batch size: 73, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:55:38,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:55:38,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:40,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:48,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 01:55:50,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 01:55:50,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1484320.0, ans=0.2 2023-10-04 01:55:53,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:55:53,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:55:53,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:57,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:59,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:56:01,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:56:01,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 01:56:06,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:56:09,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:56:09,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:56:09,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 01:56:09,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 01:56:10,266 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-10-04 01:56:11,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1484453.3333333333, ans=0.0 2023-10-04 01:56:12,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:56:12,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1484453.3333333333, ans=0.0 2023-10-04 01:56:13,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:16,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:16,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 01:56:16,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 01:56:17,239 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.28 vs. limit=10.0 2023-10-04 01:56:18,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 01:56:25,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:56:27,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 01:56:27,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:56:28,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:56:29,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1484520.0, ans=0.125 2023-10-04 01:56:30,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:56:31,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 01:56:31,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:33,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 01:56:33,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:56:33,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:56:35,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. 
Duration: 0.73 2023-10-04 01:56:40,323 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.976e+02 2.279e+02 2.618e+02 4.353e+02, threshold=4.559e+02, percent-clipped=2.0 2023-10-04 01:56:44,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:46,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1484586.6666666667, ans=0.1 2023-10-04 01:56:46,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1484586.6666666667, ans=0.125 2023-10-04 01:56:48,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:56:48,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:56:51,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.62 vs. limit=8.0 2023-10-04 01:56:51,971 INFO [train.py:1046] (1/4) Epoch 42, batch 4900, loss[loss=0.1409, simple_loss=0.2067, pruned_loss=0.03758, over 23528.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03758, over 4706529.67 frames. ], batch size: 256, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:56:54,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 01:56:54,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:56:59,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 01:56:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:57:01,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:57:04,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 01:57:08,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 01:57:11,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1484720.0, ans=0.125 2023-10-04 01:57:12,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 01:57:14,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 01:57:14,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:57:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:57:15,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:57:15,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:57:15,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 01:57:16,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 01:57:18,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 01:57:19,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:57:21,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:57:21,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:57:24,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:57:24,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:57:26,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:57:26,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 01:57:26,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1484786.6666666667, ans=0.125 2023-10-04 01:57:28,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:57:28,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 01:57:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 01:57:28,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 01:57:33,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 01:57:34,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:57:34,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1484786.6666666667, ans=0.0 2023-10-04 01:57:36,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:57:36,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:57:37,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:57:37,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 01:57:37,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:57:37,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 01:57:40,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 01:57:40,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1484853.3333333333, ans=0.125 2023-10-04 01:57:41,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:57:43,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:57:45,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 01:57:47,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:57:49,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:57:49,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 01:57:54,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:57:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:57:57,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1484920.0, ans=0.0 2023-10-04 01:57:58,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 01:57:59,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 01:57:59,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:58:01,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:04,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1484920.0, ans=0.125 2023-10-04 01:58:05,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:58:07,048 INFO [train.py:1046] (1/4) Epoch 42, batch 4950, loss[loss=0.1473, simple_loss=0.2342, pruned_loss=0.03018, over 24645.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2348, pruned_loss=0.03743, over 4713870.02 frames. ], batch size: 68, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:58:07,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:58:07,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:58:07,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 01:58:08,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:58:12,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:58:12,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 01:58:14,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 01:58:14,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 01:58:14,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:58:15,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 01:58:16,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:16,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:58:16,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:58:16,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:19,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:20,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:58:22,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:58:22,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 01:58:24,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:24,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:58:28,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:58:33,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:33,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:58:35,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:36,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:36,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:58:39,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 01:58:39,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 01:58:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:43,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 01:58:43,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:58:43,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:58:45,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:58:46,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:58:46,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:49,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:58:51,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:58:54,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:54,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:55,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 01:58:55,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:58:57,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 01:59:00,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:59:03,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:59:03,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:59:03,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:59:03,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:59:05,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:59:06,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:59:07,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:59:07,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:59:09,058 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.943e+02 2.204e+02 2.524e+02 4.078e+02, threshold=4.408e+02, percent-clipped=0.0 2023-10-04 01:59:09,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-10-04 01:59:10,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1485253.3333333333, ans=0.0 2023-10-04 01:59:12,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:16,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 01:59:16,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:59:20,895 INFO [train.py:1046] (1/4) Epoch 42, batch 5000, loss[loss=0.1578, simple_loss=0.235, pruned_loss=0.04036, over 14246.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2339, pruned_loss=0.03727, over 4695066.61 frames. ], batch size: 30, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:59:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:59:24,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:59:25,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 01:59:27,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 01:59:28,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:59:30,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-04 01:59:31,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:59:31,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:59:31,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 01:59:33,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:59:33,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:59:36,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 01:59:36,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:36,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:59:37,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 01:59:37,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 01:59:39,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:59:39,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-04 01:59:40,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:59:40,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:40,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:59:40,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 01:59:40,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 01:59:40,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1485386.6666666667, ans=0.04949747468305833 2023-10-04 01:59:43,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 01:59:43,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:59:44,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:44,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 01:59:45,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:59:49,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 01:59:49,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:49,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:59:49,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1485453.3333333333, ans=0.0 2023-10-04 01:59:52,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 01:59:52,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:59:53,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:59:54,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1485453.3333333333, ans=0.0 2023-10-04 01:59:57,852 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 02:00:02,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:00:02,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:00:02,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:08,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-10-04 02:00:08,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:00:08,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:00:08,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:00:11,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 02:00:11,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:00:14,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:00:15,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:00:19,433 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=12.0 2023-10-04 02:00:19,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 02:00:23,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:31,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:00:32,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.56 vs. limit=15.0 2023-10-04 02:00:34,920 INFO [train.py:1046] (1/4) Epoch 42, batch 5050, loss[loss=0.1706, simple_loss=0.2563, pruned_loss=0.04245, over 23712.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2349, pruned_loss=0.03735, over 4718667.08 frames. ], batch size: 85, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:00:35,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:35,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:00:35,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:00:35,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:00:35,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:00:36,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:41,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:41,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 02:00:42,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 02:00:43,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:00:45,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:00:46,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 02:00:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:00:47,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:00:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:00:52,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:00:53,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:00:55,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1485720.0, ans=0.0 2023-10-04 02:01:00,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 02:01:01,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:01:02,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 02:01:04,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 02:01:04,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:01:06,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:06,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:06,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:01:06,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 02:01:07,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 02:01:08,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:11,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:13,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:13,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 02:01:16,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 02:01:18,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 02:01:18,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:01:19,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:01:19,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:01:19,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:01:22,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:01:24,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:01:25,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:26,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:01:26,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:01:26,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 02:01:28,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 02:01:30,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:01:33,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1485920.0, ans=10.0 2023-10-04 02:01:34,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:01:36,273 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 02:01:36,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:01:37,597 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.948e+02 2.169e+02 2.465e+02 3.458e+02, threshold=4.337e+02, percent-clipped=0.0 2023-10-04 02:01:37,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:01:37,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:37,769 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 02:01:40,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:40,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 02:01:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:01:43,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:01:45,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:45,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 02:01:46,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 02:01:48,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:48,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:01:49,450 INFO [train.py:1046] (1/4) Epoch 42, batch 5100, loss[loss=0.1644, simple_loss=0.2305, pruned_loss=0.0492, over 22785.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2361, pruned_loss=0.0375, over 4716235.55 frames. ], batch size: 322, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:01:49,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:01:52,243 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 02:01:53,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:56,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-10-04 02:01:56,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 02:01:58,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:59,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:02:02,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:02:02,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 02:02:04,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 02:02:08,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:02:09,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:02:11,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:02:16,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 02:02:16,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:02:17,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:02:17,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 02:02:19,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:20,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:20,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 02:02:22,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 02:02:22,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:22,885 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.55 vs. limit=12.0 2023-10-04 02:02:24,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 02:02:24,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 02:02:28,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:02:34,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:02:35,495 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.51 vs. 
limit=15.0 2023-10-04 02:02:36,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 02:02:37,517 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 02:02:37,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 02:02:38,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 02:02:38,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:40,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 02:02:43,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 02:02:46,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:02:47,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:02:49,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 02:02:50,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:02:52,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. 
Duration: 0.86 2023-10-04 02:02:57,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:02:57,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:02:57,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:02:59,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:02:59,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:02:59,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:03:01,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 02:03:01,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 02:03:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 02:03:03,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:03:03,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 02:03:04,531 INFO [train.py:1046] (1/4) Epoch 42, batch 5150, loss[loss=0.1616, simple_loss=0.2403, pruned_loss=0.0415, over 23249.00 frames. 
], tot_loss[loss=0.1563, simple_loss=0.2367, pruned_loss=0.03798, over 4716590.19 frames. ], batch size: 105, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:03:05,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:05,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 02:03:08,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:10,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:14,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:03:14,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 02:03:17,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:17,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:03:20,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:03:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:03:20,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:03:21,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:03:21,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:03:21,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 02:03:22,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1486386.6666666667, ans=0.125 2023-10-04 02:03:23,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:03:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:03:23,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 02:03:25,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 02:03:27,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:03:31,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1486453.3333333333, ans=0.0 2023-10-04 02:03:33,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:03:34,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-10-04 02:03:39,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:03:44,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:03:44,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:48,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:03:48,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:03:48,837 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=22.5 2023-10-04 02:03:50,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 02:03:51,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1486520.0, ans=0.125 2023-10-04 02:03:55,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:56,119 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.06 vs. limit=6.0 2023-10-04 02:03:56,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 02:03:56,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:03:57,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1486520.0, ans=0.0 2023-10-04 02:03:59,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:01,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:04:02,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 02:04:05,033 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.977e+02 2.120e+02 2.430e+02 3.829e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-04 02:04:08,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:04:08,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1486586.6666666667, ans=0.125 2023-10-04 02:04:09,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:04:11,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:04:11,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:04:13,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 02:04:13,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:04:13,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:04:13,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:04:16,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:04:16,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:04:17,989 INFO [train.py:1046] (1/4) Epoch 42, batch 5200, loss[loss=0.1595, simple_loss=0.2404, pruned_loss=0.03932, over 24338.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2375, pruned_loss=0.03828, over 4709905.74 frames. ], batch size: 56, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 02:04:19,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:23,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 02:04:23,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:04:25,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:04:27,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1486653.3333333333, ans=0.0 2023-10-04 02:04:28,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:28,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:04:29,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:29,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 02:04:32,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:04:33,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:35,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 02:04:37,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:04:39,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:04:39,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 02:04:40,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. 
Duration: 0.91 2023-10-04 02:04:42,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 02:04:43,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:43,795 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 02:04:43,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:45,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:04:45,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:04:46,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 02:04:46,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:04:49,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:49,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1486786.6666666667, ans=0.2 2023-10-04 02:04:52,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 02:04:52,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. 
Duration: 0.7 2023-10-04 02:04:52,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 02:04:55,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1486786.6666666667, ans=0.04949747468305833 2023-10-04 02:04:56,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1486786.6666666667, ans=0.0 2023-10-04 02:04:58,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 02:04:59,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:05:06,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:05:06,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:09,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 02:05:09,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:05:09,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:05:09,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:11,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 02:05:14,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:05:15,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:05:18,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:05:19,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:19,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:23,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:24,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 02:05:24,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:05:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:05:27,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:29,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:05:30,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 02:05:31,692 INFO [train.py:1046] (1/4) Epoch 42, batch 5250, loss[loss=0.1597, simple_loss=0.241, pruned_loss=0.03922, over 23381.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2366, pruned_loss=0.03793, over 4713799.48 frames. ], batch size: 119, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:05:33,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:05:37,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:37,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:05:38,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:05:43,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:45,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:05:46,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:05:47,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:05:51,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 02:05:51,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:05:51,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:16,827 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-10-04 02:06:19,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.14 vs. limit=15.0 2023-10-04 02:06:29,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1487253.3333333333, ans=10.0 2023-10-04 02:06:30,435 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 1.997e+02 2.174e+02 2.692e+02 4.160e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 02:06:32,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1487253.3333333333, ans=0.04949747468305833 2023-10-04 02:06:33,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1487253.3333333333, ans=0.125 2023-10-04 02:06:39,932 INFO [train.py:1046] (1/4) Epoch 42, batch 5300, loss[loss=0.1418, simple_loss=0.2064, pruned_loss=0.0386, over 23568.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03797, over 4712811.24 frames. ], batch size: 256, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:06:54,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:06:54,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-10-04 02:06:54,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 02:06:54,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:54,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:54,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:54,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:54,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:54,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:06:54,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:54,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:06:55,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:06:55,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 02:06:55,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-10-04 02:06:55,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 02:06:55,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:06:55,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 02:06:55,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 02:06:55,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:56,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:56,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:56,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:06:56,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:06:56,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:06:56,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:56,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 02:06:56,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:57,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:57,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:06:57,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:57,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:06:57,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 02:06:57,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:06:57,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:58,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 02:06:58,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 02:06:58,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:06:58,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:06:58,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 02:06:58,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 02:06:58,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:06:59,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:06:59,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:06:59,393 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 02:06:59,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 02:06:59,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:06:59,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:59,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 02:06:59,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 02:06:59,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-10-04 02:06:59,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:07:06,279 INFO [train.py:1046] (1/4) Epoch 43, batch 0, loss[loss=0.1586, simple_loss=0.2361, pruned_loss=0.04061, over 23650.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2361, pruned_loss=0.04061, over 23650.00 frames. ], batch size: 256, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:07:06,280 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 02:07:13,468 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.2600, 4.3760, 5.0814, 4.6813], device='cuda:1') 2023-10-04 02:07:17,997 INFO [train.py:1078] (1/4) Epoch 43, validation: loss=0.318, simple_loss=0.2688, pruned_loss=0.1836, over 1125622.00 frames. 2023-10-04 02:07:17,998 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 02:07:18,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 02:07:18,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:07:18,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1487400.0, ans=0.125 2023-10-04 02:07:19,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:07:25,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:26,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 02:07:26,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:26,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 02:07:27,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 02:07:30,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:30,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1487466.6666666667, ans=0.0 2023-10-04 02:07:31,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:34,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:34,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:36,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:07:36,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:07:37,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 02:07:38,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 02:07:46,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:07:48,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:49,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 02:07:54,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:07:54,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:07:55,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:01,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:08:05,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:09,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 02:08:09,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.71 vs. limit=10.0 2023-10-04 02:08:13,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 02:08:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 02:08:14,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:14,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:08:16,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:08:18,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 02:08:21,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:23,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:25,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:08:27,576 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-10-04 02:08:28,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 02:08:29,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:08:31,050 INFO [train.py:1046] (1/4) Epoch 43, batch 50, loss[loss=0.1644, simple_loss=0.2487, pruned_loss=0.03999, over 23634.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2351, pruned_loss=0.03692, over 1074096.76 frames. 
], batch size: 85, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:08:32,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:08:35,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:08:35,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 02:08:36,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:08:37,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:08:39,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:08:39,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:08:42,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:08:43,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 02:08:43,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:51,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 02:08:52,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 02:08:54,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 02:08:55,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:08:57,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:08:57,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:59,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:09:00,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:09:00,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:09:00,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:09:06,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:09:06,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1487866.6666666667, ans=0.2 2023-10-04 02:09:07,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 02:09:07,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:09:08,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 02:09:11,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:09:11,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1487866.6666666667, ans=0.125 2023-10-04 02:09:13,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:09:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 02:09:13,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:09:14,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 02:09:14,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1487933.3333333333, ans=0.0 2023-10-04 02:09:15,780 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.079e+02 2.255e+02 2.464e+02 4.467e+02, threshold=4.509e+02, percent-clipped=1.0 2023-10-04 02:09:21,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:09:23,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:09:24,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:26,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:09:26,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:09:29,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 02:09:29,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 02:09:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:32,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:09:33,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:09:33,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:09:33,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 02:09:34,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1488000.0, ans=0.125 2023-10-04 02:09:35,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-10-04 02:09:35,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 02:09:36,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:09:36,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:09:38,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 02:09:38,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 02:09:38,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1488000.0, ans=0.125 2023-10-04 02:09:40,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:09:40,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:42,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:09:42,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:09:44,787 INFO [train.py:1046] (1/4) Epoch 43, batch 100, loss[loss=0.1624, simple_loss=0.2366, pruned_loss=0.04416, over 23730.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.237, pruned_loss=0.03725, over 1884320.22 frames. 
], batch size: 164, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:09:44,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:09:48,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:09:50,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:09:52,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 02:09:52,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:55,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:09:55,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:09:55,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:55,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:09:57,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:09:57,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 02:10:00,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 02:10:00,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:01,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:01,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:10:04,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 02:10:05,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:07,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:07,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:10:10,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:10:12,899 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 02:10:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 02:10:14,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:10:14,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 02:10:17,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.30 vs. limit=15.0 2023-10-04 02:10:18,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:10:19,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:21,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:27,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1488200.0, ans=0.2 2023-10-04 02:10:28,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:28,135 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 02:10:30,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 02:10:35,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:10:36,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:10:39,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:42,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:10:46,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:10:48,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:10:49,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:50,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:52,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:10:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:52,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 02:10:52,433 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 02:10:52,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:53,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:10:55,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:10:55,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:10:55,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 02:10:56,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.93 vs. limit=15.0 2023-10-04 02:10:57,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:10:57,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:10:57,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:10:58,914 INFO [train.py:1046] (1/4) Epoch 43, batch 150, loss[loss=0.1735, simple_loss=0.2428, pruned_loss=0.05215, over 23800.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2381, pruned_loss=0.0379, over 2521734.41 frames. ], batch size: 164, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:10:59,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:00,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:00,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:11:00,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:11:03,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:06,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:11:06,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:06,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1488400.0, ans=0.2 2023-10-04 02:11:07,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:12,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:11:13,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:13,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1488466.6666666667, ans=0.2 2023-10-04 02:11:14,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:11:16,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:19,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 02:11:20,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. 
Duration: 0.52 2023-10-04 02:11:20,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 02:11:21,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:11:21,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:11:23,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:11:24,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:11:25,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:25,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:27,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:29,306 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 02:11:32,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:35,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:38,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.97 vs. 
limit=15.0 2023-10-04 02:11:39,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:11:40,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 02:11:43,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:11:43,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:43,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:11:43,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1488600.0, ans=0.125 2023-10-04 02:11:44,687 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.926e+02 2.079e+02 2.360e+02 3.858e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-04 02:11:44,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:11:46,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:11:46,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:11:47,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:49,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-10-04 02:11:53,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:54,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:11:54,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:11:54,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:11:57,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:59,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 02:12:02,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:12:03,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:12:04,433 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-10-04 02:12:06,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:09,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 02:12:09,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 02:12:09,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:12:10,552 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 02:12:12,462 INFO [train.py:1046] (1/4) Epoch 43, batch 200, loss[loss=0.1603, simple_loss=0.2348, pruned_loss=0.04285, over 23695.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2385, pruned_loss=0.03779, over 2999844.53 frames. ], batch size: 164, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:12:13,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:12:18,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:12:18,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:12:20,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-10-04 02:12:20,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 02:12:21,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1488733.3333333333, ans=0.07 2023-10-04 02:12:22,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 02:12:22,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:22,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1488733.3333333333, ans=10.0 2023-10-04 02:12:23,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 02:12:26,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:12:26,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1488800.0, ans=0.0 2023-10-04 02:12:27,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:27,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:12:30,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:12:31,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:12:31,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:12:48,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:12:48,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:12:48,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1488866.6666666667, ans=0.125 2023-10-04 02:12:49,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:12:51,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:12:51,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:12:53,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:12:54,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1488933.3333333333, ans=0.2 2023-10-04 02:12:55,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:12:56,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:56,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:12:57,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1488933.3333333333, ans=0.95 2023-10-04 02:12:58,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 02:12:58,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:12:58,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:01,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:13:01,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1488933.3333333333, ans=0.125 2023-10-04 02:13:07,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:13:10,942 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:13:11,161 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.79 vs. limit=15.0 2023-10-04 02:13:15,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:15,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 02:13:22,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:23,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1489066.6666666667, ans=0.1 2023-10-04 02:13:24,821 INFO [train.py:1046] (1/4) Epoch 43, batch 250, loss[loss=0.1537, simple_loss=0.2438, pruned_loss=0.03184, over 24644.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2388, pruned_loss=0.03785, over 3378378.78 frames. ], batch size: 68, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:13:24,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 02:13:24,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:24,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:13:24,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:13:25,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:13:26,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 02:13:27,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:13:27,790 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-04 02:13:29,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1489066.6666666667, ans=0.0 2023-10-04 02:13:30,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:31,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:13:33,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:35,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:38,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:13:38,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:39,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:13:42,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:13:50,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1489133.3333333333, ans=0.09899494936611666 2023-10-04 02:13:51,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:13:54,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:13:54,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:14:01,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:14:01,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:14:02,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:14:04,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:14:04,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:14:04,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:14:06,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:14:07,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:14:10,525 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.128e+02 2.316e+02 2.574e+02 3.711e+02, threshold=4.632e+02, percent-clipped=0.0 2023-10-04 02:14:11,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. 
Duration: 0.59 2023-10-04 02:14:11,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:14:13,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:14:13,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:14:15,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:14:15,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:14:16,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:14:16,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:14:18,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:19,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:14:19,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1489266.6666666667, ans=0.0 2023-10-04 02:14:20,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:14:21,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1489266.6666666667, ans=0.1 2023-10-04 02:14:22,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1489333.3333333333, ans=0.0 2023-10-04 02:14:23,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:14:24,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1489333.3333333333, ans=0.0 2023-10-04 02:14:26,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:30,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:14:33,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:35,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:14:35,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1489333.3333333333, ans=0.2 2023-10-04 02:14:38,801 INFO [train.py:1046] (1/4) Epoch 43, batch 300, loss[loss=0.1432, simple_loss=0.2091, pruned_loss=0.03862, over 23441.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2359, pruned_loss=0.0371, over 3687964.87 frames. ], batch size: 285, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:14:38,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. 
Duration: 0.91 2023-10-04 02:14:39,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1489400.0, ans=0.125 2023-10-04 02:14:40,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:14:40,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:14:41,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 02:14:41,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:14:43,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:14:43,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 02:14:46,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1489400.0, ans=0.1 2023-10-04 02:14:47,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:49,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:14:53,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:14:53,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. 
Duration: 0.99 2023-10-04 02:14:54,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:56,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:14:56,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 02:14:57,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:15:04,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:15:04,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 02:15:08,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 02:15:08,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:10,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:10,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1489533.3333333333, ans=0.0 2023-10-04 02:15:11,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:15:11,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 02:15:11,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:15:14,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:15:14,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1489533.3333333333, ans=0.2 2023-10-04 02:15:16,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:15:16,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:15:21,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:15:21,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 02:15:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:15:24,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:27,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 02:15:28,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 02:15:31,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:15:34,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:15:34,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 02:15:35,162 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.09 vs. limit=15.0 2023-10-04 02:15:37,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:37,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:15:40,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:40,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1489666.6666666667, ans=0.015 2023-10-04 02:15:42,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:15:42,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1489666.6666666667, ans=0.0 2023-10-04 02:15:43,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. 
Duration: 0.64 2023-10-04 02:15:43,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:15:43,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:15:44,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 02:15:46,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:46,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:15:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:49,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:15:49,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1489666.6666666667, ans=0.125 2023-10-04 02:15:50,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:15:52,149 INFO [train.py:1046] (1/4) Epoch 43, batch 350, loss[loss=0.1395, simple_loss=0.2123, pruned_loss=0.03341, over 23818.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03687, over 3921163.44 frames. ], batch size: 212, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:15:53,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:15:53,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 02:15:53,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1489733.3333333333, ans=0.125 2023-10-04 02:15:56,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:02,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:16:02,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:02,850 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:16:03,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:05,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 02:16:06,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:16:07,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 02:16:09,079 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:16:10,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:16:11,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 02:16:11,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:16:14,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 02:16:16,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:16:17,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:16:18,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:16:21,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:21,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:21,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:16:21,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:22,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:16:25,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 02:16:25,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:30,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1489866.6666666667, ans=0.125 2023-10-04 02:16:31,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:16:31,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:16:33,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:16:33,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:37,241 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.923e+02 2.044e+02 2.262e+02 2.758e+02, threshold=4.089e+02, percent-clipped=0.0 2023-10-04 02:16:39,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 02:16:39,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:44,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:44,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:16:45,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:16:47,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 02:16:49,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:16:50,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1490000.0, ans=0.2 2023-10-04 02:16:51,108 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 02:16:51,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 02:16:52,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:53,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:16:53,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 02:16:55,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:16:56,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:17:00,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:17:01,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:01,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:17:02,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:17:05,352 INFO [train.py:1046] (1/4) Epoch 43, batch 400, loss[loss=0.1582, simple_loss=0.2492, pruned_loss=0.0336, over 24440.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2341, pruned_loss=0.03696, over 4088561.92 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:17:07,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:17:08,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:17:09,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 02:17:09,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:11,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:12,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:17:12,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:17:16,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:17,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:19,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 02:17:21,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 02:17:21,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:23,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 02:17:23,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:26,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:17:26,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:17:26,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 02:17:26,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:17:27,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:17:27,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:17:27,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:31,016 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 02:17:31,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 02:17:36,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:37,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:37,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 02:17:39,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 02:17:43,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:17:47,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:17:53,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. 
Duration: 0.83 2023-10-04 02:17:53,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=1490266.6666666667, ans=0.02 2023-10-04 02:17:56,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:17:57,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 02:17:58,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:18:00,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:18:02,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 02:18:05,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:18:07,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:18:09,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:18:11,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:12,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. 
Duration: 0.92 2023-10-04 02:18:15,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1490333.3333333333, ans=0.0 2023-10-04 02:18:16,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:18:17,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 02:18:19,705 INFO [train.py:1046] (1/4) Epoch 43, batch 450, loss[loss=0.1415, simple_loss=0.2225, pruned_loss=0.03022, over 24501.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2345, pruned_loss=0.03677, over 4233885.19 frames. ], batch size: 63, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:18:19,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:18:19,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:18:21,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 02:18:22,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:18:22,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:18:24,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:18:26,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. 
Duration: 0.86 2023-10-04 02:18:27,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:18:27,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:18:28,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:18:28,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 02:18:28,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:18:30,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:18:33,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:18:40,018 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.09 vs. limit=15.0 2023-10-04 02:18:43,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:43,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:18:43,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1490466.6666666667, ans=0.125 2023-10-04 02:18:44,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. 
Duration: 0.65 2023-10-04 02:18:46,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 02:18:49,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:18:51,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:53,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:18:56,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:18:56,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:18:58,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 02:18:58,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1490533.3333333333, ans=0.05 2023-10-04 02:18:59,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 02:18:59,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 02:19:01,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:01,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:19:02,407 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.96 vs. limit=12.0 2023-10-04 02:19:02,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:19:02,954 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 02:19:02,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 02:19:04,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:19:06,077 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.944e+02 2.181e+02 2.558e+02 3.848e+02, threshold=4.361e+02, percent-clipped=0.0 2023-10-04 02:19:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:19:07,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 02:19:10,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:19:10,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:19:10,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1490600.0, ans=0.1 2023-10-04 02:19:11,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-10-04 02:19:11,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 02:19:14,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:19:18,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:19:18,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:19:20,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 02:19:22,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:19:24,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 02:19:25,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 02:19:25,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:19:30,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:19:31,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:19:33,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 02:19:34,857 INFO [train.py:1046] (1/4) Epoch 43, batch 500, loss[loss=0.1522, simple_loss=0.2376, pruned_loss=0.03337, over 24673.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.236, pruned_loss=0.03713, over 4342191.75 frames. ], batch size: 65, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:19:34,905 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 02:19:37,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:40,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:19:40,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:41,456 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 02:19:42,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 02:19:43,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:43,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1490733.3333333333, ans=0.1 2023-10-04 02:19:45,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:19:52,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-04 02:19:52,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:19:54,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:19:55,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:55,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:19:56,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1490800.0, ans=0.07 2023-10-04 02:19:59,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1490800.0, ans=0.1 2023-10-04 02:20:04,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:04,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:20:04,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:20:04,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.07 vs. limit=10.0 2023-10-04 02:20:05,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:20:05,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 02:20:05,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:20:08,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:20:10,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:20:11,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:20:11,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:11,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 02:20:15,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 02:20:17,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:20,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:20,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:21,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:20:21,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:20:24,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 02:20:27,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:20:28,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:29,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1490933.3333333333, ans=0.1 2023-10-04 02:20:33,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:36,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:38,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1491000.0, ans=0.04949747468305833 2023-10-04 02:20:41,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:43,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1491000.0, ans=0.1 2023-10-04 02:20:44,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-10-04 02:20:45,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:45,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:47,644 INFO [train.py:1046] (1/4) Epoch 43, batch 550, loss[loss=0.1529, simple_loss=0.2267, pruned_loss=0.0396, over 23723.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2371, pruned_loss=0.03744, over 4421854.85 frames. ], batch size: 149, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:20:47,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 02:20:47,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:20:49,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:52,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 02:20:55,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 02:20:55,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:57,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 02:20:57,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:20:57,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:58,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:58,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:58,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:21:00,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:21:01,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:21:02,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.46 vs. limit=15.0 2023-10-04 02:21:02,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 02:21:02,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:21:08,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:08,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:21:10,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:21:11,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:15,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 02:21:17,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 02:21:18,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:21:23,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:21:23,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:21:25,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:21:27,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:27,997 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 02:21:28,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:30,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-04 02:21:32,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:21:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:21:32,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:21:33,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:35,575 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.973e+02 2.172e+02 2.445e+02 3.955e+02, threshold=4.345e+02, percent-clipped=0.0 2023-10-04 02:21:35,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 02:21:35,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1491266.6666666667, ans=0.125 2023-10-04 02:21:37,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1491266.6666666667, ans=0.0 2023-10-04 02:21:38,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 02:21:38,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:21:38,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:21:38,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 02:21:38,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:21:39,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1491266.6666666667, ans=0.2 2023-10-04 02:21:42,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:21:42,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:21:45,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:21:45,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:46,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 02:21:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:21:50,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:21:51,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:21:52,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:54,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 02:21:54,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 02:22:00,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 02:22:01,639 INFO [train.py:1046] (1/4) Epoch 43, batch 600, loss[loss=0.1375, simple_loss=0.2047, pruned_loss=0.03511, over 22672.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2381, pruned_loss=0.03792, over 4486962.15 frames. ], batch size: 322, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:22:02,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1491400.0, ans=0.125 2023-10-04 02:22:03,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 02:22:03,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:22:03,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:22:05,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:06,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1491400.0, ans=0.2 2023-10-04 02:22:10,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:22:13,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 02:22:15,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 02:22:17,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:22:19,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:22:20,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1491466.6666666667, ans=0.125 2023-10-04 02:22:21,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:23,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 02:22:23,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:22:28,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1491466.6666666667, ans=0.125 2023-10-04 02:22:29,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 02:22:32,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:22:32,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:32,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 02:22:37,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:22:38,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:22:38,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:40,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1491533.3333333333, ans=0.2 2023-10-04 02:22:44,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:22:48,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:48,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:22:48,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:50,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1491600.0, ans=0.035 2023-10-04 02:22:52,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1491600.0, ans=0.2 2023-10-04 02:22:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 02:23:00,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 02:23:00,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:23:04,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1491666.6666666667, ans=0.125 2023-10-04 02:23:05,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 02:23:06,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:23:08,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 02:23:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:23:09,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:23:14,845 INFO [train.py:1046] (1/4) Epoch 43, batch 650, loss[loss=0.1566, simple_loss=0.2094, pruned_loss=0.05187, over 19316.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2368, pruned_loss=0.03765, over 4520451.40 frames. ], batch size: 389, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:23:14,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 02:23:15,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1491733.3333333333, ans=0.0 2023-10-04 02:23:16,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 02:23:19,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:23:20,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:23:22,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-10-04 02:23:23,118 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=15.0 2023-10-04 02:23:23,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:23,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1491733.3333333333, ans=0.125 2023-10-04 02:23:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 02:23:26,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:23:31,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:23:31,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 02:23:33,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1491800.0, ans=0.125 2023-10-04 02:23:34,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:37,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 02:23:38,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:23:40,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:23:41,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:23:41,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:23:41,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1491800.0, ans=0.0 2023-10-04 02:23:44,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:44,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:23:45,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:23:47,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1491866.6666666667, ans=0.2 2023-10-04 02:23:48,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:23:48,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1491866.6666666667, ans=0.125 2023-10-04 02:23:48,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1491866.6666666667, ans=0.125 2023-10-04 02:23:51,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:23:51,323 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 02:23:51,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:51,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:23:52,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1491866.6666666667, ans=0.1 2023-10-04 02:23:54,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:55,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:23:55,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:23:57,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:23:57,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 02:23:59,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:23:59,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:24:01,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:24:01,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:24:02,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:24:03,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.962e+02 2.234e+02 2.555e+02 3.806e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 02:24:04,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 02:24:04,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 02:24:05,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:05,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:24:05,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:24:05,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:24:08,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:24:11,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1491933.3333333333, ans=0.1 2023-10-04 02:24:15,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:15,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:24:16,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:24:17,100 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:24:18,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1492000.0, ans=0.0 2023-10-04 02:24:19,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:24:20,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:24:21,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:24:21,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1492000.0, ans=0.0 2023-10-04 02:24:27,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:24:27,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:24:27,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1492066.6666666667, ans=0.0 2023-10-04 02:24:28,404 INFO [train.py:1046] (1/4) Epoch 43, batch 700, loss[loss=0.1583, simple_loss=0.243, pruned_loss=0.03677, over 23393.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2356, pruned_loss=0.03717, over 4563026.54 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:24:28,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:24:28,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:24:33,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 02:24:33,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 02:24:36,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 02:24:37,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:24:38,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:24:40,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 02:24:45,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:24:48,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:24:49,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:24:51,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:24:54,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:55,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 02:24:55,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:24:59,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 02:25:04,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-04 02:25:05,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1492200.0, ans=0.125 2023-10-04 02:25:06,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:25:06,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:25:08,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:25:11,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:25:11,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 02:25:15,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:15,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:25:15,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 02:25:18,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:25:20,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:21,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:25:21,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1492266.6666666667, ans=0.125 2023-10-04 02:25:25,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1492333.3333333333, ans=0.2 2023-10-04 02:25:28,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:25:28,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 02:25:32,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 02:25:34,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 02:25:35,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:37,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:25:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:25:39,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:39,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 02:25:42,704 INFO [train.py:1046] (1/4) Epoch 43, batch 750, loss[loss=0.1548, simple_loss=0.2417, pruned_loss=0.03398, over 24446.00 frames. 
], tot_loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.03719, over 4589115.54 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:25:42,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 02:25:44,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 02:25:44,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 02:25:44,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 02:25:45,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 02:25:45,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:25:48,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 02:25:48,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:50,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:25:51,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:25:54,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:25:54,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:25:54,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:25:56,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:25:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:25:58,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1492466.6666666667, ans=0.2 2023-10-04 02:26:00,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:26:01,105 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.38 vs. limit=15.0 2023-10-04 02:26:03,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:26:03,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:26:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 02:26:05,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 02:26:05,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:26:07,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:26:09,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:26:09,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 02:26:09,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:26:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 02:26:12,575 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 02:26:12,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 02:26:12,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:26:13,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:26:15,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:26:22,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 02:26:23,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:23,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:26:25,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:26:26,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:26:28,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 02:26:28,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:26:31,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 02:26:31,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:26:32,460 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.914e+02 2.119e+02 2.380e+02 3.754e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-04 02:26:33,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.26 vs. limit=22.5 2023-10-04 02:26:35,033 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.60 vs. 
limit=15.0 2023-10-04 02:26:35,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:26:35,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 02:26:35,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:41,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:26:41,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:26:43,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:26:44,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:26:44,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1492666.6666666667, ans=0.125 2023-10-04 02:26:48,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 02:26:48,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:26:50,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:26:52,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 02:26:52,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:26:52,791 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.66 vs. limit=10.0 2023-10-04 02:26:53,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1492666.6666666667, ans=0.125 2023-10-04 02:26:53,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1492666.6666666667, ans=0.1 2023-10-04 02:26:56,304 INFO [train.py:1046] (1/4) Epoch 43, batch 800, loss[loss=0.1722, simple_loss=0.2568, pruned_loss=0.0438, over 24037.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2351, pruned_loss=0.03764, over 4623704.43 frames. ], batch size: 80, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:26:56,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:56,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:26:59,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.48 vs. limit=15.0 2023-10-04 02:27:05,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:27:05,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:27:07,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:27:07,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:27:08,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:08,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:08,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:13,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:13,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1492800.0, ans=0.125 2023-10-04 02:27:14,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:27:17,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 02:27:18,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:18,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:27:19,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:27:19,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:27:19,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 02:27:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:21,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 02:27:24,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:27,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:27,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1492866.6666666667, ans=0.035 2023-10-04 02:27:28,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:27:28,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:27:33,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:33,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:27:37,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:27:39,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:27:39,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 02:27:39,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 02:27:40,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 02:27:40,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:27:40,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:27:43,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:43,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:27:46,750 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 02:27:48,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 02:27:49,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 02:27:50,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:27:54,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:27:58,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:59,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 02:27:59,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:28:02,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 02:28:05,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1493000.0, ans=0.0 2023-10-04 02:28:08,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:28:10,104 INFO [train.py:1046] (1/4) Epoch 43, batch 850, loss[loss=0.138, simple_loss=0.216, pruned_loss=0.03002, over 24581.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2357, pruned_loss=0.03728, over 4654029.85 frames. ], batch size: 60, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:28:11,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:28:13,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-10-04 02:28:13,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:28:16,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:28:16,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 02:28:16,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:16,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1493066.6666666667, ans=0.0 2023-10-04 02:28:18,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:28:19,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1493066.6666666667, ans=0.1 2023-10-04 02:28:20,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:21,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:28:23,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:28:23,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 02:28:24,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. 
Duration: 0.96 2023-10-04 02:28:24,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 02:28:26,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:28:26,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:28:27,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:27,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:28:29,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:28:32,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:33,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:28:33,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 02:28:36,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 02:28:38,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:39,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-04 02:28:42,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 02:28:44,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 02:28:46,292 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 02:28:47,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:28:47,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:28:47,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:28:50,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:50,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:51,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 02:28:54,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:28:56,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:28:56,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 02:28:56,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:28:57,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:28:59,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:29:00,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.958e+02 2.154e+02 2.504e+02 4.006e+02, threshold=4.308e+02, percent-clipped=0.0 2023-10-04 02:29:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 02:29:03,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.whiten.whitening_limit, batch_count=1493266.6666666667, ans=12.0 2023-10-04 02:29:05,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:29:05,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:29:06,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:29:06,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:29:07,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:29:13,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:29:14,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:29:16,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:29:16,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:29:19,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1493333.3333333333, ans=0.0 2023-10-04 02:29:24,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:29:24,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:29:26,257 INFO [train.py:1046] (1/4) Epoch 43, batch 900, loss[loss=0.1452, simple_loss=0.2269, pruned_loss=0.03177, over 24491.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2365, pruned_loss=0.0376, over 4670339.11 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:29:26,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 02:29:27,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 02:29:27,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:29:27,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 02:29:28,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1493400.0, ans=0.0 2023-10-04 02:29:32,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:29:37,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:37,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 02:29:39,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:29:40,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 02:29:41,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 02:29:43,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:29:43,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:29:43,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 02:29:43,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:29:52,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:29:52,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:53,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:29:53,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1493466.6666666667, ans=0.125 2023-10-04 02:29:56,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:30:02,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 02:30:03,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:30:06,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:30:07,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:30:08,336 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 02:30:09,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-10-04 02:30:09,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1493600.0, ans=0.125 2023-10-04 02:30:10,553 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=7.66 vs. limit=22.5 2023-10-04 02:30:11,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1493600.0, ans=0.0 2023-10-04 02:30:13,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1493600.0, ans=0.025 2023-10-04 02:30:14,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:30:14,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:30:16,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:30:16,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1493600.0, ans=0.0 2023-10-04 02:30:23,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:23,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:30:24,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-10-04 02:30:24,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:30:27,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 02:30:30,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:30:30,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:33,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:30:33,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:30:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 02:30:36,291 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 02:30:37,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:30:37,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 02:30:40,919 INFO [train.py:1046] (1/4) Epoch 43, batch 950, loss[loss=0.1659, simple_loss=0.2524, pruned_loss=0.03965, over 23723.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.237, pruned_loss=0.03784, over 4690997.05 frames. 
], batch size: 85, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:30:42,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:43,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 02:30:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:30:51,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:51,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:53,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:30:54,786 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 02:30:57,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:30:59,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:30:59,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:30:59,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. 
Duration: 0.94 2023-10-04 02:31:00,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:31:02,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:03,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 02:31:04,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:31:09,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:09,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:31:09,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:31:11,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 02:31:13,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:31:15,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:31:17,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:31:21,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:31:21,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:31:24,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1493933.3333333333, ans=0.95 2023-10-04 02:31:26,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 02:31:26,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1493933.3333333333, ans=0.0 2023-10-04 02:31:27,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 02:31:27,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:31:28,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:31:30,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:30,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:31:32,264 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 1.990e+02 2.144e+02 2.470e+02 4.825e+02, threshold=4.288e+02, percent-clipped=1.0 2023-10-04 02:31:33,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 02:31:33,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 02:31:36,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:31:36,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:36,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 02:31:36,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:31:36,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:31:36,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 02:31:37,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1493933.3333333333, ans=0.125 2023-10-04 02:31:42,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:31:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:31:50,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:31:52,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 02:31:52,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-10-04 02:31:52,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1494000.0, ans=0.0 2023-10-04 02:31:55,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:56,474 INFO [train.py:1046] (1/4) Epoch 43, batch 1000, loss[loss=0.1505, simple_loss=0.2308, pruned_loss=0.0351, over 23579.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2366, pruned_loss=0.03822, over 4690467.33 frames. ], batch size: 149, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:31:59,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 02:31:59,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1494066.6666666667, ans=0.125 2023-10-04 02:32:00,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:04,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:32:05,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 02:32:05,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 02:32:10,130 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:32:11,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:32:11,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:32:11,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1494133.3333333333, ans=0.0 2023-10-04 02:32:12,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:14,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 02:32:19,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 02:32:20,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 02:32:20,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:32:22,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 02:32:24,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 02:32:24,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 02:32:26,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:27,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:32:33,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:35,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:32:35,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:36,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 02:32:36,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:32:38,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:32:38,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:38,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1494200.0, ans=0.07 2023-10-04 02:32:39,459 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 02:32:42,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 02:32:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. 
Duration: 0.86 2023-10-04 02:32:46,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 02:32:48,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:32:54,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:54,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:32:56,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:57,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:32:59,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 02:32:59,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:32:59,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 02:33:00,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 02:33:02,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:33:02,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:33:04,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:33:08,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:33:09,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:33:10,795 INFO [train.py:1046] (1/4) Epoch 43, batch 1050, loss[loss=0.1585, simple_loss=0.2433, pruned_loss=0.03683, over 23319.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2364, pruned_loss=0.03763, over 4711480.85 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:33:13,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:33:15,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:33:16,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:33:18,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:33:20,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:33:22,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:33:22,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 02:33:25,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:33:25,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1494466.6666666667, ans=0.5 2023-10-04 02:33:26,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:33:26,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:33:28,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:33:28,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 02:33:28,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1494466.6666666667, ans=0.125 2023-10-04 02:33:29,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:33:29,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 02:33:32,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:33:32,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-10-04 02:33:32,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:33:37,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1494466.6666666667, ans=0.125 2023-10-04 02:33:37,868 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.77 vs. limit=10.0 2023-10-04 02:33:39,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:33:39,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:33:40,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:33:42,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 02:33:42,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 02:33:42,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:33:45,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 02:33:50,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 02:33:50,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:33:52,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1494533.3333333333, ans=0.1 2023-10-04 02:33:52,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.23 vs. limit=22.5 2023-10-04 02:33:53,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:33:55,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 02:33:55,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:33:56,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:33:59,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:34:01,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1494600.0, ans=0.1 2023-10-04 02:34:02,038 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.937e+02 2.146e+02 2.350e+02 6.827e+02, threshold=4.291e+02, percent-clipped=1.0 2023-10-04 02:34:03,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 02:34:04,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-10-04 02:34:06,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 02:34:06,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:34:06,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:34:08,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 02:34:12,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:34:12,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:34:14,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:34:14,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:34:14,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:34:20,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:34:20,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 02:34:22,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 02:34:22,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 02:34:22,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 02:34:22,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:34:23,321 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.99 vs. limit=15.0 2023-10-04 02:34:24,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1494733.3333333333, ans=0.2 2023-10-04 02:34:25,355 INFO [train.py:1046] (1/4) Epoch 43, batch 1100, loss[loss=0.1588, simple_loss=0.2485, pruned_loss=0.03454, over 24477.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.236, pruned_loss=0.03749, over 4714637.96 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:34:26,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:34:32,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:34:36,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:34:37,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:34:37,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:34:37,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 02:34:39,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:34:41,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:34:43,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:34:44,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1494800.0, ans=0.125 2023-10-04 02:34:45,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:34:47,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 02:34:47,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:34:48,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:34:48,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:34:50,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 02:34:50,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1494800.0, ans=0.125 2023-10-04 02:34:53,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:34:59,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:35:01,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 02:35:01,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1494866.6666666667, ans=0.95 2023-10-04 02:35:02,430 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 02:35:02,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:05,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:35:05,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:35:06,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 02:35:06,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 02:35:06,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:35:06,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:35:06,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1494866.6666666667, ans=0.0 2023-10-04 02:35:08,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:08,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 02:35:08,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1494933.3333333333, ans=0.125 2023-10-04 02:35:10,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1494933.3333333333, ans=0.0 2023-10-04 02:35:14,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:35:15,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 02:35:15,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:35:22,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:35:25,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. 
Duration: 0.99 2023-10-04 02:35:25,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:35:25,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:27,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:35:27,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1495000.0, ans=0.035 2023-10-04 02:35:27,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1495000.0, ans=0.125 2023-10-04 02:35:28,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:35:29,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 02:35:29,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:35:31,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:35:31,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 02:35:32,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:35:32,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. 
Duration: 0.86 2023-10-04 02:35:33,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:35:33,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:35:35,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:35:38,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1495066.6666666667, ans=0.125 2023-10-04 02:35:39,338 INFO [train.py:1046] (1/4) Epoch 43, batch 1150, loss[loss=0.166, simple_loss=0.2541, pruned_loss=0.03896, over 23995.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.236, pruned_loss=0.03767, over 4713406.18 frames. ], batch size: 80, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:35:41,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:35:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:35:44,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:35:45,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:35:45,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 02:35:47,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:35:49,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 02:35:51,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:35:51,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:35:59,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 02:36:01,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:36:03,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:36:03,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1495133.3333333333, ans=0.0 2023-10-04 02:36:04,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:04,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 02:36:04,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:36:06,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:36:10,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-10-04 02:36:11,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:36:12,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:36:22,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:28,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:28,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 02:36:30,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:30,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:31,604 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.978e+02 2.214e+02 2.534e+02 4.016e+02, threshold=4.429e+02, percent-clipped=0.0 2023-10-04 02:36:36,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 02:36:37,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1495333.3333333333, ans=0.1 2023-10-04 02:36:38,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:44,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-10-04 02:36:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:36:52,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:36:52,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:36:52,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:36:53,398 INFO [train.py:1046] (1/4) Epoch 43, batch 1200, loss[loss=0.1583, simple_loss=0.2435, pruned_loss=0.03653, over 24680.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2362, pruned_loss=0.03775, over 4711705.09 frames. ], batch size: 65, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:36:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:36:56,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1495400.0, ans=0.125 2023-10-04 02:36:58,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1495400.0, ans=0.07 2023-10-04 02:37:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:37:02,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:37:03,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:37:03,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:03,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:37:03,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1495400.0, ans=0.0 2023-10-04 02:37:04,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-10-04 02:37:04,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:37:05,654 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.64 vs. limit=15.0 2023-10-04 02:37:06,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:37:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:07,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:37:11,482 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 02:37:12,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 02:37:17,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 02:37:19,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:37:20,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:20,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1495466.6666666667, ans=0.125 2023-10-04 02:37:23,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:37:23,405 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 02:37:23,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1495533.3333333333, ans=0.125 2023-10-04 02:37:25,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:32,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:37:32,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:37:32,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 02:37:33,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:37:36,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. 
Duration: 0.72 2023-10-04 02:37:40,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 02:37:40,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:42,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:37:43,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:37:45,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:37:45,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:46,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:37:46,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:37:46,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 02:37:47,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:37:47,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:37:47,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 02:37:51,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:51,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:37:52,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:37:54,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1495666.6666666667, ans=0.125 2023-10-04 02:37:55,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:37:57,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1495666.6666666667, ans=0.125 2023-10-04 02:37:58,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 02:37:58,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1495666.6666666667, ans=0.2 2023-10-04 02:38:01,794 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 02:38:03,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:38:06,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:38:07,764 INFO [train.py:1046] (1/4) Epoch 43, batch 1250, loss[loss=0.177, simple_loss=0.2502, pruned_loss=0.05191, over 23438.00 frames. 
], tot_loss[loss=0.1571, simple_loss=0.2373, pruned_loss=0.03843, over 4708808.02 frames. ], batch size: 285, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:38:07,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:38:09,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1495733.3333333333, ans=0.95 2023-10-04 02:38:10,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:38:10,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 02:38:14,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:38:16,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:17,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 02:38:19,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:38:20,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:38:24,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:38:26,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 02:38:28,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:38:28,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:38:32,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:38:34,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 02:38:34,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:38:34,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:38:38,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:38:38,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:41,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:42,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:38:46,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 02:38:46,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 02:38:49,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:38:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 02:38:52,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:52,700 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 02:38:52,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:52,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:55,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:58,747 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.956e+02 2.157e+02 2.335e+02 3.543e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-04 02:38:58,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:58,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:38:59,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1495933.3333333333, ans=0.1 2023-10-04 02:39:00,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-10-04 02:39:00,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 02:39:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 02:39:03,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:04,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 02:39:04,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:39:08,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 02:39:08,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:39:11,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 02:39:11,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:39:11,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.58 vs. limit=15.0 2023-10-04 02:39:12,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:39:12,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-10-04 02:39:12,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:39:15,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 02:39:16,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:39:16,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1496000.0, ans=0.1 2023-10-04 02:39:18,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:39:18,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:39:20,706 INFO [train.py:1046] (1/4) Epoch 43, batch 1300, loss[loss=0.1484, simple_loss=0.2414, pruned_loss=0.02768, over 24326.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2374, pruned_loss=0.03832, over 4718395.87 frames. ], batch size: 74, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:39:20,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:39:25,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:39:26,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 02:39:30,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:39:32,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:39:32,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:39:33,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:39:35,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:39:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 02:39:41,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:39:42,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:39:43,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 02:39:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:39:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:39:51,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:39:52,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:39:54,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:39:55,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:39:55,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:39:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 02:40:00,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1496200.0, ans=0.0 2023-10-04 02:40:02,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:40:02,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:40:04,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 02:40:06,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:40:07,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:40:09,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:40:09,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. 
Duration: 0.96 2023-10-04 02:40:11,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:40:11,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 02:40:12,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:40:12,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1496266.6666666667, ans=10.0 2023-10-04 02:40:17,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:40:17,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:40:20,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 02:40:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 02:40:23,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 02:40:28,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:40:30,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 02:40:30,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:40:30,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=9.66 vs. limit=12.0 2023-10-04 02:40:36,454 INFO [train.py:1046] (1/4) Epoch 43, batch 1350, loss[loss=0.1581, simple_loss=0.2355, pruned_loss=0.04038, over 23326.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03792, over 4713783.02 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:40:37,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 02:40:40,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:40:42,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:40:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:40:45,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:40:46,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:40:46,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:40:51,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:40:52,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. 
Duration: 0.79 2023-10-04 02:40:53,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:40:55,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:40:56,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_na.min_abs, batch_count=1496466.6666666667, ans=0.02 2023-10-04 02:40:57,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 02:40:58,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:41:01,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:41:01,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 02:41:03,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 02:41:06,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 02:41:06,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:07,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-04 02:41:07,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1496533.3333333333, ans=0.125 2023-10-04 02:41:10,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1496533.3333333333, ans=0.125 2023-10-04 02:41:17,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1496533.3333333333, ans=0.125 2023-10-04 02:41:18,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:24,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1496600.0, ans=0.5 2023-10-04 02:41:27,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:28,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:29,142 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.909e+02 2.129e+02 2.419e+02 3.786e+02, threshold=4.258e+02, percent-clipped=0.0 2023-10-04 02:41:29,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 02:41:32,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.02 vs. limit=15.0 2023-10-04 02:41:32,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:41:32,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 02:41:32,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:41:32,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:41:35,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:41:37,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 02:41:38,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:41:39,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1496666.6666666667, ans=0.125 2023-10-04 02:41:43,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1496666.6666666667, ans=0.125 2023-10-04 02:41:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 02:41:47,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 02:41:50,249 INFO [train.py:1046] (1/4) Epoch 43, batch 1400, loss[loss=0.1513, simple_loss=0.24, pruned_loss=0.03136, over 24302.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2344, pruned_loss=0.03755, over 4709823.54 frames. 
], batch size: 74, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:41:53,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 02:41:54,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:56,703 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:41:57,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:41:57,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:42:02,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 02:42:03,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 02:42:05,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1496800.0, ans=0.125 2023-10-04 02:42:14,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:42:14,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1496800.0, ans=0.125 2023-10-04 02:42:15,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:42:18,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 02:42:18,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:42:24,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:42:24,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 02:42:29,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1496866.6666666667, ans=0.125 2023-10-04 02:42:29,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1496866.6666666667, ans=0.1 2023-10-04 02:42:32,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:32,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:36,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 02:42:37,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:42:37,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:42:37,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:42:39,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 02:42:40,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:42:40,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:42:42,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:42:43,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 02:42:43,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:42:49,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:50,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1497000.0, ans=0.125 2023-10-04 02:42:52,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:42:58,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 02:42:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:43:00,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:43:00,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1497000.0, ans=0.125 2023-10-04 02:43:03,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 02:43:03,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:04,351 INFO [train.py:1046] (1/4) Epoch 43, batch 1450, loss[loss=0.1381, simple_loss=0.1915, pruned_loss=0.0424, over 19175.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2328, pruned_loss=0.03741, over 4692809.33 frames. ], batch size: 388, lr: 2.37e-03, grad_scale: 4.0 2023-10-04 02:43:05,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:43:08,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:43:10,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:43:10,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:10,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 02:43:10,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1497066.6666666667, ans=0.125 2023-10-04 02:43:14,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:43:14,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:43:16,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:43:16,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 02:43:18,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:43:18,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 02:43:18,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:18,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1497133.3333333333, ans=0.05 2023-10-04 02:43:19,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:19,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 02:43:21,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:43:22,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:43:22,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-04 02:43:22,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:24,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:43:25,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:28,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:31,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:43:31,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:43:34,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:34,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:36,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:36,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:43:36,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:43:36,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:43:41,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 02:43:43,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:43:46,685 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 02:43:47,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:43:49,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:43:51,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:43:52,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 02:43:56,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:43:57,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 02:43:59,139 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.025e+02 2.199e+02 2.536e+02 7.667e+02, threshold=4.399e+02, percent-clipped=1.0 2023-10-04 02:44:00,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. 
Duration: 0.9 2023-10-04 02:44:02,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:05,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:44:06,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:44:07,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.63 vs. limit=15.0 2023-10-04 02:44:08,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 02:44:09,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 02:44:10,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 02:44:12,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:12,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:44:18,498 INFO [train.py:1046] (1/4) Epoch 43, batch 1500, loss[loss=0.1509, simple_loss=0.2414, pruned_loss=0.03025, over 24653.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2337, pruned_loss=0.03724, over 4704276.72 frames. ], batch size: 73, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:44:23,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. 
Duration: 0.77 2023-10-04 02:44:23,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:44:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:44:24,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:26,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:44:26,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:44:27,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 02:44:29,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:44:29,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:44:29,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:44:30,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:44:32,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:44:33,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:44:40,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:44:40,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 02:44:41,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:44:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:44:43,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:47,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 02:44:50,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 02:44:51,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:51,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 02:44:53,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:44:56,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:44:57,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:44:57,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:44:59,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 02:44:59,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:44:59,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:45:00,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 02:45:02,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:45:06,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:45:06,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 02:45:11,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:45:12,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:45:16,922 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 02:45:18,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:45:18,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 02:45:18,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1497666.6666666667, ans=0.0 2023-10-04 02:45:19,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:21,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:45:21,144 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 02:45:22,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:45:27,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 02:45:28,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:31,671 INFO [train.py:1046] (1/4) Epoch 43, batch 1550, loss[loss=0.1534, simple_loss=0.2259, pruned_loss=0.04039, over 23848.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03705, over 4716921.14 frames. ], batch size: 195, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:45:31,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:45:31,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:45:33,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:45:33,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:34,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:45:36,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 02:45:36,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 02:45:36,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:45:38,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 02:45:39,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 02:45:40,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:45:42,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:42,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:45:42,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 02:45:43,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:43,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:46,376 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 02:45:48,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:48,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:45:48,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1497800.0, ans=0.0 2023-10-04 02:45:49,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:45:51,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:45:51,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1497800.0, ans=0.125 2023-10-04 02:45:52,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 02:45:53,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:45:53,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-04 02:45:54,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 02:45:54,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 02:45:55,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:55,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:45:59,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1497800.0, ans=0.0 2023-10-04 02:46:00,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:46:02,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 02:46:02,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 02:46:06,329 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:46:10,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:13,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:46:13,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 02:46:13,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:46:14,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 02:46:19,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:46:22,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:23,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:46:26,650 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.915e+02 2.072e+02 2.347e+02 3.023e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-04 02:46:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:46:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:28,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 02:46:28,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:46:29,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:46:29,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:46:31,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 02:46:31,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 02:46:31,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1498000.0, ans=0.0 2023-10-04 02:46:34,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:46:39,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 02:46:43,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:46:44,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:44,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 02:46:45,976 INFO [train.py:1046] (1/4) Epoch 43, batch 1600, loss[loss=0.1483, simple_loss=0.222, pruned_loss=0.03733, over 23713.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2353, pruned_loss=0.03763, over 4699038.46 frames. ], batch size: 149, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:46:47,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:46:49,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:46:49,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:46:49,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:46:50,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:46:54,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:46:55,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 02:46:56,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 02:46:57,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 02:46:59,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:47:01,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 02:47:02,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:47:04,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:47:05,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1498133.3333333333, ans=0.2 2023-10-04 02:47:09,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:47:11,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 02:47:14,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:47:16,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 02:47:16,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:16,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 02:47:19,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1498200.0, ans=0.1 2023-10-04 02:47:22,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 02:47:29,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:47:29,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 02:47:31,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:47:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:47:31,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:47:32,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 02:47:37,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 02:47:39,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:47:39,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:40,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:40,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:47:42,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:47:43,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:47:44,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:47:52,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 02:47:52,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:47:53,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1498333.3333333333, ans=0.2 2023-10-04 02:47:55,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 02:47:55,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:47:56,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 02:47:57,250 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.47 vs. limit=22.5 2023-10-04 02:48:00,928 INFO [train.py:1046] (1/4) Epoch 43, batch 1650, loss[loss=0.1556, simple_loss=0.2472, pruned_loss=0.03198, over 24525.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2359, pruned_loss=0.03759, over 4710925.72 frames. ], batch size: 71, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:48:03,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:05,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:48:05,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:48:05,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-10-04 02:48:05,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 02:48:05,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 02:48:05,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 02:48:06,534 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.16 vs. limit=15.0 2023-10-04 02:48:09,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:48:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:48:09,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:48:09,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:48:14,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:16,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 02:48:17,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:48:19,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:48:19,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:48:19,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:48:19,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 02:48:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 02:48:20,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1498466.6666666667, ans=0.125 2023-10-04 02:48:26,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:48:27,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:48:35,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 02:48:35,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:38,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 02:48:41,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:48:43,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:48:43,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:48:43,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:48:43,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-10-04 02:48:45,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:48:45,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:49,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:49,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:50,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:48:50,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:48:50,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:48:51,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 02:48:54,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:48:55,938 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.993e+02 2.180e+02 2.514e+02 3.925e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-04 02:48:56,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 02:48:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:48:58,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 02:49:00,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 02:49:00,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 02:49:00,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:02,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:49:02,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:49:03,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:49:03,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. 
Duration: 0.97 2023-10-04 02:49:06,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:49:08,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:49:08,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:49:11,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 02:49:14,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:49:14,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:49:14,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 02:49:15,718 INFO [train.py:1046] (1/4) Epoch 43, batch 1700, loss[loss=0.1549, simple_loss=0.2253, pruned_loss=0.04225, over 23656.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.03743, over 4718692.94 frames. ], batch size: 232, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:49:15,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:49:15,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:49:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:49:20,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:49:20,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:49:21,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 02:49:23,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:49:28,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1498800.0, ans=0.125 2023-10-04 02:49:31,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:49:34,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:49:41,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:49:41,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:49:41,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:49:41,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:49:42,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1498800.0, ans=0.1 2023-10-04 02:49:44,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 02:49:46,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:49:46,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:48,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:49:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:49:52,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 02:49:52,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 02:49:54,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:55,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 02:49:56,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 02:50:03,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1498933.3333333333, ans=0.125 2023-10-04 02:50:04,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:04,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:05,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:50:06,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:50:06,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 02:50:07,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:50:10,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:10,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 02:50:11,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:50:11,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 02:50:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:11,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:13,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:13,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:50:14,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:16,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:50:16,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:18,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.05 vs. limit=6.0 2023-10-04 02:50:20,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:50:21,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 02:50:24,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 02:50:24,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:50:26,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 02:50:29,690 INFO [train.py:1046] (1/4) Epoch 43, batch 1750, loss[loss=0.1633, simple_loss=0.2497, pruned_loss=0.03842, over 24422.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2337, pruned_loss=0.03714, over 4716354.65 frames. ], batch size: 77, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:50:30,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1499066.6666666667, ans=0.125 2023-10-04 02:50:31,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:34,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:34,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:50:35,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 02:50:36,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:36,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1499066.6666666667, ans=0.125 2023-10-04 02:50:39,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 02:50:39,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:39,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1499066.6666666667, ans=0.07 2023-10-04 02:50:43,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 02:50:45,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:48,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 02:50:48,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:50,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:50:53,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 02:50:53,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 02:50:56,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:50:56,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 02:51:04,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 02:51:07,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:07,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:51:11,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.23 vs. limit=15.0 2023-10-04 02:51:12,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:12,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:51:14,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:51:16,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:17,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1499266.6666666667, ans=0.1 2023-10-04 02:51:19,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:51:20,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:51:20,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-10-04 02:51:22,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:51:24,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 02:51:24,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:51:26,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.43 vs. limit=15.0 2023-10-04 02:51:26,924 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 2.005e+02 2.231e+02 2.661e+02 3.753e+02, threshold=4.462e+02, percent-clipped=0.0 2023-10-04 02:51:27,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:51:27,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:51:29,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:51:29,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:51:31,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 02:51:32,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1499333.3333333333, ans=0.125 2023-10-04 02:51:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:51:36,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:51:38,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:51:40,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:51:41,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 02:51:41,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:43,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:51:43,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:51:43,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:51:43,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 02:51:44,687 INFO [train.py:1046] (1/4) Epoch 43, batch 1800, loss[loss=0.1564, simple_loss=0.2439, pruned_loss=0.03449, over 24435.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2336, pruned_loss=0.03702, over 4727688.59 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:51:44,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:51:49,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:51:49,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:51,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:51:54,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:55,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 02:51:55,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1499400.0, ans=0.2 2023-10-04 02:51:56,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:51:59,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:52:01,339 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:52:02,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:03,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:03,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:52:06,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.34 vs. limit=15.0 2023-10-04 02:52:07,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:52:07,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 02:52:08,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:11,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:15,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 02:52:18,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 02:52:18,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. 
Duration: 0.76 2023-10-04 02:52:18,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:19,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:19,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:52:19,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:52:19,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1499533.3333333333, ans=0.125 2023-10-04 02:52:24,860 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.05 vs. limit=12.0 2023-10-04 02:52:25,530 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 02:52:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:52:28,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:29,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 02:52:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 02:52:31,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 02:52:32,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:52:33,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:52:38,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 02:52:44,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:52:45,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 02:52:45,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:52:45,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:45,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:52:45,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 02:52:50,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:52:50,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:52:53,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-04 02:52:53,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:56,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:52:56,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:52:56,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:57,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:57,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:52:58,936 INFO [train.py:1046] (1/4) Epoch 43, batch 1850, loss[loss=0.1587, simple_loss=0.2476, pruned_loss=0.03489, over 24521.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2337, pruned_loss=0.03672, over 4722146.93 frames. ], batch size: 71, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:53:00,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:53:00,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:53:03,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:53:04,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 02:53:10,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:53:10,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 02:53:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 02:53:13,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1499800.0, ans=0.0 2023-10-04 02:53:16,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1499800.0, ans=0.125 2023-10-04 02:53:17,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 02:53:21,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:53:21,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 02:53:21,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 02:53:31,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1499866.6666666667, ans=0.125 2023-10-04 02:53:32,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:53:33,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. 
Duration: 0.98 2023-10-04 02:53:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:53:37,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:53:41,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 02:53:41,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:53:41,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:53:42,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:53:45,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:53:46,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:53:50,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:53:51,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:53:51,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 02:53:51,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:53:52,675 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.89 vs. limit=15.0 2023-10-04 02:53:53,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:53:55,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:53:56,296 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.697e+02 1.939e+02 2.109e+02 2.438e+02 4.084e+02, threshold=4.217e+02, percent-clipped=0.0 2023-10-04 02:53:58,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 02:53:59,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:54:03,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:54:03,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:54:03,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 02:54:03,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 02:54:04,953 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 02:54:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-10-04 02:54:09,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:54:09,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:54:09,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:54:09,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:10,371 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 02:54:10,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:54:10,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:11,669 INFO [train.py:1046] (1/4) Epoch 43, batch 1900, loss[loss=0.1651, simple_loss=0.2407, pruned_loss=0.04477, over 23834.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2343, pruned_loss=0.03667, over 4736726.15 frames. ], batch size: 212, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:54:12,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:54:13,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:54:15,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:54:15,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 02:54:15,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1500066.6666666667, ans=0.0 2023-10-04 02:54:17,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:17,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 02:54:17,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:54:18,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.95 vs. limit=15.0 2023-10-04 02:54:19,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:54:24,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:54:27,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:54:27,794 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 02:54:29,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 02:54:30,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 02:54:31,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:54:31,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 02:54:31,807 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 02:54:36,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 02:54:36,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1500133.3333333333, ans=0.125 2023-10-04 02:54:37,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:54:40,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 02:54:41,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 02:54:51,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 02:54:53,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 02:54:53,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:55,521 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 02:54:55,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-10-04 02:54:55,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 02:54:55,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 02:54:55,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:01,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 02:55:02,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:55:05,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:55:05,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 02:55:07,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:55:11,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 02:55:11,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:55:18,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:55:18,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 02:55:20,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:55:20,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:55:21,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:55:21,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 02:55:23,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:55:25,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:55:25,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:55:26,726 INFO [train.py:1046] (1/4) Epoch 43, batch 1950, loss[loss=0.1409, simple_loss=0.2275, pruned_loss=0.02718, over 24502.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.0366, over 4732343.40 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:55:28,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:55:29,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:55:29,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 02:55:31,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:55:32,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:55:35,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:55:35,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:35,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:55:38,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 02:55:38,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:55:39,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:40,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:42,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:55:42,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:55:42,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 02:55:45,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:55:48,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:55:48,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:55:48,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:55:48,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:48,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1500466.6666666667, ans=0.0 2023-10-04 02:55:53,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:54,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1500533.3333333333, ans=0.0 2023-10-04 02:55:56,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:55:56,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:55:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:55:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-10-04 02:55:56,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1500533.3333333333, ans=0.125 2023-10-04 02:55:57,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:55:57,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:55:58,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:02,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:56:05,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:56:08,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1500533.3333333333, ans=0.1 2023-10-04 02:56:09,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:56:12,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:56:12,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:56:12,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-10-04 02:56:12,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:56:16,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:56:16,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1500600.0, ans=0.1 2023-10-04 02:56:17,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:56:17,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:56:20,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1500600.0, ans=0.125 2023-10-04 02:56:25,116 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.031e+02 2.275e+02 2.589e+02 3.753e+02, threshold=4.549e+02, percent-clipped=0.0 2023-10-04 02:56:25,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:25,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:27,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:30,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 02:56:32,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:56:33,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:34,501 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.17 vs. limit=22.5 2023-10-04 02:56:35,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 02:56:35,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:56:35,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:56:36,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 02:56:38,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1500666.6666666667, ans=0.125 2023-10-04 02:56:39,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:56:40,761 INFO [train.py:1046] (1/4) Epoch 43, batch 2000, loss[loss=0.1383, simple_loss=0.2147, pruned_loss=0.03093, over 24403.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2353, pruned_loss=0.03682, over 4742301.30 frames. 
], batch size: 58, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:56:41,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1500733.3333333333, ans=0.09899494936611666 2023-10-04 02:56:42,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:56:43,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:56:43,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:56:46,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:56:49,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:52,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 02:56:53,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:56:57,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:56:58,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 02:57:00,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 02:57:00,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:57:04,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:57:05,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 02:57:06,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:07,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:07,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:09,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 02:57:09,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:57:10,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 02:57:10,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:57:13,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:13,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. 
Duration: 0.636375 2023-10-04 02:57:13,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:14,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:57:16,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:57:17,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 02:57:20,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 02:57:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:57:20,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:25,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:26,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:57:26,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:57:28,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:57:30,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 02:57:30,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:30,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:57:30,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:32,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:35,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:57:35,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 02:57:39,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:57:40,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:43,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:43,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:57:46,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:47,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 02:57:47,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:49,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:57:51,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:57:53,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:54,720 INFO [train.py:1046] (1/4) Epoch 43, batch 2050, loss[loss=0.1366, simple_loss=0.2158, pruned_loss=0.02874, over 24472.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2344, pruned_loss=0.03703, over 4733407.01 frames. ], batch size: 58, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:57:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:58,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:58,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:58:03,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:58:04,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:58:05,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 02:58:07,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:58:10,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 02:58:10,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:58:11,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:58:12,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1501133.3333333333, ans=0.0 2023-10-04 02:58:13,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:58:16,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.52 vs. limit=15.0 2023-10-04 02:58:17,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1501133.3333333333, ans=0.04949747468305833 2023-10-04 02:58:21,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:58:22,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:58:25,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. 
Duration: 0.72 2023-10-04 02:58:26,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:58:26,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 02:58:26,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:58:31,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:58:32,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:58:33,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:58:34,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:58:35,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:58:35,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:58:35,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:58:38,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:58:41,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 02:58:43,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:58:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:58:48,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:58:55,253 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.186e+02 2.516e+02 3.792e+02, threshold=4.373e+02, percent-clipped=0.0 2023-10-04 02:58:55,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:58:56,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 02:58:57,350 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.91 vs. limit=15.0 2023-10-04 02:59:01,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:59:02,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:59:05,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:59:06,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. 
Duration: 0.91 2023-10-04 02:59:08,167 INFO [train.py:1046] (1/4) Epoch 43, batch 2100, loss[loss=0.1499, simple_loss=0.2346, pruned_loss=0.03266, over 24480.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2329, pruned_loss=0.03698, over 4719035.52 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:59:09,610 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 02:59:09,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:09,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:59:10,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:59:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:59:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 02:59:12,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 02:59:14,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:59:16,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1501400.0, ans=0.125 2023-10-04 02:59:18,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 02:59:18,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:59:21,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:23,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:59:23,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 02:59:23,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:59:24,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 02:59:24,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 02:59:26,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:26,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:59:26,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 02:59:26,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:59:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. 
Duration: 0.68 2023-10-04 02:59:32,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:59:35,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:59:35,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:59:37,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1501533.3333333333, ans=0.0 2023-10-04 02:59:39,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:59:39,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 02:59:39,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:39,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:59:42,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 02:59:42,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:42,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 02:59:42,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. 
Duration: 0.96 2023-10-04 02:59:43,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1501533.3333333333, ans=22.5 2023-10-04 02:59:44,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 02:59:45,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:59:46,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:59:50,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:59:51,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:59:52,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:56,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:56,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 02:59:56,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:56,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 02:59:57,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 02:59:57,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 02:59:59,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 03:00:03,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:00:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:00:06,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 03:00:10,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:13,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:00:13,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:00:13,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:00:15,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 03:00:16,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 03:00:16,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:17,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:00:17,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:00:17,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:21,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 03:00:22,578 INFO [train.py:1046] (1/4) Epoch 43, batch 2150, loss[loss=0.1477, simple_loss=0.235, pruned_loss=0.0302, over 24460.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2322, pruned_loss=0.03696, over 4714748.37 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:00:23,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 03:00:23,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:27,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:00:27,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:00:27,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 03:00:27,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:00:27,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1501733.3333333333, ans=0.125 2023-10-04 03:00:32,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 03:00:34,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:34,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:36,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:00:36,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:37,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:00:37,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1501800.0, ans=0.2 2023-10-04 03:00:38,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=12.0 2023-10-04 03:00:40,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:41,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:00:41,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:00:44,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:44,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 03:00:48,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:00:49,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:00:51,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:51,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:00:51,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:51,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:00:53,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:53,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:00:53,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:00:55,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 03:00:57,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:00:58,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:58,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:00:59,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:01:01,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:01:04,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:01:04,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:01:07,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:07,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 03:01:07,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:01:10,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:01:10,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:01:12,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:01:13,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1501933.3333333333, ans=0.0 2023-10-04 03:01:14,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:15,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 03:01:16,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 03:01:18,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:01:18,595 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 03:01:18,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:18,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 03:01:19,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 03:01:19,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:01:19,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 03:01:20,016 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 03:01:20,016 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 03:01:20,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 03:01:21,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1502000.0, ans=0.125 2023-10-04 03:01:22,556 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.924e+02 2.146e+02 2.514e+02 4.521e+02, threshold=4.293e+02, percent-clipped=1.0 2023-10-04 03:01:22,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:24,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:01:24,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:01:24,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:01:25,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:01:26,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:26,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1502000.0, ans=0.0 2023-10-04 03:01:27,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:34,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:01:34,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 03:01:36,193 INFO [train.py:1046] (1/4) Epoch 43, batch 2200, loss[loss=0.1372, simple_loss=0.2167, pruned_loss=0.0289, over 24422.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2324, pruned_loss=0.03699, over 4705703.76 frames. ], batch size: 58, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:01:39,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:01:39,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1502066.6666666667, ans=0.125 2023-10-04 03:01:40,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1502066.6666666667, ans=0.125 2023-10-04 03:01:41,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:01:43,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:01:43,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:01:43,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:01:46,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:47,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:47,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 03:01:51,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 03:01:54,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:02:01,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 03:02:03,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:05,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:02:05,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 03:02:06,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1502200.0, ans=0.0 2023-10-04 03:02:08,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:02:09,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 03:02:11,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.80 vs. limit=15.0 2023-10-04 03:02:12,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:02:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:13,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1502200.0, ans=0.125 2023-10-04 03:02:15,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 03:02:15,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.49 vs. limit=6.0 2023-10-04 03:02:19,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:02:21,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:02:21,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:02:24,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:24,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1502266.6666666667, ans=0.0 2023-10-04 03:02:25,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 03:02:27,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:27,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 03:02:30,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.32 vs. limit=15.0 2023-10-04 03:02:30,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:30,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:02:30,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:32,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:02:32,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:02:32,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:02:32,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:02:35,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:02:37,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.98 vs. limit=12.0 2023-10-04 03:02:38,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:02:40,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 03:02:40,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:02:43,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:02:43,641 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 03:02:43,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1502333.3333333333, ans=0.125 2023-10-04 03:02:46,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:02:47,751 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. 
Duration: 0.896 2023-10-04 03:02:49,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:02:49,126 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 03:02:50,436 INFO [train.py:1046] (1/4) Epoch 43, batch 2250, loss[loss=0.1542, simple_loss=0.2279, pruned_loss=0.04019, over 23825.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2333, pruned_loss=0.03731, over 4711075.71 frames. ], batch size: 232, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:02:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:51,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:02:53,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:55,866 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 03:02:58,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:02:59,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:03:00,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1502400.0, ans=0.125 2023-10-04 03:03:00,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.65 vs. 
limit=15.0 2023-10-04 03:03:04,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:03:06,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:03:08,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:08,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:03:10,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:03:12,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 03:03:12,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:03:14,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:03:15,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 03:03:17,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:03:17,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:03:17,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1502466.6666666667, ans=0.125 2023-10-04 03:03:18,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:03:19,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1502533.3333333333, ans=0.2 2023-10-04 03:03:20,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1502533.3333333333, ans=0.0 2023-10-04 03:03:22,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:03:22,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1502533.3333333333, ans=0.125 2023-10-04 03:03:23,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:03:23,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:03:25,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 03:03:27,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:30,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:03:35,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:03:36,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:03:37,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:03:37,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:03:38,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.51 vs. limit=15.0 2023-10-04 03:03:39,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:03:41,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.66 vs. limit=22.5 2023-10-04 03:03:41,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:03:46,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:03:47,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 03:03:49,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1502666.6666666667, ans=0.125 2023-10-04 03:03:50,214 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 2.039e+02 2.231e+02 2.498e+02 4.606e+02, threshold=4.463e+02, percent-clipped=1.0 2023-10-04 03:03:53,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:03:53,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:03:54,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:03:58,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:04:01,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:04:01,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 03:04:01,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:01,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:04:03,916 INFO [train.py:1046] (1/4) Epoch 43, batch 2300, loss[loss=0.1441, simple_loss=0.2264, pruned_loss=0.0309, over 24654.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2343, pruned_loss=0.03738, over 4713088.86 frames. 
], batch size: 65, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:04:04,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 03:04:08,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:04:08,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:14,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:14,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:04:15,769 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 03:04:17,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:20,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1502800.0, ans=0.125 2023-10-04 03:04:22,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:04:22,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:04:24,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:04:24,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:04:24,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 03:04:25,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:04:27,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:04:28,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:04:31,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-10-04 03:04:32,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:04:34,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:04:38,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:04:38,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1502866.6666666667, ans=0.125 2023-10-04 03:04:43,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:04:43,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:04:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:04:48,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:52,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:04:52,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:04:52,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:04:52,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 03:04:52,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1502933.3333333333, ans=0.125 2023-10-04 03:04:57,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:04:57,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:04:57,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:04:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:04:59,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:05:00,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 03:05:00,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:05:00,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 03:05:00,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:05:00,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:05:01,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 03:05:09,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:05:12,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:05:12,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1503000.0, ans=0.0 2023-10-04 03:05:15,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:15,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:05:15,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 03:05:17,891 INFO [train.py:1046] (1/4) Epoch 43, batch 2350, loss[loss=0.1519, simple_loss=0.2246, pruned_loss=0.03961, over 23590.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2354, pruned_loss=0.03799, over 4714034.22 frames. ], batch size: 256, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:05:17,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:05:18,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:05:18,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:05:18,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 03:05:25,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:05:25,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 03:05:26,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1503066.6666666667, ans=0.125 2023-10-04 03:05:30,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 03:05:33,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:05:36,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:05:36,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:05:36,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:05:36,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:05:38,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 03:05:40,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:05:42,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1503133.3333333333, ans=0.125 2023-10-04 03:05:46,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 03:05:46,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:05:49,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:05:49,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 03:05:50,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1503200.0, ans=0.125 2023-10-04 03:05:51,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:05:53,194 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.82 vs. limit=15.0 2023-10-04 03:05:53,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 03:05:53,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:05:55,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:05:55,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:05:56,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:59,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:06:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 03:06:02,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 03:06:04,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:06:04,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:06:06,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 03:06:07,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:06:10,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 03:06:10,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:06:14,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 03:06:19,066 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.962e+02 2.138e+02 2.370e+02 3.005e+02, threshold=4.276e+02, percent-clipped=0.0 2023-10-04 03:06:19,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 03:06:19,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:06:19,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:06:19,232 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. 
Duration: 0.924 2023-10-04 03:06:19,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 03:06:23,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 03:06:25,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:06:31,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:06:32,276 INFO [train.py:1046] (1/4) Epoch 43, batch 2400, loss[loss=0.1354, simple_loss=0.1952, pruned_loss=0.03778, over 22632.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.235, pruned_loss=0.03799, over 4695374.84 frames. ], batch size: 322, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:06:33,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:06:37,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:06:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 03:06:38,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 03:06:45,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:06:45,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:06:47,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1503466.6666666667, ans=0.125 2023-10-04 03:06:48,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 03:06:48,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:06:48,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:06:48,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 03:06:54,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:06:55,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 03:06:57,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1503466.6666666667, ans=0.2 2023-10-04 03:07:00,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:07:05,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 03:07:08,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:07:09,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:07:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:07:15,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 03:07:15,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:07:24,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:26,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:07:29,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:07:29,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:07:29,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:07:31,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:07:31,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:31,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:07:31,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 03:07:34,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-10-04 03:07:36,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:07:38,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:07:38,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 03:07:38,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 03:07:40,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1503666.6666666667, ans=0.125 2023-10-04 03:07:41,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:07:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:41,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 03:07:41,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 03:07:43,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 03:07:43,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. 
Duration: 0.966 2023-10-04 03:07:44,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 03:07:44,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:07:45,910 INFO [train.py:1046] (1/4) Epoch 43, batch 2450, loss[loss=0.1528, simple_loss=0.2375, pruned_loss=0.03406, over 24669.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2338, pruned_loss=0.03723, over 4713236.97 frames. ], batch size: 65, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:07:46,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:46,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:07:47,281 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 03:07:47,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:48,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:07:52,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:07:52,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:07:55,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:07:55,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:07:57,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 03:08:03,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:08:03,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:06,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:08:06,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:08:06,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:08:07,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 03:08:09,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1503800.0, ans=0.2 2023-10-04 03:08:11,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:14,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:08:15,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:08:18,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:08:18,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:19,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:19,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:08:21,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 03:08:21,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:08:28,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:29,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:29,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:08:30,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:08:31,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:33,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 03:08:33,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 03:08:37,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:37,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:08:40,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1503933.3333333333, ans=0.125 2023-10-04 03:08:42,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:08:42,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:08:47,283 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.943e+02 2.149e+02 2.494e+02 4.938e+02, threshold=4.298e+02, percent-clipped=1.0 2023-10-04 03:08:48,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:08:48,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 03:08:48,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:08:50,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:08:50,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. 
Duration: 0.87 2023-10-04 03:08:51,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:08:51,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:08:53,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1504000.0, ans=0.0 2023-10-04 03:08:54,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:08:54,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1504000.0, ans=0.0 2023-10-04 03:08:56,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1504000.0, ans=0.0 2023-10-04 03:08:57,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:57,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:09:00,292 INFO [train.py:1046] (1/4) Epoch 43, batch 2500, loss[loss=0.1592, simple_loss=0.2293, pruned_loss=0.04453, over 23697.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2338, pruned_loss=0.03736, over 4702584.71 frames. ], batch size: 232, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:09:01,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 03:09:01,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 03:09:05,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:09:15,055 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.18 vs. limit=22.5 2023-10-04 03:09:15,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:09:16,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:09:17,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:09:17,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 03:09:22,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:09:23,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:09:23,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:09:23,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:09:25,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. 
Duration: 0.84 2023-10-04 03:09:26,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:26,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:09:28,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 03:09:28,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:28,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 03:09:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:31,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:09:32,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:09:34,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:09:36,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 03:09:37,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:09:38,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:09:42,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:46,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:47,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1504266.6666666667, ans=0.125 2023-10-04 03:09:48,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:09:51,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1504266.6666666667, ans=0.125 2023-10-04 03:09:53,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1504266.6666666667, ans=0.0 2023-10-04 03:09:55,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:09:57,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 03:09:57,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:09:58,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:09:59,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:09:59,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 03:10:00,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 03:10:00,806 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 03:10:00,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 03:10:03,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:10:05,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 03:10:05,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 03:10:06,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:10:06,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 03:10:07,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1504333.3333333333, ans=0.125 2023-10-04 03:10:11,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 03:10:14,240 INFO [train.py:1046] (1/4) Epoch 43, batch 2550, loss[loss=0.1598, simple_loss=0.2392, pruned_loss=0.04025, over 23381.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2333, pruned_loss=0.03748, over 4687951.06 frames. ], batch size: 106, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:10:14,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:10:15,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:10:15,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:10:17,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:10:18,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 03:10:18,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:10:19,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.02 vs. limit=15.0 2023-10-04 03:10:21,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 03:10:24,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:10:24,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1504400.0, ans=0.125 2023-10-04 03:10:27,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:29,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:10:29,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. 
Duration: 0.4555625 2023-10-04 03:10:29,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:10:31,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:10:32,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:10:33,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:10:33,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 03:10:35,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:10:35,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:35,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 03:10:49,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:10:54,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:10:54,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:54,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:10:56,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:11:01,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:11:03,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:11:03,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:11:03,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:11:05,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:11:06,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:11:09,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:11:09,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:11:13,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:11:13,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-10-04 03:11:14,420 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.895e+02 2.103e+02 2.314e+02 4.132e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-04 03:11:14,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:11:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:11:15,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:11:15,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:11:17,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1504666.6666666667, ans=0.125 2023-10-04 03:11:17,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1504666.6666666667, ans=0.125 2023-10-04 03:11:19,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:19,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1504666.6666666667, ans=0.2 2023-10-04 03:11:24,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:11:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:11:27,928 INFO [train.py:1046] (1/4) Epoch 43, batch 2600, loss[loss=0.1584, simple_loss=0.2347, pruned_loss=0.04107, over 23784.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2346, pruned_loss=0.03783, over 4695210.25 frames. ], batch size: 212, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:11:29,550 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 03:11:30,932 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 03:11:30,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:11:30,989 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 03:11:31,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 03:11:32,286 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 03:11:33,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:11:33,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1504733.3333333333, ans=0.125 2023-10-04 03:11:35,151 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 03:11:36,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 03:11:37,852 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. 
Duration: 0.896 2023-10-04 03:11:41,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:11:42,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 03:11:42,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 03:11:44,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:11:45,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 03:11:46,965 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 03:11:48,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 03:11:50,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.52 vs. limit=15.0 2023-10-04 03:11:50,452 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.71 vs. limit=22.5 2023-10-04 03:11:56,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:11:56,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:56,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:11:56,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 03:11:56,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1504866.6666666667, ans=0.0 2023-10-04 03:11:59,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:12:03,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 03:12:07,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:12:09,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:10,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 03:12:10,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:12:10,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:12:12,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 03:12:15,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:12:16,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:12:19,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 03:12:22,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:22,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:12:27,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1505000.0, ans=0.0 2023-10-04 03:12:30,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:12:30,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:12:30,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 03:12:31,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:12:32,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:12:34,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:12:39,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. 
Duration: 0.69 2023-10-04 03:12:41,239 INFO [train.py:1046] (1/4) Epoch 43, batch 2650, loss[loss=0.1599, simple_loss=0.2348, pruned_loss=0.04249, over 23879.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2349, pruned_loss=0.03778, over 4691713.99 frames. ], batch size: 195, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:12:41,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:41,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:12:41,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1505066.6666666667, ans=0.0 2023-10-04 03:12:44,970 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:12:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 03:12:47,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:49,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:12:49,174 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 03:12:49,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:12:50,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:12:55,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:12:56,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:12:58,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:58,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 03:12:59,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:12:59,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:13:01,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 03:13:03,817 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.64 vs. limit=22.5 2023-10-04 03:13:04,237 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 03:13:05,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:08,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 03:13:08,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:13:09,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 03:13:13,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:13,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:13:13,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:14,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:18,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 03:13:18,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 03:13:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:13:24,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 03:13:24,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:26,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:26,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 03:13:26,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:13:28,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:28,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:13:31,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:13:32,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:13:33,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:13:34,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:13:37,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:37,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:13:38,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:38,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:13:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 03:13:42,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:44,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.943e+02 2.072e+02 2.278e+02 3.072e+02, threshold=4.144e+02, percent-clipped=0.0 2023-10-04 03:13:44,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:13:44,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:46,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 03:13:50,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:52,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:53,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:54,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:13:56,273 INFO [train.py:1046] (1/4) Epoch 43, batch 2700, loss[loss=0.1569, simple_loss=0.2248, pruned_loss=0.04448, over 23726.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.03743, over 4706047.20 frames. ], batch size: 232, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:13:56,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 03:13:56,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:13:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:13:58,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 03:14:01,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:14:03,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 03:14:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:14:05,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:05,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:07,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:14:07,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:14:08,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:14:08,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 03:14:08,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 03:14:09,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:14:11,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:14:11,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.91 vs. limit=22.5 2023-10-04 03:14:12,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:14:12,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:14:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:14:18,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 03:14:18,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:14:22,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1505466.6666666667, ans=0.0 2023-10-04 03:14:24,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:14:24,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:14:29,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:14:29,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:14:29,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:14:29,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:14:33,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:14:35,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:14:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:14:35,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:14:39,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:39,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 03:14:40,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1505600.0, ans=0.1 2023-10-04 03:14:46,229 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.04 vs. limit=22.5 2023-10-04 03:14:48,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:14:48,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:14:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:14:51,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:14:55,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:14:57,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:14:58,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:14:59,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.99 vs. 
limit=15.0 2023-10-04 03:15:00,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:15:00,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:15:01,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:15:03,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:15:03,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:15:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 03:15:06,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:09,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:15:09,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 03:15:10,709 INFO [train.py:1046] (1/4) Epoch 43, batch 2750, loss[loss=0.1465, simple_loss=0.2118, pruned_loss=0.04057, over 22720.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03755, over 4709998.83 frames. ], batch size: 322, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:15:10,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. 
Duration: 0.94 2023-10-04 03:15:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:12,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1505733.3333333333, ans=0.125 2023-10-04 03:15:14,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:14,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:15:18,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:18,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:15:18,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:21,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:15:23,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:15:23,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:15:23,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:15:23,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 03:15:24,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:15:24,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:29,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 03:15:30,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:15:30,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:31,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:15:31,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:15:33,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:15:33,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:15:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:35,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:15:37,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1505800.0, ans=0.125 2023-10-04 03:15:40,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:15:40,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:15:40,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:15:40,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1505866.6666666667, ans=0.0 2023-10-04 03:15:41,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:43,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:15:50,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:52,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:15:52,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:15:57,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:15:57,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:15:57,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:16:03,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:16:03,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:16:03,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 03:16:04,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1505933.3333333333, ans=0.125 2023-10-04 03:16:07,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:08,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 03:16:10,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1506000.0, ans=0.0 2023-10-04 03:16:12,648 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.953e+02 2.174e+02 2.380e+02 3.470e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 03:16:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 03:16:15,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1506000.0, ans=0.125 2023-10-04 03:16:17,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:16:17,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 03:16:17,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:16:18,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:16:18,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1506000.0, ans=0.2 2023-10-04 03:16:20,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 03:16:21,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:16:23,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1506066.6666666667, ans=0.1 2023-10-04 03:16:24,596 INFO [train.py:1046] (1/4) Epoch 43, batch 2800, loss[loss=0.1551, simple_loss=0.2283, pruned_loss=0.04094, over 23753.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.234, pruned_loss=0.03761, over 4701591.34 frames. ], batch size: 164, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:16:24,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-10-04 03:16:24,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:25,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:16:26,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 03:16:26,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:16:26,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:30,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:16:30,115 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 03:16:30,115 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 03:16:32,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:35,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:16:35,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:16:38,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1506133.3333333333, ans=0.0 2023-10-04 03:16:39,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:16:40,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.37 vs. limit=6.0 2023-10-04 03:16:42,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 03:16:42,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 03:16:43,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 03:16:45,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:46,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:16:46,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:16:48,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.62 vs. limit=6.0 2023-10-04 03:16:49,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 03:16:49,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:49,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:16:51,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:16:58,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:17:00,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:17:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:03,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:17:03,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:05,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1506200.0, ans=0.125 2023-10-04 03:17:10,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:17:10,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 03:17:10,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 03:17:12,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:17:12,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:17:15,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:15,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:18,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:17:20,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1506266.6666666667, ans=0.125 2023-10-04 03:17:21,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:17:21,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:21,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:17:21,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:17:23,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 03:17:23,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:17:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 03:17:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:24,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:17:24,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:25,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 03:17:25,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:27,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:17:27,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:17:27,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 03:17:34,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:17:34,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 03:17:36,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:17:37,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:17:38,858 INFO [train.py:1046] (1/4) Epoch 43, batch 2850, loss[loss=0.1325, simple_loss=0.2142, pruned_loss=0.02545, over 24606.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2327, pruned_loss=0.0371, over 4690027.86 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:17:40,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:17:40,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:17:40,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:44,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:44,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:17:46,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. 
Duration: 0.5 2023-10-04 03:17:50,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1506400.0, ans=10.0 2023-10-04 03:17:52,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 03:17:52,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:17:54,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 03:17:55,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:56,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 03:17:58,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 03:17:59,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:14,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:18:14,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:18:14,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 03:18:14,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1506533.3333333333, ans=0.1 2023-10-04 03:18:16,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:18:16,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:18:17,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:18:19,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:18:19,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 03:18:20,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:18:22,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:18:22,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:18:22,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:25,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:18:25,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:18:26,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:28,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:18:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:18:29,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:30,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:33,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:18:37,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:18:38,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 03:18:38,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 03:18:41,188 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.245e+02 2.523e+02 4.092e+02, threshold=4.490e+02, percent-clipped=0.0 2023-10-04 03:18:41,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:18:42,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:18:42,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 03:18:42,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:18:44,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:18:44,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:18:44,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:18:44,148 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 03:18:44,196 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 03:18:44,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:18:46,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:52,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:18:52,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:18:52,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:18:53,445 INFO [train.py:1046] (1/4) Epoch 43, batch 2900, loss[loss=0.1421, simple_loss=0.216, pruned_loss=0.03406, over 24456.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2335, pruned_loss=0.03714, over 4713765.81 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:18:53,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 03:18:56,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:57,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 03:18:57,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 03:18:59,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:18:59,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:19:00,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:19:02,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:19:05,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:19:06,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:19:08,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1506800.0, ans=0.125 2023-10-04 03:19:10,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:19:10,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 03:19:11,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:19:12,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:12,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1506800.0, ans=0.0 2023-10-04 03:19:14,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 03:19:15,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 03:19:18,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:19:18,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 03:19:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:19:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:19:22,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:19:26,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:19:27,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:30,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:19:33,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:19:34,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 03:19:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 03:19:34,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:19:38,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1506933.3333333333, ans=0.125 2023-10-04 03:19:39,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 03:19:40,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1506933.3333333333, ans=0.125 2023-10-04 03:19:42,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 03:19:43,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:19:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:57,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:19:57,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:19:59,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 03:20:02,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:02,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 03:20:04,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:20:04,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:20:06,830 INFO [train.py:1046] (1/4) Epoch 43, batch 2950, loss[loss=0.1421, simple_loss=0.2208, pruned_loss=0.0317, over 24342.00 frames. 
], tot_loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03709, over 4717151.96 frames. ], batch size: 56, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:20:08,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:20:09,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1507066.6666666667, ans=0.0 2023-10-04 03:20:10,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 03:20:10,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1507066.6666666667, ans=0.1 2023-10-04 03:20:11,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:20:11,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:11,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1507066.6666666667, ans=0.125 2023-10-04 03:20:13,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:15,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:20:16,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 03:20:17,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. 
Duration: 0.84 2023-10-04 03:20:19,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:20:19,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:20:23,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:20:25,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:20:28,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:20:28,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:20:31,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:20:31,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:20:32,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:32,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1507133.3333333333, ans=0.0 2023-10-04 03:20:33,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:33,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 03:20:36,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 03:20:42,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 03:20:42,666 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 03:20:45,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:20:45,485 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 03:20:47,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 03:20:47,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:20:47,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:20:47,311 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 03:20:47,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:20:50,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 03:20:51,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 03:20:51,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:20:54,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:56,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:20:56,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:20:56,201 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 03:20:57,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:57,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 03:21:01,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1507266.6666666667, ans=0.125 2023-10-04 03:21:02,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:21:03,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:21:03,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 03:21:05,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:21:06,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 03:21:06,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1507333.3333333333, ans=0.125 2023-10-04 03:21:09,080 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.953e+02 2.204e+02 2.575e+02 5.460e+02, threshold=4.408e+02, percent-clipped=1.0 2023-10-04 03:21:09,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:21:12,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:21:12,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:21:13,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:21:13,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:21:15,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:21:16,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:16,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:21:16,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 03:21:16,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:21:18,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:21:18,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:19,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1507333.3333333333, ans=0.0 2023-10-04 03:21:20,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 03:21:20,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:21,617 INFO [train.py:1046] (1/4) Epoch 43, batch 3000, loss[loss=0.1649, simple_loss=0.2366, pruned_loss=0.04662, over 23574.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2355, pruned_loss=0.03792, over 4700594.71 frames. ], batch size: 256, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:21:21,618 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 03:21:33,114 INFO [train.py:1078] (1/4) Epoch 43, validation: loss=0.3299, simple_loss=0.2679, pruned_loss=0.196, over 1125622.00 frames. 2023-10-04 03:21:33,115 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 03:21:34,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:21:34,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 03:21:39,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 03:21:39,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 03:21:40,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:21:42,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:21:42,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 03:21:42,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:21:48,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1507466.6666666667, ans=0.1 2023-10-04 03:21:49,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:21:50,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.67 vs. limit=12.0 2023-10-04 03:21:58,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:22:03,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 03:22:03,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 03:22:03,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1507533.3333333333, ans=0.0 2023-10-04 03:22:06,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:22:08,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:22:08,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:22:09,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:22:09,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 03:22:13,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 03:22:14,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:22:14,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:22:19,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:22:19,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 03:22:19,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1507600.0, ans=0.0 2023-10-04 03:22:19,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1507600.0, ans=0.125 2023-10-04 03:22:20,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:20,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:22:24,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:22:24,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:22:24,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:22:26,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:22:28,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1507600.0, ans=0.125 2023-10-04 03:22:30,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 03:22:30,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:22:30,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:22:31,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:22:34,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:35,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:35,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 03:22:35,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 03:22:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:22:37,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 03:22:37,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.98 vs. limit=15.0 2023-10-04 03:22:38,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:22:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 03:22:43,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:22:44,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-04 03:22:45,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 03:22:45,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 03:22:45,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:22:47,140 INFO [train.py:1046] (1/4) Epoch 43, batch 3050, loss[loss=0.154, simple_loss=0.2445, pruned_loss=0.03169, over 24632.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03792, over 4698945.41 frames. ], batch size: 68, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:22:47,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:22:47,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:47,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:22:48,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:22:48,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:22:51,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 03:22:53,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 03:22:54,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1507733.3333333333, ans=0.125 2023-10-04 03:22:55,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:22:55,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:22:58,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:01,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 03:23:04,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1507800.0, ans=0.125 2023-10-04 03:23:04,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1507800.0, ans=15.0 2023-10-04 03:23:05,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 03:23:06,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 03:23:06,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:09,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 03:23:11,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:11,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:23:12,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:13,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1507800.0, ans=0.125 2023-10-04 03:23:13,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1507800.0, ans=0.1 2023-10-04 03:23:15,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:23:17,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:23:17,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:17,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:23:17,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:20,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:23:22,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:25,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:25,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 03:23:27,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:27,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:23:28,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:23:29,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:23:31,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:23:31,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:36,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:44,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:23:45,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:23:45,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:47,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:23:47,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:23:47,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:23:48,518 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.989e+02 2.180e+02 2.446e+02 3.954e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-04 03:23:48,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 03:23:50,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:23:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:53,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 03:23:55,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 03:23:56,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1508000.0, ans=0.0 2023-10-04 03:23:59,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:24:00,712 INFO [train.py:1046] (1/4) Epoch 43, batch 3100, loss[loss=0.1503, simple_loss=0.2359, pruned_loss=0.03234, over 24474.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03792, over 4707319.37 frames. ], batch size: 66, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:24:00,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:24:03,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:24:04,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 03:24:06,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 03:24:06,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 03:24:09,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:24:13,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:24:13,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:24:15,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 03:24:19,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:25,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 03:24:27,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:24:29,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:29,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:24:29,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:24:30,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 03:24:32,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:24:32,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 03:24:32,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 03:24:32,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.53 vs. limit=15.0 2023-10-04 03:24:34,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:35,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 03:24:38,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:24:42,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:24:43,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 03:24:43,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 03:24:45,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:45,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:46,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1508266.6666666667, ans=0.0 2023-10-04 03:24:48,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:24:48,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:24:48,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:24:49,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:24:49,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:24:52,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:24:53,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:24:53,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:53,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:24:57,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:25:00,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 03:25:01,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:25:01,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 03:25:03,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:25:03,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:03,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 03:25:04,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1508333.3333333333, ans=0.125 2023-10-04 03:25:13,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 03:25:14,896 INFO [train.py:1046] (1/4) Epoch 43, batch 3150, loss[loss=0.1581, simple_loss=0.2471, pruned_loss=0.03458, over 24625.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2349, pruned_loss=0.0376, over 4703611.56 frames. ], batch size: 73, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:25:14,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:16,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:18,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:25:18,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:25:18,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 03:25:19,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:25:21,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 03:25:22,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 03:25:26,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:27,576 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 03:25:30,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 03:25:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:25:31,723 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 03:25:33,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 03:25:35,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 03:25:37,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 03:25:37,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 03:25:37,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:25:37,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:25:37,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:39,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 03:25:40,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:40,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:42,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:25:43,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:25:47,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 03:25:47,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:25:49,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:25:50,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:25:50,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-10-04 03:25:53,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 03:25:53,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:25:53,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:25:55,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:25:55,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:55,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:25:58,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:25:58,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:25:58,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 03:26:00,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:26:00,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:00,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:26:00,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:26:01,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 03:26:03,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:04,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 03:26:04,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:05,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 03:26:06,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 03:26:09,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:26:09,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 03:26:10,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 03:26:11,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:26:15,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:26:16,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1508666.6666666667, ans=0.0 2023-10-04 03:26:17,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:17,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:26:19,128 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.065e+02 2.333e+02 2.653e+02 4.086e+02, threshold=4.666e+02, percent-clipped=0.0 2023-10-04 03:26:20,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:26:22,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:23,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 03:26:29,805 INFO [train.py:1046] (1/4) Epoch 43, batch 3200, loss[loss=0.1541, simple_loss=0.2338, pruned_loss=0.03715, over 24691.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2333, pruned_loss=0.03728, over 4700336.04 frames. ], batch size: 65, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:26:29,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:26:29,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 03:26:34,187 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:26:35,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:37,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:26:37,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 03:26:39,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:41,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:26:45,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:54,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:26:54,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1508800.0, ans=0.0 2023-10-04 03:27:02,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 03:27:03,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:27:06,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. 
Duration: 0.59 2023-10-04 03:27:08,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:27:10,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:27:10,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:27:12,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:27:15,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1508933.3333333333, ans=0.025 2023-10-04 03:27:16,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 03:27:17,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 03:27:19,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 03:27:22,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 03:27:23,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:27:29,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:27:29,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 03:27:31,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:27:31,214 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 03:27:31,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:27:34,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:27:35,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 03:27:37,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 03:27:37,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 03:27:38,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 03:27:40,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:27:42,747 INFO [train.py:1046] (1/4) Epoch 43, batch 3250, loss[loss=0.1445, simple_loss=0.2258, pruned_loss=0.03156, over 22358.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2334, pruned_loss=0.03724, over 4711225.60 frames. ], batch size: 49, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:27:43,295 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.34 vs. 
limit=15.0 2023-10-04 03:27:44,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:27:44,171 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 03:27:44,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:27:44,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:27:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 03:27:48,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:27:50,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:28:00,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:00,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 03:28:02,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:02,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:28:02,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:28:04,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:28:05,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:28:07,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:07,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:28:08,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:08,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:08,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:08,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1509133.3333333333, ans=0.0 2023-10-04 03:28:09,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:28:11,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:12,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:28:14,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:28:14,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:15,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:28:16,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:28:21,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 03:28:21,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:28:21,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:28:22,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:24,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:28:31,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 03:28:33,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1509266.6666666667, ans=0.0 2023-10-04 03:28:35,359 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.32 vs. limit=15.0 2023-10-04 03:28:37,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:28:37,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:37,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 03:28:38,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:28:38,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:28:39,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:40,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 03:28:40,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 03:28:42,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:28:43,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:28:43,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:45,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 03:28:45,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:46,415 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.986e+02 2.142e+02 2.380e+02 3.154e+02, threshold=4.284e+02, percent-clipped=0.0 2023-10-04 03:28:49,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:28:49,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:28:50,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 03:28:50,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:28:52,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:28:52,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 03:28:56,939 INFO [train.py:1046] (1/4) Epoch 43, batch 3300, loss[loss=0.1561, simple_loss=0.2299, pruned_loss=0.04118, over 20713.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2343, pruned_loss=0.03718, over 4713869.64 frames. 
], batch size: 45, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:28:57,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:28:57,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 03:28:58,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 03:29:00,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 03:29:00,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:02,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1509400.0, ans=0.1 2023-10-04 03:29:04,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:29:06,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:29:06,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:07,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:29:09,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 03:29:10,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:12,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:29:14,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 03:29:16,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:29:16,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:17,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:17,718 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 03:29:20,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:29:22,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:29:22,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:29:22,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:29:23,477 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. 
Duration: 0.896 2023-10-04 03:29:27,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:27,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:29:30,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:30,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 03:29:32,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 03:29:33,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:33,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:29:36,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 03:29:37,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 03:29:39,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:29:40,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.64 vs. limit=15.0 2023-10-04 03:29:43,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. 
Duration: 0.39 2023-10-04 03:29:44,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:29:47,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:29:47,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:29:48,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:29:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:48,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:48,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:29:52,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:29:52,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:53,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:29:54,368 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 03:29:56,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. 
Duration: 0.55 2023-10-04 03:29:58,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:29:58,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:29:58,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:01,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:30:01,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:01,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:30:02,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:02,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:30:04,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:30:06,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:30:08,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 03:30:08,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:30:09,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1509733.3333333333, ans=0.1 2023-10-04 03:30:09,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.56 vs. limit=6.0 2023-10-04 03:30:10,112 INFO [train.py:1046] (1/4) Epoch 43, batch 3350, loss[loss=0.162, simple_loss=0.2502, pruned_loss=0.03691, over 24389.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03736, over 4713452.79 frames. ], batch size: 77, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:30:10,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:12,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:30:13,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:30:13,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:16,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:16,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:17,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:30:19,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:30:20,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:30:23,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:23,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1509800.0, ans=0.125 2023-10-04 03:30:24,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:30:25,162 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.46 vs. limit=22.5 2023-10-04 03:30:26,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:26,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:30:27,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 03:30:29,322 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 03:30:30,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:31,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-04 03:30:33,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-04 03:30:33,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 03:30:34,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:30:34,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:30:34,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:30:35,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1509800.0, ans=0.125 2023-10-04 03:30:36,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 03:30:36,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:36,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:30:40,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:41,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:42,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:42,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:30:42,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1509866.6666666667, ans=0.125 2023-10-04 03:30:47,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:30:47,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:48,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:30:53,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:30:53,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:54,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:54,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:30:57,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 03:31:00,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:31:00,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. 
Duration: 0.87 2023-10-04 03:31:00,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1509933.3333333333, ans=0.1 2023-10-04 03:31:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:31:03,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 03:31:04,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:07,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:31:12,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:14,107 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 2.008e+02 2.223e+02 2.622e+02 3.286e+02, threshold=4.446e+02, percent-clipped=0.0 2023-10-04 03:31:14,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 03:31:15,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:31:16,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:31:18,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 03:31:20,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1510000.0, ans=0.025 2023-10-04 03:31:23,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:31:24,375 INFO [train.py:1046] (1/4) Epoch 43, batch 3400, loss[loss=0.1584, simple_loss=0.2348, pruned_loss=0.041, over 23322.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2361, pruned_loss=0.03757, over 4716255.23 frames. ], batch size: 119, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:31:25,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 03:31:25,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:31:25,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:31:27,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:27,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 03:31:29,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:29,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 03:31:30,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:31:31,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:31:31,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:31:31,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:31:33,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 03:31:36,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 03:31:36,227 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 03:31:37,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:31:39,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.68 vs. limit=15.0 2023-10-04 03:31:42,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:31:42,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:31:42,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:31:42,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 03:31:44,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1510133.3333333333, ans=0.0 2023-10-04 03:31:47,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:31:48,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 03:31:53,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:31:53,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1510200.0, ans=0.0 2023-10-04 03:31:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:31:54,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:55,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:32:03,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:32:06,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 03:32:08,632 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-04 03:32:14,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:32:14,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:32:14,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 03:32:15,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:32:15,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:32:16,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:32:16,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:32:18,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1510266.6666666667, ans=0.2 2023-10-04 03:32:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:32:21,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:32:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 03:32:23,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1510333.3333333333, ans=0.1 2023-10-04 03:32:25,008 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.76 vs. limit=15.0 2023-10-04 03:32:27,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:32:30,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 03:32:36,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:32:37,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 03:32:39,357 INFO [train.py:1046] (1/4) Epoch 43, batch 3450, loss[loss=0.1463, simple_loss=0.2264, pruned_loss=0.0331, over 24450.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2362, pruned_loss=0.03754, over 4717316.65 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:32:42,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 03:32:42,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:32:45,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:32:45,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-10-04 03:32:46,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:32:49,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:32:55,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:32:55,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:32:57,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:32:57,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:32:59,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1510466.6666666667, ans=0.0 2023-10-04 03:33:00,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:04,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 03:33:09,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 03:33:09,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:33:09,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 03:33:12,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:18,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 03:33:19,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:33:20,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.40 vs. limit=15.0 2023-10-04 03:33:22,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:33:23,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:33:25,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:33:25,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:33:27,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 03:33:27,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:33:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 03:33:30,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1510600.0, ans=0.125 2023-10-04 03:33:31,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1510600.0, ans=0.125 2023-10-04 03:33:32,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:33:34,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1510600.0, ans=0.125 2023-10-04 03:33:35,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 03:33:37,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1510666.6666666667, ans=0.125 2023-10-04 03:33:38,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:33:43,838 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.002e+02 2.243e+02 2.534e+02 3.921e+02, threshold=4.486e+02, percent-clipped=0.0 2023-10-04 03:33:44,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:33:45,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:48,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:33:50,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:51,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:33:52,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:33:53,702 INFO [train.py:1046] (1/4) Epoch 43, batch 3500, loss[loss=0.1686, simple_loss=0.2474, pruned_loss=0.0449, over 23290.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.235, pruned_loss=0.03738, over 4715336.12 frames. ], batch size: 105, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:33:53,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:33:57,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:33:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:34:00,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 03:34:01,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:34:04,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:34:08,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:34:08,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 03:34:12,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:34:12,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:34:13,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:34:13,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:34:14,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:34:14,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:14,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:34:16,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 03:34:19,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:19,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:34:20,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:34:23,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:23,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 03:34:25,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:34:26,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:34:28,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:34:29,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:31,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:34:32,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:34:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 03:34:35,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 03:34:37,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 03:34:37,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:34:39,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:39,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:34:39,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:34:44,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:34:44,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:34:48,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:34:49,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 03:34:49,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 03:34:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:34:54,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:34:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:34:57,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:34:58,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 03:34:59,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:35:01,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:35:02,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 03:35:04,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 03:35:06,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:06,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.60 vs. limit=6.0 2023-10-04 03:35:07,902 INFO [train.py:1046] (1/4) Epoch 43, batch 3550, loss[loss=0.1522, simple_loss=0.2278, pruned_loss=0.03833, over 23897.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2339, pruned_loss=0.03699, over 4704891.73 frames. ], batch size: 195, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:35:08,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:35:08,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:08,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 03:35:12,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:35:14,577 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.00 vs. limit=15.0 2023-10-04 03:35:19,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:20,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 03:35:23,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:35:23,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:35:25,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:26,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:35:27,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:35:30,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:35:32,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 03:35:32,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:32,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:35:33,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:35:39,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:35:39,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:35:41,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:35:41,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:41,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:35:41,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 03:35:41,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:43,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:44,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 03:35:48,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:50,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:35:50,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:52,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 03:35:52,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:35:54,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 03:35:54,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:35:57,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:35:58,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:36:01,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 03:36:01,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:36:01,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1511266.6666666667, ans=0.0 2023-10-04 03:36:04,038 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.22 vs. limit=15.0 2023-10-04 03:36:04,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1511333.3333333333, ans=0.5 2023-10-04 03:36:08,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:09,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 03:36:09,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:14,575 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.748e+02 1.996e+02 2.248e+02 2.673e+02 4.535e+02, threshold=4.496e+02, percent-clipped=1.0 2023-10-04 03:36:14,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:36:14,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 03:36:16,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1511333.3333333333, ans=0.1 2023-10-04 03:36:20,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. 
Duration: 0.95 2023-10-04 03:36:20,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.15 vs. limit=15.0 2023-10-04 03:36:21,463 INFO [train.py:1046] (1/4) Epoch 43, batch 3600, loss[loss=0.1448, simple_loss=0.2322, pruned_loss=0.02873, over 23152.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2337, pruned_loss=0.03698, over 4689147.31 frames. ], batch size: 105, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:36:21,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:36:22,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:36:24,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:24,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:36:29,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:36:30,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:31,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.29 vs. limit=10.0 2023-10-04 03:36:32,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 03:36:32,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1511400.0, ans=0.125 2023-10-04 03:36:33,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:36:33,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:33,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 03:36:35,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1511466.6666666667, ans=0.2 2023-10-04 03:36:37,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1511466.6666666667, ans=0.125 2023-10-04 03:36:38,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:36:39,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:42,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:36:43,625 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:36:44,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:36:46,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 03:36:46,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:36:46,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1511466.6666666667, ans=0.0 2023-10-04 03:36:47,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 03:36:47,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:36:51,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:51,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:36:54,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:54,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1511533.3333333333, ans=0.125 2023-10-04 03:36:55,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:36:57,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:36:58,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-10-04 03:36:58,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1511533.3333333333, ans=0.0 2023-10-04 03:37:01,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1511533.3333333333, ans=0.125 2023-10-04 03:37:05,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:05,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:37:07,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 03:37:10,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:37:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:21,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:27,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:37:27,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:37:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 03:37:28,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-10-04 03:37:28,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 03:37:30,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:37:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:37:32,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 03:37:32,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:37:32,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:37:32,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:34,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 03:37:36,003 INFO [train.py:1046] (1/4) Epoch 43, batch 3650, loss[loss=0.1618, simple_loss=0.236, pruned_loss=0.04381, over 23459.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03751, over 4697021.12 frames. ], batch size: 285, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:37:36,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 03:37:39,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:37:40,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 03:37:41,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1511733.3333333333, ans=0.125 2023-10-04 03:37:44,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 03:37:47,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:37:48,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 03:37:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 03:37:54,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:37:54,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:37:55,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:37:58,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:37:58,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:59,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-10-04 03:37:59,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:37:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:01,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 03:38:02,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:38:02,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:38:02,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:04,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:38:08,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 03:38:10,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 03:38:10,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:38:13,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 03:38:14,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:38:14,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:38:19,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:38:21,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:21,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:38:23,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:38:25,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:38:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:38:30,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:30,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:30,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:38:30,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1511933.3333333333, ans=0.0 2023-10-04 03:38:32,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 03:38:33,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:35,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:38:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 03:38:42,407 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.039e+02 2.236e+02 2.475e+02 4.169e+02, threshold=4.472e+02, percent-clipped=0.0 2023-10-04 03:38:44,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:38:44,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:38:45,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:38:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:47,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:38:49,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:50,651 INFO [train.py:1046] (1/4) Epoch 43, batch 3700, loss[loss=0.1615, simple_loss=0.2413, pruned_loss=0.0409, over 23302.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2356, pruned_loss=0.03767, over 4711227.46 frames. 
], batch size: 105, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:38:52,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 03:38:52,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:53,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:38:54,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:55,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:38:57,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:57,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 03:38:57,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:59,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:38:59,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:39:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 03:39:03,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1512133.3333333333, ans=0.05 2023-10-04 03:39:04,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:05,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:07,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:39:08,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:39:08,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:39:10,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1512133.3333333333, ans=0.125 2023-10-04 03:39:11,461 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.00 vs. limit=15.0 2023-10-04 03:39:11,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:13,334 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 03:39:15,507 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.94 vs. 
limit=15.0 2023-10-04 03:39:21,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:39:21,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:39:23,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:39:23,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 03:39:23,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:39:27,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:27,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 03:39:28,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:31,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:39:32,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:32,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:39:35,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-04 03:39:41,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:39:41,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 03:39:41,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:42,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 03:39:46,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:39:46,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:39:50,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:50,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 03:39:54,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:39:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:39:54,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:39:54,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:39:56,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:39:57,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 03:39:59,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 03:39:59,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:39:59,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:01,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:40:02,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:40:03,721 INFO [train.py:1046] (1/4) Epoch 43, batch 3750, loss[loss=0.1575, simple_loss=0.2312, pruned_loss=0.04189, over 23420.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2363, pruned_loss=0.0379, over 4709593.97 frames. ], batch size: 285, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:40:03,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:40:05,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:40:06,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:40:08,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 03:40:09,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 03:40:12,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:40:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 03:40:14,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:40:14,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1512400.0, ans=0.5 2023-10-04 03:40:16,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:17,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:17,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:40:21,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:40:26,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:40:26,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 03:40:27,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:40:30,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:40:30,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 03:40:32,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:40:33,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:40:33,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:40:36,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 03:40:40,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 03:40:42,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:40:43,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:40:45,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:40:48,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:40:50,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:40:54,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 03:40:57,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:59,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:40:59,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:41:04,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:41:08,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:41:10,052 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.075e+02 2.386e+02 2.913e+02 4.377e+02, threshold=4.772e+02, percent-clipped=0.0 2023-10-04 03:41:10,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:41:11,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:41:12,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 03:41:14,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:41:17,732 INFO [train.py:1046] (1/4) Epoch 43, batch 3800, loss[loss=0.1473, simple_loss=0.2291, pruned_loss=0.03273, over 23307.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2364, pruned_loss=0.03812, over 4687223.15 frames. ], batch size: 119, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:41:24,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:41:26,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:28,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:41:28,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 03:41:28,935 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.32 vs. limit=15.0 2023-10-04 03:41:29,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:41:32,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:41:32,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:41:33,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-04 03:41:33,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:35,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:41:36,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:41:36,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:41:36,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:40,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 03:41:40,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1512800.0, ans=0.0 2023-10-04 03:41:43,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 03:41:43,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:41:47,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:41:47,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:41:48,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 03:41:48,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:41:48,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:52,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:54,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:58,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:41:58,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 03:41:59,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:42:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:42:14,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:42:16,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 03:42:17,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 03:42:17,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:42:18,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:42:19,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:22,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 03:42:22,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=15.0 2023-10-04 03:42:26,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 03:42:26,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 03:42:26,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:28,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:42:32,370 INFO [train.py:1046] (1/4) Epoch 43, batch 3850, loss[loss=0.1619, simple_loss=0.2318, pruned_loss=0.04604, over 23791.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2351, pruned_loss=0.03772, over 4689142.74 frames. ], batch size: 164, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:42:33,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:42:33,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 03:42:38,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:42:38,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 03:42:40,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:42:40,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:41,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1513066.6666666667, ans=0.2 2023-10-04 03:42:43,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1513066.6666666667, ans=0.125 2023-10-04 03:42:44,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:42:44,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0 2023-10-04 03:42:48,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:42:49,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:42:51,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-04 03:42:56,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1513133.3333333333, ans=0.2 2023-10-04 03:42:57,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:42:58,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:43:00,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:00,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:43:03,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:04,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:43:04,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:04,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:43:05,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:07,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1513200.0, ans=0.2 2023-10-04 03:43:08,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:43:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:09,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:43:10,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 03:43:10,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 03:43:12,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:12,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:14,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:14,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:15,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1513266.6666666667, ans=0.1 2023-10-04 03:43:16,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-10-04 03:43:16,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1513266.6666666667, ans=0.125 2023-10-04 03:43:17,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1513266.6666666667, ans=0.1 2023-10-04 03:43:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 03:43:20,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:20,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1513266.6666666667, ans=0.125 2023-10-04 03:43:20,871 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.06 vs. limit=15.0 2023-10-04 03:43:22,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 03:43:25,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:43:29,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:31,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:43:34,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1513333.3333333333, ans=0.2 2023-10-04 03:43:35,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 03:43:36,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 03:43:38,218 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.999e+02 2.160e+02 2.464e+02 3.825e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-04 03:43:39,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:42,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:43:42,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:43:44,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:44,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:43:44,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:43:44,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 03:43:45,921 INFO [train.py:1046] (1/4) Epoch 43, batch 3900, loss[loss=0.1584, simple_loss=0.238, pruned_loss=0.03935, over 23746.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2338, pruned_loss=0.03708, over 4703494.73 frames. ], batch size: 179, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:43:45,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:47,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 03:43:48,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:50,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:43:50,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:51,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:43:53,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:43:53,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:53,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:43:53,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 03:43:53,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:57,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:43:58,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:43:58,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:43:59,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:44:02,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:44:02,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:44:03,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:44:06,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. 
Duration: 0.81 2023-10-04 03:44:06,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:44:07,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 03:44:08,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:44:09,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 03:44:10,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 03:44:15,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:44:15,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:44:16,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:44:16,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1513533.3333333333, ans=0.125 2023-10-04 03:44:17,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:21,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:44:24,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 03:44:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:44:26,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:44:28,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:44:32,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:44:32,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:44:39,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:44:40,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:44:50,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1513666.6666666667, ans=0.125 2023-10-04 03:44:51,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:44:51,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1513666.6666666667, ans=0.2 2023-10-04 03:44:54,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:56,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. 
Duration: 0.72 2023-10-04 03:44:56,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 03:44:56,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:59,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 03:44:59,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:45:00,807 INFO [train.py:1046] (1/4) Epoch 43, batch 3950, loss[loss=0.1375, simple_loss=0.2128, pruned_loss=0.03109, over 24448.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2336, pruned_loss=0.03706, over 4704295.47 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:45:00,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 03:45:05,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:45:06,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 03:45:07,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:45:10,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:45:12,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:45:14,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.86 vs. limit=12.0 2023-10-04 03:45:16,180 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 03:45:16,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:45:16,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 03:45:18,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 03:45:18,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:45:21,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:45:22,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:45:22,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:45:25,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 03:45:27,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:45:29,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 03:45:29,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:45:30,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:45:30,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:45:32,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1513866.6666666667, ans=0.0 2023-10-04 03:45:37,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:45:37,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:45:43,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 03:45:49,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 03:45:49,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 03:45:49,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:45:51,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:45:58,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 03:45:59,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1514000.0, ans=0.125 2023-10-04 03:46:00,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:46:00,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:46:00,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:46:00,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 03:46:02,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1514000.0, ans=10.0 2023-10-04 03:46:07,398 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 2.005e+02 2.254e+02 2.527e+02 3.756e+02, threshold=4.508e+02, percent-clipped=0.0 2023-10-04 03:46:07,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:46:08,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:46:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 03:46:14,420 INFO [train.py:1046] (1/4) Epoch 43, batch 4000, loss[loss=0.166, simple_loss=0.2391, pruned_loss=0.04644, over 22857.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2341, pruned_loss=0.03708, over 4718548.34 frames. 
], batch size: 322, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:46:20,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:28,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:33,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:46:33,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:46:33,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:34,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 03:46:34,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:46:35,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 03:46:35,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:46:35,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 03:46:38,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:46:39,530 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=12.0 2023-10-04 03:46:41,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:46:41,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:46:41,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:46:41,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:46:41,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 03:46:42,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:46:44,271 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 03:46:44,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1514200.0, ans=0.1 2023-10-04 03:46:45,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:46:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:46:50,377 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-10-04 03:46:50,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:46:51,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:46:55,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1514200.0, ans=0.125 2023-10-04 03:46:56,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 03:46:57,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:47:00,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:47:00,677 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 03:47:03,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:47:03,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 03:47:03,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:47:03,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:47:05,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 03:47:07,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:47:08,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:47:08,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:47:11,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 03:47:11,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:47:12,835 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 03:47:16,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:47:20,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 03:47:21,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:47:21,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:47:23,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:47:23,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 03:47:29,563 INFO [train.py:1046] (1/4) Epoch 43, batch 4050, loss[loss=0.2081, simple_loss=0.2771, pruned_loss=0.06956, over 19448.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2346, pruned_loss=0.03732, over 4720688.53 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:47:29,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:47:32,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:47:33,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 03:47:33,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1514400.0, ans=0.0 2023-10-04 03:47:35,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:47:35,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:47:36,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:47:38,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:47:38,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1514400.0, ans=0.1 2023-10-04 03:47:39,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 03:47:42,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:47:46,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:47:47,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:47:49,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:47:49,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:47:54,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:47:56,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:48:00,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 03:48:01,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 03:48:01,968 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 03:48:05,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:48:12,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-04 03:48:12,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:48:14,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:48:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:48:17,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:48:17,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:48:18,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1514600.0, ans=0.125 2023-10-04 03:48:20,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1514600.0, ans=0.2 2023-10-04 03:48:21,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:48:21,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.40 vs. limit=15.0 2023-10-04 03:48:24,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 03:48:24,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 03:48:25,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:48:28,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 03:48:31,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:48:36,608 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.949e+02 2.233e+02 2.507e+02 3.530e+02, threshold=4.466e+02, percent-clipped=0.0 2023-10-04 03:48:39,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 03:48:39,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:48:39,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:48:42,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 03:48:42,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 03:48:42,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:48:43,720 INFO [train.py:1046] (1/4) Epoch 43, batch 4100, loss[loss=0.1881, simple_loss=0.2601, pruned_loss=0.05803, over 19451.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2357, pruned_loss=0.03749, over 4711994.00 frames. 
], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:48:43,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1514733.3333333333, ans=0.0 2023-10-04 03:48:45,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:48:47,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:48:49,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1514733.3333333333, ans=0.125 2023-10-04 03:48:53,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 03:48:54,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1514733.3333333333, ans=0.1 2023-10-04 03:48:55,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 03:48:57,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 03:48:58,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 03:48:58,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 03:48:58,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:58,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:58,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:48:58,892 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 03:49:00,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1514800.0, ans=0.125 2023-10-04 03:49:02,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:49:04,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:49:05,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:49:05,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:49:07,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1514800.0, ans=0.125 2023-10-04 03:49:09,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:49:09,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:49:11,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:49:11,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 03:49:12,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:49:12,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:49:12,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:49:12,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:49:12,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 03:49:15,060 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.91 vs. limit=15.0 2023-10-04 03:49:15,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:15,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 03:49:18,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:49:19,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 03:49:19,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 03:49:21,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:49:21,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:49:22,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:49:23,544 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:49:24,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 03:49:25,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:49:25,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:49:28,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 03:49:28,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:49:30,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:49:33,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:49:38,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:49:38,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1514933.3333333333, ans=0.2 2023-10-04 03:49:42,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:49:42,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:49:45,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1515000.0, ans=0.125 2023-10-04 03:49:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:49:52,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:55,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:49:57,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:49:58,532 INFO [train.py:1046] (1/4) Epoch 43, batch 4150, loss[loss=0.205, simple_loss=0.2713, pruned_loss=0.06939, over 19770.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03801, over 4704532.47 frames. 
], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:49:58,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1515066.6666666667, ans=0.0 2023-10-04 03:50:01,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:50:02,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:50:02,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1515066.6666666667, ans=0.0 2023-10-04 03:50:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:50:04,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:50:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 03:50:07,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:50:09,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 03:50:09,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 03:50:09,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 03:50:10,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:50:12,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1515133.3333333333, ans=0.125 2023-10-04 03:50:14,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:50:15,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:50:18,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:50:18,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:50:19,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:50:21,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:50:21,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:50:23,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:50:24,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1515133.3333333333, ans=0.125 2023-10-04 03:50:25,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.23 vs. 
limit=15.0 2023-10-04 03:50:28,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:50:30,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:50:32,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 03:50:35,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 03:50:35,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:50:37,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 03:50:37,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:50:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:50:40,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:50:41,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:50:45,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 03:50:47,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 03:50:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:50:50,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 03:50:50,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:50:52,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 03:50:54,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.54 vs. limit=15.0 2023-10-04 03:50:55,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:50:56,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:50:56,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1515333.3333333333, ans=0.1 2023-10-04 03:50:58,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:50:58,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 03:50:58,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:50:58,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. 
Duration: 0.463625 2023-10-04 03:51:00,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:51:02,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 03:51:02,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:51:02,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1515333.3333333333, ans=0.2 2023-10-04 03:51:04,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:51:04,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:51:04,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 03:51:05,315 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.071e+02 2.291e+02 2.754e+02 5.163e+02, threshold=4.583e+02, percent-clipped=2.0 2023-10-04 03:51:05,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:51:05,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:51:05,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:51:08,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 03:51:08,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 03:51:08,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:51:12,976 INFO [train.py:1046] (1/4) Epoch 43, batch 4200, loss[loss=0.1589, simple_loss=0.2341, pruned_loss=0.04182, over 23758.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2346, pruned_loss=0.03781, over 4703532.94 frames. ], batch size: 164, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:51:13,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:51:14,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 03:51:17,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:51:18,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:51:19,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:51:19,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:51:19,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:51:22,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-10-04 03:51:26,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 03:51:26,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:28,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:51:32,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:51:34,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:51:35,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:51:35,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:37,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 03:51:37,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:51:38,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:38,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:51:39,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 03:51:41,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:51:43,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 03:51:44,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:47,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:51:48,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:51:50,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:51:50,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1515533.3333333333, ans=0.125 2023-10-04 03:51:50,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.75 vs. limit=6.0 2023-10-04 03:51:52,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:51:54,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:51:54,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-10-04 03:51:54,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:51:56,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:52:00,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:52:02,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:52:08,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:52:11,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 03:52:12,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:52:17,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:52:18,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:18,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 03:52:18,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1515666.6666666667, ans=0.125 2023-10-04 03:52:26,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 03:52:27,724 INFO [train.py:1046] (1/4) Epoch 43, batch 4250, loss[loss=0.1696, simple_loss=0.2633, pruned_loss=0.0379, over 24665.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2339, pruned_loss=0.03725, over 4709162.65 frames. ], batch size: 73, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:52:30,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:52:30,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:52:31,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:34,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-04 03:52:35,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:52:35,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 03:52:35,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:52:35,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1515733.3333333333, ans=0.2 2023-10-04 03:52:38,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:42,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:52:45,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:45,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:52:46,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:52:46,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:52:47,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:48,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:52:49,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:52,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:52:54,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:52:55,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 03:52:58,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 03:52:58,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:53:00,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:00,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:53:00,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:53:00,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:00,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:53:05,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:53:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:53:09,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:53:10,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:12,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 03:53:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:53:13,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. 
Duration: 0.98 2023-10-04 03:53:16,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:53:17,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:53:20,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:20,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:53:23,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1515933.3333333333, ans=0.125 2023-10-04 03:53:24,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 03:53:25,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:53:27,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:53:30,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:33,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:35,074 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.993e+02 2.241e+02 2.615e+02 3.958e+02, threshold=4.482e+02, percent-clipped=0.0 2023-10-04 03:53:35,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 03:53:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:53:36,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:53:37,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:53:39,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:53:39,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 03:53:41,885 INFO [train.py:1046] (1/4) Epoch 43, batch 4300, loss[loss=0.176, simple_loss=0.2416, pruned_loss=0.05521, over 19237.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2335, pruned_loss=0.03715, over 4696600.34 frames. ], batch size: 388, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:53:41,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:44,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:53:45,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.45 vs. limit=15.0 2023-10-04 03:53:46,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:53:50,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:53:50,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1516066.6666666667, ans=0.1 2023-10-04 03:53:53,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:53,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 03:53:55,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:53:58,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:53:58,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:53:58,420 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 03:54:01,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1516133.3333333333, ans=0.125 2023-10-04 03:54:03,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:54:04,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:54:04,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1516133.3333333333, ans=0.0 2023-10-04 03:54:07,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. 
Duration: 0.97 2023-10-04 03:54:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:54:09,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 03:54:11,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:54:13,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:54:14,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:54:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:54:16,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:54:17,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:54:20,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:54:20,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 03:54:20,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 03:54:22,072 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. 
limit=6.0 2023-10-04 03:54:22,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:54:26,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:26,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:54:26,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:26,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:54:26,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 03:54:26,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 03:54:27,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 03:54:27,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:54:29,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 03:54:29,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 03:54:33,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 03:54:35,067 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 03:54:36,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:54:37,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:54:37,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:54:39,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 03:54:41,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:54:41,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:41,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:54:42,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:54:42,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:54:45,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:54:46,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 03:54:48,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:48,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:54:48,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1516333.3333333333, ans=0.09899494936611666 2023-10-04 03:54:53,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 03:54:53,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1516400.0, ans=0.0 2023-10-04 03:54:55,434 INFO [train.py:1046] (1/4) Epoch 43, batch 4350, loss[loss=0.1542, simple_loss=0.2317, pruned_loss=0.03832, over 23817.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2348, pruned_loss=0.0373, over 4711810.47 frames. ], batch size: 195, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:54:55,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:54:59,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:01,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:55:04,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:55:04,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 03:55:09,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:55:13,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:55:14,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:55:14,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:55:17,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:55:18,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:55:21,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:55:26,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 03:55:27,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:28,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 03:55:30,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1516533.3333333333, ans=0.125 2023-10-04 03:55:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 03:55:39,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:55:41,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:55:44,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1516600.0, ans=0.2 2023-10-04 03:55:46,213 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 03:55:46,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:55:47,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:55:48,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 03:55:50,416 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 03:55:50,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 03:55:50,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:51,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:55:53,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:55:53,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:55:54,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:55:57,826 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 03:55:57,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:57,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:55:57,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:59,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 03:56:00,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1516666.6666666667, ans=0.125 2023-10-04 03:56:00,984 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-10-04 03:56:00,989 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 03:56:01,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 03:56:02,235 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 1.917e+02 2.050e+02 2.278e+02 3.303e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-04 03:56:02,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1516666.6666666667, ans=0.125 2023-10-04 03:56:05,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:56:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:56:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:06,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:56:08,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 03:56:08,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1516733.3333333333, ans=0.125 2023-10-04 03:56:09,580 INFO [train.py:1046] (1/4) Epoch 43, batch 4400, loss[loss=0.1589, simple_loss=0.2544, pruned_loss=0.03166, over 24348.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2352, pruned_loss=0.03725, over 4715195.64 frames. 
], batch size: 74, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:56:10,970 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 03:56:10,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:14,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:56:15,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:16,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:56:19,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 03:56:19,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 03:56:21,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 03:56:21,097 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 03:56:21,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:56:21,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:56:23,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. 
Duration: 0.87 2023-10-04 03:56:25,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:25,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:25,408 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 03:56:28,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:29,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 03:56:29,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 03:56:31,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 03:56:31,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 03:56:32,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 03:56:32,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:32,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1516800.0, ans=0.0 2023-10-04 03:56:35,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:56:35,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:56:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:56:38,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 03:56:38,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 03:56:40,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:40,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:56:40,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:40,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1516866.6666666667, ans=0.2 2023-10-04 03:56:41,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:42,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:42,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 03:56:44,326 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. 
Duration: 0.768 2023-10-04 03:56:45,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1516866.6666666667, ans=0.125 2023-10-04 03:56:48,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:54,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:56:57,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 03:56:58,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1516933.3333333333, ans=0.125 2023-10-04 03:57:02,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:57:04,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:57:05,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1516933.3333333333, ans=0.0 2023-10-04 03:57:06,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:57:07,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 03:57:07,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 03:57:07,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:57:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:57:07,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:57:07,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1517000.0, ans=0.0 2023-10-04 03:57:12,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 03:57:16,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 03:57:17,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 03:57:17,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:17,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 03:57:17,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:57:20,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:57:22,696 INFO [train.py:1046] (1/4) Epoch 43, batch 4450, loss[loss=0.1583, simple_loss=0.2373, pruned_loss=0.03962, over 23491.00 frames. 
], tot_loss[loss=0.1558, simple_loss=0.2358, pruned_loss=0.03786, over 4698747.40 frames. ], batch size: 106, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:57:22,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 03:57:26,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:57:28,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:29,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:57:34,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:57:34,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:57:38,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:39,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:57:41,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:57:43,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:43,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-10-04 03:57:43,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:57:43,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1517133.3333333333, ans=0.0 2023-10-04 03:57:44,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:44,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:57:44,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:57:48,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:57:52,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:57:53,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:57:53,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:57:55,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:57,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 03:58:02,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:58:04,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 03:58:04,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 03:58:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:58:06,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:58:07,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 03:58:10,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:58:14,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:58:14,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 03:58:14,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:14,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 03:58:15,080 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:58:16,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:58:16,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:58:17,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:58:20,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:58:20,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 03:58:23,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:58:26,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:58:26,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:58:28,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:28,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-04 03:58:29,420 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.027e+02 2.201e+02 2.513e+02 3.494e+02, threshold=4.403e+02, percent-clipped=0.0 2023-10-04 03:58:30,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:58:30,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1517333.3333333333, ans=0.2 2023-10-04 03:58:33,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 03:58:34,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:58:37,765 INFO [train.py:1046] (1/4) Epoch 43, batch 4500, loss[loss=0.1581, simple_loss=0.2233, pruned_loss=0.04642, over 23697.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2361, pruned_loss=0.03817, over 4687945.41 frames. ], batch size: 164, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:58:39,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:58:40,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 03:58:40,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 03:58:43,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:58:46,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 03:58:48,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:58:49,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:58:50,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:58:50,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:58:50,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:58:51,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1517466.6666666667, ans=0.125 2023-10-04 03:59:03,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1517466.6666666667, ans=0.125 2023-10-04 03:59:04,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:59:05,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:59:07,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:59:08,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 03:59:09,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:59:15,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:59:19,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:59:23,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:59:26,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:59:27,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 03:59:29,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:29,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:59:29,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:59:30,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:59:30,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:59:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. 
Duration: 0.69 2023-10-04 03:59:32,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:59:32,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:35,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:59:37,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:59:39,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:41,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:59:41,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:59:41,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 03:59:44,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 03:59:44,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 03:59:44,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1517666.6666666667, ans=0.125 2023-10-04 03:59:48,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. 
Duration: 0.95 2023-10-04 03:59:50,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1517733.3333333333, ans=0.0 2023-10-04 03:59:51,413 INFO [train.py:1046] (1/4) Epoch 43, batch 4550, loss[loss=0.1554, simple_loss=0.2444, pruned_loss=0.03321, over 24348.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2344, pruned_loss=0.03797, over 4673768.42 frames. ], batch size: 74, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:59:52,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 03:59:54,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:59:54,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1517733.3333333333, ans=0.1 2023-10-04 03:59:58,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:59:58,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:00:00,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:03,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:00:05,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1517800.0, ans=0.125 2023-10-04 04:00:06,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:00:09,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:09,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:00:09,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:12,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:12,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:00:13,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:00:18,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 04:00:18,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 04:00:21,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:00:21,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 04:00:22,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1517866.6666666667, ans=0.0 2023-10-04 04:00:24,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. 
Duration: 0.49 2023-10-04 04:00:24,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:00:25,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1517866.6666666667, ans=0.125 2023-10-04 04:00:28,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 04:00:28,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:00:31,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:33,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:33,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:00:34,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 04:00:37,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:00:40,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:42,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:00:42,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 04:00:44,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 04:00:44,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 04:00:44,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:00:45,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 04:00:45,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1517933.3333333333, ans=0.1 2023-10-04 04:00:48,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 04:00:48,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:50,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:50,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:00:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:52,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:00:53,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 04:00:53,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 04:00:54,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1518000.0, ans=0.5 2023-10-04 04:00:55,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:00:55,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:00:56,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 04:00:56,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:00:56,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 04:00:58,352 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.020e+02 2.212e+02 2.607e+02 3.843e+02, threshold=4.425e+02, percent-clipped=0.0 2023-10-04 04:00:58,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:00:58,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:00:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:01:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:01:01,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:01:02,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:01:04,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:01:06,031 INFO [train.py:1046] (1/4) Epoch 43, batch 4600, loss[loss=0.16, simple_loss=0.2446, pruned_loss=0.03771, over 24506.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2333, pruned_loss=0.03745, over 4691475.20 frames. ], batch size: 66, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:01:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:08,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:01:12,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:01:12,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:01:12,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:13,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 04:01:15,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 04:01:19,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:01:20,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:22,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:27,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1518133.3333333333, ans=10.0 2023-10-04 04:01:28,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 04:01:28,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:30,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:34,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:01:34,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:41,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 04:01:41,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:01:42,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:01:46,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:46,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:01:47,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:01:51,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 04:01:52,423 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.15 vs. limit=15.0 2023-10-04 04:01:52,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:01:54,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1518266.6666666667, ans=0.0 2023-10-04 04:01:57,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:01:58,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:00,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:00,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. 
Duration: 0.4555625 2023-10-04 04:02:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:01,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 04:02:03,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:03,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:05,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:05,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:02:06,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:06,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 04:02:07,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 04:02:07,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 04:02:07,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:09,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:02:09,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:10,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:11,084 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.94 vs. limit=15.0 2023-10-04 04:02:19,789 INFO [train.py:1046] (1/4) Epoch 43, batch 4650, loss[loss=0.1527, simple_loss=0.2309, pruned_loss=0.03723, over 23399.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2329, pruned_loss=0.03723, over 4698482.62 frames. ], batch size: 119, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:02:21,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:02:24,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:02:24,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:24,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:02:24,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:24,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:02:27,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:02:30,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 04:02:32,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:02:34,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 04:02:36,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:02:37,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 04:02:37,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:02:37,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 04:02:37,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 04:02:37,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:39,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:02:43,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 04:02:43,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1518466.6666666667, ans=0.0 2023-10-04 04:02:44,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:44,970 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 04:02:48,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 04:02:50,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1518533.3333333333, ans=0.125 2023-10-04 04:02:51,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:51,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:02:52,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 04:02:52,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:02:55,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:02:59,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:02:59,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1518533.3333333333, ans=0.0 2023-10-04 04:03:05,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:03:07,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:03:08,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:03:09,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:03:11,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 04:03:11,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 04:03:13,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 04:03:13,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 04:03:16,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:23,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:03:23,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:03:23,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 04:03:23,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1518666.6666666667, ans=0.1 2023-10-04 04:03:24,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:26,333 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 2.022e+02 2.193e+02 2.608e+02 3.826e+02, threshold=4.386e+02, percent-clipped=0.0 2023-10-04 04:03:26,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:03:26,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:03:26,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:03:29,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:03:29,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:03:30,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:03:33,370 INFO [train.py:1046] (1/4) Epoch 43, batch 4700, loss[loss=0.1726, simple_loss=0.245, pruned_loss=0.05014, over 19640.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2337, pruned_loss=0.03722, over 4705111.20 frames. 
], batch size: 388, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:03:33,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:33,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:03:33,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:03:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 04:03:35,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:03:36,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 04:03:41,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.08 vs. limit=22.5 2023-10-04 04:03:43,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1518733.3333333333, ans=0.0 2023-10-04 04:03:47,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:48,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:48,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:03:49,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:03:50,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:03:55,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 04:03:56,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 04:03:56,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:57,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:03:57,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:04:02,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:04:08,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:04:08,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 04:04:11,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:04:15,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. 
Duration: 0.68 2023-10-04 04:04:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:04:18,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:21,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1518933.3333333333, ans=0.07 2023-10-04 04:04:22,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 04:04:23,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:04:26,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1518933.3333333333, ans=0.05 2023-10-04 04:04:27,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:04:28,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.93 vs. limit=6.0 2023-10-04 04:04:28,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 04:04:29,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:29,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:04:31,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:04:31,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:04:33,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 04:04:33,305 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 04:04:34,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:04:38,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:38,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:38,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 04:04:40,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:45,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 04:04:48,894 INFO [train.py:1046] (1/4) Epoch 43, batch 4750, loss[loss=0.1311, simple_loss=0.2091, pruned_loss=0.02652, over 24451.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2344, pruned_loss=0.03755, over 4692600.50 frames. 
], batch size: 58, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:04:48,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:04:49,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1519066.6666666667, ans=0.125 2023-10-04 04:04:50,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:04:55,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:04:55,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:04:56,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 04:04:56,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:04:59,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 04:05:01,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:05:02,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:05:03,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:05:04,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1519133.3333333333, ans=0.125 2023-10-04 04:05:07,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 04:05:08,742 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.74 vs. limit=15.0 2023-10-04 04:05:09,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:05:11,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 04:05:12,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:15,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:05:15,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:05:15,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:05:16,072 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.38 vs. limit=10.0 2023-10-04 04:05:17,347 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 04:05:17,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-10-04 04:05:23,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 04:05:24,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.80 vs. limit=15.0 2023-10-04 04:05:26,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:05:27,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:05:30,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:05:30,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 04:05:30,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:05:31,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:05:32,510 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.89 vs. limit=15.0 2023-10-04 04:05:34,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:05:36,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1519266.6666666667, ans=0.125 2023-10-04 04:05:37,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-10-04 04:05:37,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 04:05:38,060 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.32 vs. limit=15.0 2023-10-04 04:05:39,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:05:39,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:05:40,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:05:42,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 04:05:42,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 04:05:44,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-10-04 04:05:45,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 04:05:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:05:50,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:05:50,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-10-04 04:05:50,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:52,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:05:54,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:05:54,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:05:56,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:05:57,481 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.030e+02 2.225e+02 2.486e+02 3.690e+02, threshold=4.449e+02, percent-clipped=0.0 2023-10-04 04:06:00,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:00,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 04:06:01,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 04:06:02,914 INFO [train.py:1046] (1/4) Epoch 43, batch 4800, loss[loss=0.1584, simple_loss=0.2353, pruned_loss=0.04074, over 23353.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2357, pruned_loss=0.03758, over 4712489.02 frames. ], batch size: 119, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:06:02,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. 
Duration: 0.77 2023-10-04 04:06:05,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:06:05,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:07,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 04:06:10,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.15 vs. limit=15.0 2023-10-04 04:06:11,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:13,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:19,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:06:20,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:06:20,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:20,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 04:06:21,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:06:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 04:06:23,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:06:26,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:06:29,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:29,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:06:31,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:31,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:06:31,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:32,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:06:35,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:38,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:39,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:06:39,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:06:40,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 04:06:40,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:44,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 04:06:44,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 04:06:44,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:45,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:06:45,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:06:45,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:06:45,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:06:47,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:06:49,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 04:06:52,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:52,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:06:55,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:06:55,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1519600.0, ans=0.125 2023-10-04 04:07:00,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 04:07:00,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:07:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:01,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:07:02,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:07:03,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1519666.6666666667, ans=0.125 2023-10-04 04:07:04,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:07:05,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 04:07:05,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:07,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:07:08,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:07:09,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:07:11,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:12,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:12,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:07:14,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 04:07:17,521 INFO [train.py:1046] (1/4) Epoch 43, batch 4850, loss[loss=0.1453, simple_loss=0.2393, pruned_loss=0.02563, over 24569.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2366, pruned_loss=0.03819, over 4703610.25 frames. ], batch size: 71, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:07:17,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 04:07:17,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 04:07:17,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:07:17,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:07:17,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:19,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1519733.3333333333, ans=0.125 2023-10-04 04:07:21,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:07:23,846 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:07:27,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 04:07:29,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:32,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:07:33,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-04 04:07:33,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 04:07:34,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:38,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:39,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:07:40,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:07:40,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 04:07:45,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:07:46,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:07:47,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1519866.6666666667, ans=0.125 2023-10-04 04:07:48,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:07:49,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:07:49,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 04:07:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 04:07:52,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:07:55,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:07:55,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 04:07:55,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 04:07:55,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1519866.6666666667, ans=0.1 2023-10-04 04:07:57,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:08:03,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1519933.3333333333, ans=0.0 2023-10-04 04:08:05,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:08:07,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 04:08:07,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:08:08,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:08:10,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 04:08:11,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 04:08:11,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:08:11,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 04:08:11,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:12,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:08:14,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 04:08:24,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:08:24,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1520000.0, ans=0.125 2023-10-04 04:08:26,940 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.023e+02 2.216e+02 2.527e+02 3.488e+02, threshold=4.432e+02, percent-clipped=0.0 2023-10-04 04:08:30,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:08:31,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:08:32,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1520066.6666666667, ans=0.1 2023-10-04 04:08:33,729 INFO [train.py:1046] (1/4) Epoch 43, batch 4900, loss[loss=0.1335, simple_loss=0.1946, pruned_loss=0.03616, over 19238.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.0377, over 4700620.06 frames. ], batch size: 388, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:08:35,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 04:08:35,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:08:40,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:08:42,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:42,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:08:44,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 04:08:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 04:08:55,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 04:08:55,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. 
Duration: 0.91 2023-10-04 04:08:55,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:08:55,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:56,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:08:57,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:08:57,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:08:57,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 04:09:00,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 04:09:00,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:09:01,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:09:03,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:09:03,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1520200.0, ans=0.1 2023-10-04 04:09:05,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 04:09:05,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:06,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:06,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 04:09:07,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:09:09,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:09:09,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 04:09:09,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 04:09:14,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 04:09:16,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:09:18,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:09:18,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:09:19,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:09:19,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 04:09:19,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:09:19,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 04:09:22,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:23,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:09:24,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:09:26,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.77 vs. limit=12.0 2023-10-04 04:09:27,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 04:09:27,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1520266.6666666667, ans=0.05 2023-10-04 04:09:29,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:09:29,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 04:09:29,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-10-04 04:09:37,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:09:38,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:09:38,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 04:09:38,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:09:39,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:09:41,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:42,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1520333.3333333333, ans=0.125 2023-10-04 04:09:45,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:09:45,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:09:45,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:09:45,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. 
Duration: 0.42725 2023-10-04 04:09:46,728 INFO [train.py:1046] (1/4) Epoch 43, batch 4950, loss[loss=0.1651, simple_loss=0.2488, pruned_loss=0.04072, over 23961.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2346, pruned_loss=0.03737, over 4703943.25 frames. ], batch size: 86, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:09:48,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:09:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:09:51,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:09:53,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 04:09:54,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 04:09:54,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:09:55,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 04:09:55,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:55,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:09:57,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 04:09:57,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:09:58,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:58,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:10:00,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:10:02,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1520466.6666666667, ans=0.1 2023-10-04 04:10:03,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:10:06,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:06,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:10:09,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:10:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:14,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 04:10:16,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:17,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:19,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:10:19,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 04:10:21,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 04:10:22,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:24,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:10:25,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:10:26,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:10:26,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:10:26,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:10:29,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:10:31,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:10:33,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:10:34,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:35,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:37,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 04:10:37,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:10:37,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1520600.0, ans=0.5 2023-10-04 04:10:38,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:10:43,594 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=15.0 2023-10-04 04:10:44,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:10:45,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 04:10:45,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:10:45,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:45,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:10:46,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:10:50,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:10:50,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:10:51,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:10:53,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 04:10:54,732 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.929e+02 2.246e+02 2.665e+02 5.151e+02, threshold=4.493e+02, percent-clipped=1.0 2023-10-04 04:10:57,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:00,476 INFO [train.py:1046] (1/4) Epoch 43, batch 5000, loss[loss=0.1416, simple_loss=0.2131, pruned_loss=0.03504, over 23842.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2344, pruned_loss=0.03699, over 4718864.80 frames. 
], batch size: 212, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:11:00,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 04:11:00,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:11:04,645 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.29 vs. limit=22.5 2023-10-04 04:11:04,664 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.69 vs. limit=22.5 2023-10-04 04:11:07,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:11:07,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:11:07,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1520733.3333333333, ans=0.0 2023-10-04 04:11:09,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 04:11:11,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 04:11:12,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:11:15,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-04 04:11:15,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:11:15,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:11:15,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 04:11:15,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:16,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:11:18,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 04:11:18,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:19,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:11:20,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 04:11:22,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 04:11:22,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:11:23,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-10-04 04:11:24,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:11:24,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:24,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:11:24,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 04:11:24,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 04:11:27,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 04:11:27,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:27,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:27,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1520800.0, ans=0.125 2023-10-04 04:11:30,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 04:11:30,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:11:30,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 04:11:30,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1520866.6666666667, ans=0.0 2023-10-04 04:11:32,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:32,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1520866.6666666667, ans=0.5 2023-10-04 04:11:32,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.61 vs. limit=15.0 2023-10-04 04:11:33,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 04:11:34,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 04:11:35,211 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:11:36,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:11:38,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:11:39,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1520866.6666666667, ans=0.0 2023-10-04 04:11:42,406 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 04:11:45,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 04:11:45,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.32 vs. limit=22.5 2023-10-04 04:11:46,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:46,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:11:48,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 04:11:49,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:49,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:11:49,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:11:50,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 04:11:52,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:11:55,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:11:56,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:12:03,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 04:12:07,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:15,126 INFO [train.py:1046] (1/4) Epoch 43, batch 5050, loss[loss=0.1563, simple_loss=0.2323, pruned_loss=0.04018, over 23268.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2348, pruned_loss=0.03713, over 4714957.44 frames. ], batch size: 105, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:12:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:12:16,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:16,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:12:16,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:12:16,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:12:16,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:12:16,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:20,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:12:20,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 04:12:20,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:12:24,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:12:26,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:12:26,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1521066.6666666667, ans=0.125 2023-10-04 04:12:27,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 04:12:28,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:29,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:12:30,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1521133.3333333333, ans=0.2 2023-10-04 04:12:31,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:12:33,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 04:12:34,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:12:42,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 04:12:42,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:12:43,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:12:43,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 04:12:43,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:12:45,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:45,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:45,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:12:45,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 04:12:46,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 04:12:46,857 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.77 vs. 
limit=12.0 2023-10-04 04:12:48,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:12:54,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:54,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 04:12:55,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1521200.0, ans=0.1 2023-10-04 04:12:56,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:12:59,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 04:13:01,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:13:02,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:13:02,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:03,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:13:05,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 04:13:08,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:13:08,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:08,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1521266.6666666667, ans=0.05 2023-10-04 04:13:10,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:13:10,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:13:10,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 04:13:11,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:13:13,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:13:17,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:13:17,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 04:13:17,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:13:19,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:13:19,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:20,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1521333.3333333333, ans=0.125 2023-10-04 04:13:21,183 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 04:13:22,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:13:22,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 04:13:22,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:24,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1521333.3333333333, ans=0.0 2023-10-04 04:13:25,265 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 1.920e+02 2.111e+02 2.330e+02 3.749e+02, threshold=4.222e+02, percent-clipped=0.0 2023-10-04 04:13:28,069 INFO [train.py:1046] (1/4) Epoch 43, batch 5100, loss[loss=0.155, simple_loss=0.2307, pruned_loss=0.0396, over 23791.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2348, pruned_loss=0.03696, over 4727888.68 frames. ], batch size: 179, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:13:28,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:28,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:13:28,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 04:13:30,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 04:13:31,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.22 vs. limit=15.0 2023-10-04 04:13:31,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:31,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:13:32,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:13:36,293 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 04:13:37,015 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.38 vs. limit=15.0 2023-10-04 04:13:37,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:13:39,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 04:13:41,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 04:13:41,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:13:43,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:13:43,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1521466.6666666667, ans=0.125 2023-10-04 04:13:44,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1521466.6666666667, ans=0.2 2023-10-04 04:13:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:13:47,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 04:13:47,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 04:13:51,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:51,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:13:54,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:58,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 04:13:58,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:13:59,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:14:00,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 04:14:02,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:03,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:03,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 04:14:04,821 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 04:14:04,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:06,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 04:14:06,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 04:14:07,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1521533.3333333333, ans=0.125 2023-10-04 04:14:11,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:14:19,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:22,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. 
Duration: 0.5 2023-10-04 04:14:22,292 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 04:14:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 04:14:23,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 04:14:23,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:26,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 04:14:29,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 04:14:32,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 04:14:32,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:14:36,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 04:14:37,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:14:37,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 04:14:43,753 INFO [train.py:1046] (1/4) Epoch 43, batch 5150, loss[loss=0.1687, simple_loss=0.2464, pruned_loss=0.04547, over 23567.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2355, pruned_loss=0.03775, over 4706628.65 frames. 
], batch size: 106, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:14:43,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:14:43,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:14:43,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:14:43,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:14:43,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:14:45,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:14:46,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 04:14:46,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 04:14:46,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 04:14:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:14:48,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. 
Duration: 0.97 2023-10-04 04:14:48,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1521733.3333333333, ans=0.125 2023-10-04 04:14:49,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:49,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 04:14:50,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:14:52,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:14:56,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:14:56,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 04:14:57,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:57,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:14:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:14:59,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:14:59,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:15:01,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:15:01,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:15:03,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 04:15:04,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:15:05,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:15:07,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:15:08,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 04:15:10,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:15:14,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:15:16,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 04:15:20,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:15:24,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:15:25,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:15:29,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:15:30,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:15:33,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 04:15:37,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:15:39,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:15:39,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:15:39,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1521933.3333333333, ans=0.125 2023-10-04 04:15:41,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:15:42,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:15:44,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 04:15:48,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:15:48,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:15:51,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:51,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:15:51,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:15:52,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:15:52,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:15:52,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:15:53,836 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.002e+02 2.172e+02 2.451e+02 4.119e+02, threshold=4.344e+02, percent-clipped=0.0 2023-10-04 04:15:55,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:15:56,710 INFO [train.py:1046] (1/4) Epoch 43, batch 5200, loss[loss=0.1581, simple_loss=0.2328, pruned_loss=0.04163, over 23845.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.236, pruned_loss=0.03812, over 4699550.48 frames. ], batch size: 195, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:15:58,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 04:16:00,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:00,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1522066.6666666667, ans=0.0 2023-10-04 04:16:04,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 04:16:06,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:16:06,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:10,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:10,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:16:12,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:13,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 04:16:16,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:16:18,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:20,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-10-04 04:16:22,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:16:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:16:24,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 04:16:24,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 04:16:26,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 04:16:27,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:27,861 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 04:16:27,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:27,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:29,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:16:29,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 04:16:29,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:16:33,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:33,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1522200.0, ans=0.0 2023-10-04 04:16:34,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 04:16:35,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 04:16:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 04:16:41,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 04:16:41,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:16:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:16:46,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:16:47,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 04:16:48,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:48,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-04 04:16:48,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1522266.6666666667, ans=0.125 2023-10-04 04:16:49,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:49,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:16:53,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:16:54,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:16:58,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:58,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:16:58,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:06,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:17:06,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 04:17:07,039 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:17:08,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 04:17:08,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:17:08,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1522333.3333333333, ans=0.125 2023-10-04 04:17:10,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:11,341 INFO [train.py:1046] (1/4) Epoch 43, batch 5250, loss[loss=0.1572, simple_loss=0.2481, pruned_loss=0.03313, over 24638.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2361, pruned_loss=0.03778, over 4706930.96 frames. ], batch size: 73, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:17:11,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:17:12,008 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=15.0 2023-10-04 04:17:12,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:17:15,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:17:17,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:17:17,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:17:18,433 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.36 vs. 
limit=12.0 2023-10-04 04:17:18,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:17:24,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:17:26,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1522466.6666666667, ans=0.125 2023-10-04 04:17:27,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:17:28,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:17:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:17:30,201 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:17:31,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 04:17:31,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:17:32,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:17:34,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1522466.6666666667, ans=0.0 2023-10-04 04:17:36,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1522466.6666666667, ans=0.0 2023-10-04 04:17:43,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1522533.3333333333, ans=0.0 2023-10-04 04:17:54,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.87 vs. limit=12.0 2023-10-04 04:18:01,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1522600.0, ans=0.0 2023-10-04 04:18:10,028 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.11 vs. limit=15.0 2023-10-04 04:18:11,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=22.5 2023-10-04 04:18:18,268 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.988e+02 2.166e+02 2.434e+02 3.421e+02, threshold=4.331e+02, percent-clipped=0.0 2023-10-04 04:18:19,588 INFO [train.py:1046] (1/4) Epoch 43, batch 5300, loss[loss=0.1374, simple_loss=0.1864, pruned_loss=0.04421, over 19138.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2348, pruned_loss=0.03757, over 4714613.63 frames. ], batch size: 388, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:18:28,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.20 vs. 
limit=15.0 2023-10-04 04:18:33,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:18:33,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 04:18:33,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 04:18:33,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:33,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:34,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:34,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:34,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:34,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:18:34,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:34,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:18:34,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 04:18:34,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 04:18:34,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 04:18:34,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 04:18:34,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:18:34,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 04:18:35,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 04:18:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:35,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:35,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:18:35,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:18:35,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:18:35,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 04:18:35,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:35,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:36,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:18:36,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:36,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:18:36,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:36,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:18:37,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 04:18:37,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:18:37,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:37,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 04:18:37,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. 
Duration: 0.63 2023-10-04 04:18:37,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:18:37,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:18:37,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 04:18:37,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 04:18:37,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:18:38,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:18:38,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:18:38,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 04:18:38,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 04:18:38,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:18:38,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:39,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. 
Duration: 0.83 2023-10-04 04:18:39,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 04:18:39,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 04:18:39,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:18:43,403 INFO [train.py:1046] (1/4) Epoch 44, batch 0, loss[loss=0.153, simple_loss=0.2315, pruned_loss=0.03723, over 23641.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2315, pruned_loss=0.03723, over 23641.00 frames. ], batch size: 149, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:18:43,404 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 04:18:55,641 INFO [train.py:1078] (1/4) Epoch 44, validation: loss=0.3443, simple_loss=0.2733, pruned_loss=0.2076, over 1125622.00 frames. 2023-10-04 04:18:55,641 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 04:18:58,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 04:18:58,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:19:00,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:19:06,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:06,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 04:19:06,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:08,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 04:19:08,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 04:19:09,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:11,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:15,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:15,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:16,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:19:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:19:18,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 04:19:19,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 04:19:20,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1522880.0, ans=0.0 2023-10-04 04:19:20,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1522880.0, ans=0.0 2023-10-04 04:19:28,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:19:28,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:31,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 04:19:34,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:19:34,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:19:37,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:19:41,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:19:45,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:19:47,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1523013.3333333333, ans=0.125 2023-10-04 04:19:51,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-04 04:19:54,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 04:19:54,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:19:54,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:19:54,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:19:55,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:57,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 04:19:59,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:19:59,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:20:04,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:20:06,382 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 04:20:07,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 04:20:07,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1523080.0, ans=0.125 2023-10-04 04:20:10,412 INFO [train.py:1046] (1/4) Epoch 44, batch 50, loss[loss=0.1598, simple_loss=0.2476, pruned_loss=0.036, over 23747.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2376, pruned_loss=0.03864, over 1064715.74 frames. ], batch size: 85, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:20:10,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:20:13,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:20:13,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 04:20:13,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1523146.6666666667, ans=0.1 2023-10-04 04:20:14,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:20:14,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:20:15,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:20:17,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:20:19,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1523146.6666666667, ans=0.1 2023-10-04 04:20:20,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1523146.6666666667, ans=0.125 2023-10-04 04:20:21,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:20:25,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.86 vs. limit=22.5 2023-10-04 04:20:26,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 04:20:26,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:27,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1523213.3333333333, ans=0.125 2023-10-04 04:20:31,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:20:33,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 04:20:35,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 04:20:36,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 04:20:38,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:20:38,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:41,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:20:43,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:20:43,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:20:43,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:44,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1523280.0, ans=0.125 2023-10-04 04:20:51,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:20:52,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:20:52,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:20:53,362 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.83 vs. 
limit=15.0 2023-10-04 04:20:54,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 04:20:55,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.78 vs. limit=15.0 2023-10-04 04:20:56,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:20:57,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:20:57,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 04:20:57,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:20:59,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 04:20:59,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.15 vs. limit=12.0 2023-10-04 04:21:03,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1523346.6666666667, ans=0.1 2023-10-04 04:21:04,388 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.994e+02 2.209e+02 2.588e+02 5.609e+02, threshold=4.418e+02, percent-clipped=8.0 2023-10-04 04:21:07,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:21:07,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:21:09,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:09,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:21:09,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:21:12,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 04:21:12,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 04:21:13,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:13,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:21:15,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:21:16,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:21:16,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 04:21:16,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-10-04 04:21:17,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 04:21:18,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:19,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:21:19,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 04:21:19,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 04:21:20,260 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.30 vs. limit=15.0 2023-10-04 04:21:20,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:20,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:21:22,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:21:22,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:21:23,645 INFO [train.py:1046] (1/4) Epoch 44, batch 100, loss[loss=0.1584, simple_loss=0.2394, pruned_loss=0.03875, over 24464.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2372, pruned_loss=0.03685, over 1876967.72 frames. 
], batch size: 63, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:21:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:21:28,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1523480.0, ans=0.95 2023-10-04 04:21:30,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:21:33,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:21:34,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 04:21:34,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:38,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:21:38,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:21:38,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:21:38,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:21:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 04:21:41,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 04:21:43,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:21:43,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:43,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:43,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:21:45,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 04:21:47,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:47,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:47,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1523546.6666666667, ans=0.0 2023-10-04 04:21:48,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:21:51,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:21:54,284 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. 
Duration: 0.768 2023-10-04 04:21:55,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 04:21:55,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:21:55,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:21:55,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1523613.3333333333, ans=0.2 2023-10-04 04:21:59,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:22:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:22:02,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:06,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:06,647 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 04:22:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-10-04 04:22:10,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1523680.0, ans=0.125 2023-10-04 04:22:11,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1523680.0, ans=0.04949747468305833 2023-10-04 04:22:12,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:22:14,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:22:16,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:20,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:23,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:22:25,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:22:26,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:26,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1523746.6666666667, ans=0.125 2023-10-04 04:22:27,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:22:29,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:29,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:22:30,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 04:22:31,060 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 04:22:31,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:32,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:22:33,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:33,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:33,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 04:22:34,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:22:35,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 04:22:35,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:35,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:37,001 INFO [train.py:1046] (1/4) Epoch 44, batch 150, loss[loss=0.1491, simple_loss=0.2401, pruned_loss=0.02909, over 24475.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2378, pruned_loss=0.03782, over 2502989.54 frames. ], batch size: 69, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:22:37,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:37,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:22:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:22:39,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:42,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:22:42,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:22:43,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:44,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:22:44,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:48,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:22:50,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:52,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 04:22:52,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 04:22:52,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 04:22:55,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:22:55,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:22:58,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:22:59,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:59,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:59,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:22:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:23:03,030 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 04:23:04,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:23:10,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:23:12,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1523946.6666666667, ans=0.125 2023-10-04 04:23:14,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:23:15,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 04:23:15,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1523946.6666666667, ans=0.0 2023-10-04 04:23:18,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:23:18,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:23:18,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:23:20,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 04:23:21,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:23:21,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:23:22,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:24,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 04:23:28,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:29,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:23:31,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:23:31,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:23:32,512 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.030e+02 2.255e+02 2.508e+02 4.097e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-04 04:23:32,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:23:32,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1524013.3333333333, ans=0.0 2023-10-04 04:23:34,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 04:23:38,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:23:39,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:23:41,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:23:42,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.43 vs. limit=15.0 2023-10-04 04:23:44,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:23:44,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 04:23:44,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:23:44,591 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 04:23:47,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:23:51,246 INFO [train.py:1046] (1/4) Epoch 44, batch 200, loss[loss=0.1688, simple_loss=0.2431, pruned_loss=0.04724, over 23594.00 frames. 
], tot_loss[loss=0.1584, simple_loss=0.2388, pruned_loss=0.03895, over 2990916.39 frames. ], batch size: 256, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:23:52,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:23:52,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:23:56,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 04:23:56,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:23:56,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:23:58,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 04:23:58,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:24:01,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:01,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:03,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1524146.6666666667, ans=0.125 2023-10-04 04:24:05,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 04:24:05,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:24:05,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:09,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.01 vs. limit=22.5 2023-10-04 04:24:21,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1524280.0, ans=0.125 2023-10-04 04:24:22,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:24:22,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:24:22,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:24:23,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:24:23,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 04:24:25,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:24:26,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:24:27,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:24:29,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:24:29,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:24:30,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 04:24:30,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:24:30,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:36,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:24:37,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1524346.6666666667, ans=0.125 2023-10-04 04:24:41,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:24:47,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:48,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 04:24:54,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:57,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 04:24:58,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:58,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:24:58,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:25:00,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:25:00,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1524413.3333333333, ans=0.125 2023-10-04 04:25:01,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 04:25:02,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:02,019 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 04:25:04,611 INFO [train.py:1046] (1/4) Epoch 44, batch 250, loss[loss=0.1369, simple_loss=0.2007, pruned_loss=0.03651, over 23390.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2375, pruned_loss=0.03889, over 3358771.94 frames. 
], batch size: 285, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:25:04,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:25:08,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:08,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:25:08,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1524480.0, ans=0.125 2023-10-04 04:25:10,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:25:10,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:12,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:25:15,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:25:22,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:25:26,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:25:26,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:25:32,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:25:32,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:25:32,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:25:33,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:25:34,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:25:34,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:25:35,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:25:37,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:25:38,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 04:25:38,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:25:40,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 04:25:42,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:25:42,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:25:43,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:25:43,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1524613.3333333333, ans=0.125 2023-10-04 04:25:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:25:44,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:25:48,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:50,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:25:50,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:25:54,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:25:58,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:25:59,830 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.036e+02 2.263e+02 2.637e+02 3.971e+02, threshold=4.526e+02, percent-clipped=0.0 2023-10-04 04:26:02,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:26:04,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1524746.6666666667, ans=0.0 2023-10-04 04:26:05,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:26:07,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:26:09,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 04:26:10,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:26:10,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:26:12,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 04:26:12,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:26:13,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:26:13,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-10-04 04:26:17,854 INFO [train.py:1046] (1/4) Epoch 44, batch 300, loss[loss=0.1468, simple_loss=0.2311, pruned_loss=0.03123, over 24471.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2352, pruned_loss=0.03807, over 3652483.77 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:26:17,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:26:19,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:26:23,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:26:23,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 04:26:24,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:26:26,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:26:26,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 04:26:27,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:26:31,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:26:35,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 04:26:35,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 04:26:42,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 04:26:42,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:26:45,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:26:45,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:26:45,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 04:26:45,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:26:47,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:26:50,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:26:50,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:26:55,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:26:55,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-04 04:26:57,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:26:57,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1524946.6666666667, ans=0.125 2023-10-04 04:26:58,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:00,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 04:27:00,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:00,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1525013.3333333333, ans=0.1 2023-10-04 04:27:05,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:27:08,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:27:08,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 04:27:12,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:12,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 04:27:16,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:17,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:27:18,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 04:27:19,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:27:19,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:21,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 04:27:23,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:23,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:24,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=15.0 2023-10-04 04:27:25,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:27:25,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:27:26,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:29,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:27:29,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 04:27:31,118 INFO [train.py:1046] (1/4) Epoch 44, batch 350, loss[loss=0.1727, simple_loss=0.2601, pruned_loss=0.04262, over 24414.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2348, pruned_loss=0.03746, over 3894681.76 frames. ], batch size: 77, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:27:33,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:37,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:27:39,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:39,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:39,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1525146.6666666667, ans=0.1 2023-10-04 04:27:42,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 04:27:45,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:27:45,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 04:27:49,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:49,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 04:27:50,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:53,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 04:27:54,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:27:57,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:57,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:27:59,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:27:59,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:28:00,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:28:00,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:28:03,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:28:03,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:28:03,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:09,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1525280.0, ans=0.0 2023-10-04 04:28:10,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:28:10,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:28:11,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:28:11,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:12,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:16,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 04:28:16,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:28:21,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:21,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:21,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:28:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 04:28:25,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:27,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 04:28:28,275 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.956e+02 2.278e+02 2.648e+02 3.683e+02, threshold=4.557e+02, percent-clipped=0.0 2023-10-04 04:28:28,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 04:28:28,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:32,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:28:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 04:28:35,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:28:36,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:28:38,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:39,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:39,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:42,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:42,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1525480.0, ans=0.2 2023-10-04 04:28:43,770 INFO [train.py:1046] (1/4) Epoch 44, batch 400, loss[loss=0.1501, simple_loss=0.2363, pruned_loss=0.03194, over 24645.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2339, pruned_loss=0.03709, over 4076684.30 frames. ], batch size: 68, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:28:45,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:28:49,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:28:50,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 04:28:50,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:28:50,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:28:52,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:28:52,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:28:55,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:56,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:28:59,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 04:29:00,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 04:29:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:29:03,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 04:29:03,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:29:07,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:29:07,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:29:07,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 04:29:07,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:29:08,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1525546.6666666667, ans=0.125 2023-10-04 04:29:09,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:29:09,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:10,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:29:11,957 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 04:29:13,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 04:29:17,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:29:19,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:29:19,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 04:29:22,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-10-04 04:29:24,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:29:27,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:29:29,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1525680.0, ans=0.125 2023-10-04 04:29:32,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 04:29:35,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:29:36,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 04:29:38,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:40,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:29:40,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 04:29:43,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:29:44,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:29:47,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 04:29:51,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:29:51,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 04:29:53,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:29:54,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 04:29:55,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:29:55,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:29:59,135 INFO [train.py:1046] (1/4) Epoch 44, batch 450, loss[loss=0.1518, simple_loss=0.2314, pruned_loss=0.03609, over 23649.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2347, pruned_loss=0.03719, over 4231542.73 frames. ], batch size: 232, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:29:59,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 04:30:00,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:30:02,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:30:02,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 04:30:03,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 04:30:03,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:30:04,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:30:04,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:30:06,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 04:30:06,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:30:07,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:30:09,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:30:13,884 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.74 vs. limit=6.0 2023-10-04 04:30:16,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:17,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:30:17,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. 
Duration: 0.65 2023-10-04 04:30:18,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 04:30:22,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:30:25,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:25,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1525880.0, ans=0.125 2023-10-04 04:30:27,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:30:33,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:30:33,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:30:35,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 04:30:37,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 04:30:38,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 04:30:39,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:30:39,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:30:40,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=1525946.6666666667, ans=15.0 2023-10-04 04:30:41,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:30:42,755 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 04:30:42,763 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 04:30:42,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:45,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:30:45,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 04:30:48,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:30:49,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:30:49,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 04:30:51,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 04:30:53,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 04:30:56,704 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.766e+02 2.172e+02 2.484e+02 2.956e+02 4.900e+02, threshold=4.968e+02, percent-clipped=2.0 2023-10-04 04:30:56,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:30:56,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:30:57,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.57 vs. limit=15.0 2023-10-04 04:30:58,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 04:31:01,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:31:01,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 04:31:04,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 04:31:04,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:31:07,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1526080.0, ans=0.1 2023-10-04 04:31:07,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.16 vs. 
limit=6.0 2023-10-04 04:31:10,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:31:10,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.49 vs. limit=10.0 2023-10-04 04:31:12,728 INFO [train.py:1046] (1/4) Epoch 44, batch 500, loss[loss=0.1419, simple_loss=0.2194, pruned_loss=0.03223, over 24603.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03728, over 4337767.45 frames. ], batch size: 60, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:31:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:31:12,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:31:14,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 04:31:17,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:31:18,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:31:18,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:31:18,534 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 04:31:19,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-10-04 04:31:19,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:31:24,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:31:27,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.41 vs. limit=10.0 2023-10-04 04:31:28,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:31:30,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:31:32,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:31:32,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:31:33,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:40,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1526213.3333333333, ans=0.1 2023-10-04 04:31:41,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:42,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 04:31:42,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:31:42,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:42,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 04:31:42,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:31:44,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1526280.0, ans=0.125 2023-10-04 04:31:45,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:31:46,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:31:46,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:31:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:48,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 04:31:50,900 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 04:31:56,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:31:57,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:58,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:58,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:32:00,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:32:02,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 04:32:02,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1526346.6666666667, ans=0.125 2023-10-04 04:32:05,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:32:06,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:08,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:09,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:32:15,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:32:15,769 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=22.5 2023-10-04 04:32:17,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 04:32:17,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:18,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:32:21,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 04:32:22,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:32:24,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:27,414 INFO [train.py:1046] (1/4) Epoch 44, batch 550, loss[loss=0.1653, simple_loss=0.2368, pruned_loss=0.04692, over 23875.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2359, pruned_loss=0.03776, over 4423236.31 frames. ], batch size: 179, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:32:29,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1526480.0, ans=0.0 2023-10-04 04:32:30,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 04:32:33,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. 
Duration: 0.98 2023-10-04 04:32:33,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 04:32:34,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.29 vs. limit=22.5 2023-10-04 04:32:35,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:32:35,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:35,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:36,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:36,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:32:37,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:32:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:40,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 04:32:40,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 04:32:46,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:32:46,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:48,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:32:50,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:53,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 04:32:54,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 04:32:55,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1526613.3333333333, ans=0.125 2023-10-04 04:32:56,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:33:02,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:33:02,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:33:04,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:33:07,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:33:07,682 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 04:33:07,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:33:09,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 04:33:12,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:33:13,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:33:13,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:33:13,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:15,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 04:33:17,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 04:33:17,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:17,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:33:18,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 04:33:18,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:33:23,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:33:23,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:33:25,077 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.940e+02 2.267e+02 2.484e+02 4.509e+02, threshold=4.533e+02, percent-clipped=0.0 2023-10-04 04:33:26,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:33:27,299 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-10-04 04:33:27,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:27,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 04:33:28,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:33:30,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:31,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 04:33:31,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:31,641 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:33:32,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:33:32,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 04:33:39,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 04:33:42,071 INFO [train.py:1046] (1/4) Epoch 44, batch 600, loss[loss=0.1929, simple_loss=0.2631, pruned_loss=0.06135, over 19518.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2363, pruned_loss=0.03799, over 4483085.28 frames. ], batch size: 388, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:33:43,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 04:33:43,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:33:43,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:33:44,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:50,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:33:52,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:33:56,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 04:33:57,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:33:58,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:34:01,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:04,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 04:34:04,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:34:10,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 04:34:12,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:34:12,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:12,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:34:17,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:34:19,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:34:19,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:34:26,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:34:27,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1527013.3333333333, ans=0.5 2023-10-04 04:34:28,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.88 vs. limit=15.0 2023-10-04 04:34:31,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:34:31,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:34:31,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:38,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 04:34:43,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:34:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:34:46,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 04:34:47,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:34:49,550 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.32 vs. limit=15.0 2023-10-04 04:34:50,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 04:34:50,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:34:51,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:34:51,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1527080.0, ans=0.0 2023-10-04 04:34:52,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.59 vs. limit=15.0 2023-10-04 04:34:53,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1527080.0, ans=0.2 2023-10-04 04:34:55,829 INFO [train.py:1046] (1/4) Epoch 44, batch 650, loss[loss=0.1638, simple_loss=0.2474, pruned_loss=0.04014, over 23751.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03777, over 4522119.44 frames. ], batch size: 85, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:34:57,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-04 04:34:58,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:35:00,659 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.22 vs. limit=15.0 2023-10-04 04:35:01,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:35:01,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:35:04,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:05,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 04:35:07,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:35:13,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:35:13,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:13,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1527213.3333333333, ans=0.125 2023-10-04 04:35:16,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:35:16,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1527213.3333333333, ans=0.125 2023-10-04 04:35:19,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1527213.3333333333, ans=0.05 2023-10-04 04:35:20,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 04:35:21,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:35:23,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:25,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:35:25,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 04:35:29,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:29,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:29,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1527280.0, ans=0.125 2023-10-04 04:35:31,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 04:35:31,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:32,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:35:34,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:35:34,429 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 04:35:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:35,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:35:36,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1527280.0, ans=0.1 2023-10-04 04:35:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:39,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:35:39,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:35:39,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 04:35:41,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1527346.6666666667, ans=0.2 2023-10-04 04:35:43,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 04:35:43,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:35:44,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:35:44,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:35:44,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:35:45,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:35:47,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 04:35:48,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 04:35:48,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:48,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:35:48,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 04:35:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:35:51,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:54,090 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.998e+02 2.208e+02 2.441e+02 4.854e+02, threshold=4.416e+02, percent-clipped=2.0 2023-10-04 04:35:57,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:57,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:35:59,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:36:00,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1527413.3333333333, ans=0.125 2023-10-04 04:36:02,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:36:02,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 04:36:04,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 04:36:06,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1527413.3333333333, ans=0.125 2023-10-04 04:36:09,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:36:09,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:09,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:36:09,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:10,812 INFO [train.py:1046] (1/4) Epoch 44, batch 700, loss[loss=0.1488, simple_loss=0.2346, pruned_loss=0.03145, over 23280.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.234, pruned_loss=0.03731, over 4563293.39 frames. ], batch size: 93, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:36:14,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 04:36:15,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 04:36:18,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 04:36:18,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:21,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:36:23,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 04:36:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:36:31,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:36:32,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:33,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:36:33,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:36:36,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:36,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1527546.6666666667, ans=0.125 2023-10-04 04:36:37,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 04:36:37,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:36:39,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 04:36:40,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-04 04:36:46,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:36:46,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:36:49,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:36:52,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:36:52,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 04:36:55,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1527680.0, ans=0.1 2023-10-04 04:36:56,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:58,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:36:58,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 04:37:01,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.09 vs. limit=12.0 2023-10-04 04:37:03,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:37:05,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:37:07,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:13,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:37:13,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 04:37:16,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 04:37:16,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 04:37:19,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:21,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:37:21,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:37:23,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:23,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 04:37:25,160 INFO [train.py:1046] (1/4) Epoch 44, batch 750, loss[loss=0.1663, simple_loss=0.2409, pruned_loss=0.0459, over 23737.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2332, pruned_loss=0.037, over 4596811.04 frames. 
], batch size: 232, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:37:27,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 04:37:27,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 04:37:28,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 04:37:29,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 04:37:29,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 04:37:31,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:37:33,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 04:37:34,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:34,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:37:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:37:37,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:37,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 04:37:37,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:37:39,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1527880.0, ans=0.125 2023-10-04 04:37:40,040 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.77 vs. limit=12.0 2023-10-04 04:37:40,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:37:40,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:37:42,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:37:45,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:37:45,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:46,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 04:37:48,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:37:49,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:37:52,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:54,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:37:55,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 04:37:55,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:37:56,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 04:37:56,541 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 04:37:57,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 04:37:57,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:37:57,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:37:58,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1527946.6666666667, ans=0.125 2023-10-04 04:38:00,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:38:06,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 04:38:08,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:08,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:38:10,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:38:11,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:11,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 04:38:13,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:38:13,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 04:38:14,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:38:16,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.95 vs. limit=15.0 2023-10-04 04:38:17,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:38:17,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 04:38:19,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:38:23,162 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.912e+02 2.188e+02 2.458e+02 4.051e+02, threshold=4.377e+02, percent-clipped=0.0 2023-10-04 04:38:24,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:38:26,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:38:26,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:38:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:38:29,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1528080.0, ans=0.0 2023-10-04 04:38:32,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 04:38:33,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:38:33,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:38:37,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:38:37,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:38:40,337 INFO [train.py:1046] (1/4) Epoch 44, batch 800, loss[loss=0.1532, simple_loss=0.2435, pruned_loss=0.03147, over 24458.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2337, pruned_loss=0.03695, over 4627701.74 frames. ], batch size: 66, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:38:40,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:40,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:38:42,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.68 vs. limit=5.0 2023-10-04 04:38:47,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:47,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:50,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:38:50,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:38:51,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:51,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:38:53,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:38:56,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:38:57,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:38:58,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 04:39:00,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:01,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:39:02,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:39:02,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:39:03,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 04:39:03,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:39:04,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 04:39:08,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:39:08,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1528280.0, ans=0.0 2023-10-04 04:39:09,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:39:09,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1528280.0, ans=0.2 2023-10-04 04:39:12,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:39:12,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:39:14,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:14,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:18,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:39:18,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:39:18,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 04:39:22,013 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 04:39:22,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. 
Duration: 0.57 2023-10-04 04:39:22,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:39:22,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:39:24,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:24,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:39:27,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1528346.6666666667, ans=0.0 2023-10-04 04:39:30,120 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 04:39:30,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 04:39:31,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:39:34,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:39:35,417 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.64 vs. limit=15.0 2023-10-04 04:39:38,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 04:39:38,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1528413.3333333333, ans=0.1 2023-10-04 04:39:41,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:42,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 04:39:44,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:39:46,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 04:39:47,132 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:39:52,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:39:53,701 INFO [train.py:1046] (1/4) Epoch 44, batch 850, loss[loss=0.1664, simple_loss=0.2417, pruned_loss=0.04553, over 23689.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2344, pruned_loss=0.03725, over 4644334.75 frames. ], batch size: 232, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:39:55,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:39:55,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 04:39:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 04:39:58,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 04:39:59,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:01,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:40:02,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:04,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:40:06,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:40:06,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 04:40:07,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 04:40:07,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 04:40:08,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:40:10,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:40:12,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:12,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:40:12,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:40:16,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:16,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:16,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 04:40:20,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 04:40:22,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1528613.3333333333, ans=0.125 2023-10-04 04:40:22,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1528613.3333333333, ans=0.125 2023-10-04 04:40:23,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:25,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 04:40:27,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. 
Duration: 0.62 2023-10-04 04:40:29,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 04:40:32,613 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 04:40:32,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:40:32,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:40:32,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 04:40:34,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:37,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:37,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 04:40:39,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:40:40,851 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.32 vs. limit=22.5 2023-10-04 04:40:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:41,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 04:40:41,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:40:44,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:40:44,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1528680.0, ans=0.0 2023-10-04 04:40:45,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:40:45,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 04:40:49,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:40:49,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:40:50,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:40:50,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:40:52,178 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 1.992e+02 2.228e+02 2.493e+02 3.552e+02, threshold=4.457e+02, percent-clipped=0.0 2023-10-04 04:40:52,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:53,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:40:56,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:40:56,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1528746.6666666667, ans=0.2 2023-10-04 04:40:57,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:40:59,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:40:59,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:41:05,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:41:08,427 INFO [train.py:1046] (1/4) Epoch 44, batch 900, loss[loss=0.1679, simple_loss=0.2412, pruned_loss=0.04724, over 23495.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2355, pruned_loss=0.03786, over 4652792.78 frames. ], batch size: 285, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:41:08,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:41:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 04:41:08,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 04:41:08,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1528813.3333333333, ans=0.125 2023-10-04 04:41:09,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:41:11,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 04:41:17,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:41:20,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:41:20,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 04:41:23,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:41:24,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 04:41:24,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 04:41:26,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:41:26,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:41:27,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 04:41:27,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:41:35,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:41:35,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:41:36,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:41:39,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:41:43,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 04:41:46,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:41:49,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:41:50,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:41:51,686 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 04:41:53,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 04:41:58,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 04:41:58,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:42:00,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:42:06,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1529080.0, ans=0.0 2023-10-04 04:42:06,722 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.34 vs. limit=15.0 2023-10-04 04:42:07,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:07,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:08,291 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.14 vs. limit=22.5 2023-10-04 04:42:08,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 04:42:08,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:42:12,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 04:42:13,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:42:15,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:42:17,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:42:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 04:42:21,763 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 04:42:23,059 INFO [train.py:1046] (1/4) Epoch 44, batch 950, loss[loss=0.1734, simple_loss=0.2563, pruned_loss=0.04531, over 23973.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2366, pruned_loss=0.038, over 4661438.30 frames. ], batch size: 80, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:42:23,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 04:42:23,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 04:42:24,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:26,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 04:42:33,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:42:36,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:42:37,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:37,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:42:39,050 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 04:42:41,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1529213.3333333333, ans=0.035 2023-10-04 04:42:42,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:42,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1529213.3333333333, ans=0.0 2023-10-04 04:42:44,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:42:45,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:42:45,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:42:45,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 04:42:48,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:42:48,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:42:50,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 04:42:50,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:54,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:54,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:54,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:55,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 04:42:58,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:43:00,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:43:01,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:43:06,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:43:06,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:43:09,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. 
Duration: 0.47 2023-10-04 04:43:12,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 04:43:12,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:43:12,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:14,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:14,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:43:18,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 04:43:19,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:43:20,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:21,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1529346.6666666667, ans=0.125 2023-10-04 04:43:22,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:22,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. 
Duration: 0.8 2023-10-04 04:43:22,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1529413.3333333333, ans=0.125 2023-10-04 04:43:23,473 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.090e+02 2.361e+02 2.661e+02 3.886e+02, threshold=4.722e+02, percent-clipped=0.0 2023-10-04 04:43:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:43:23,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:43:24,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 04:43:25,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.55 vs. limit=22.5 2023-10-04 04:43:27,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:43:30,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:43:33,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:43:34,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 04:43:34,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-10-04 04:43:37,336 INFO [train.py:1046] (1/4) Epoch 44, batch 1000, loss[loss=0.164, simple_loss=0.2435, pruned_loss=0.0422, over 24462.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2362, pruned_loss=0.03799, over 4679611.79 frames. ], batch size: 66, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:43:39,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:41,338 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.03 vs. limit=15.0 2023-10-04 04:43:42,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 04:43:44,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:43:46,923 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=22.5 2023-10-04 04:43:47,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:43:47,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1529480.0, ans=0.1 2023-10-04 04:43:47,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1529480.0, ans=0.125 2023-10-04 04:43:49,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 04:43:49,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. 
Duration: 0.93 2023-10-04 04:43:54,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:43:54,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:43:55,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:57,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 04:44:01,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 04:44:03,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 04:44:04,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:05,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 04:44:07,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 04:44:07,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 04:44:08,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:08,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 04:44:10,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1529613.3333333333, ans=0.0 2023-10-04 04:44:17,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:44:17,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:44:18,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:20,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:20,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 04:44:20,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:21,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:44:21,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:44:21,784 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 04:44:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. 
Duration: 0.89 2023-10-04 04:44:24,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1529680.0, ans=0.0 2023-10-04 04:44:25,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 04:44:28,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 04:44:29,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:44:36,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:36,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:44:36,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:38,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:44:40,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 04:44:41,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:44:42,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 04:44:42,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. 
Duration: 0.66 2023-10-04 04:44:44,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:44:44,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:45,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:44:45,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1529746.6666666667, ans=0.1 2023-10-04 04:44:45,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1529746.6666666667, ans=0.125 2023-10-04 04:44:45,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1529746.6666666667, ans=0.125 2023-10-04 04:44:48,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:44:50,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:52,204 INFO [train.py:1046] (1/4) Epoch 44, batch 1050, loss[loss=0.1458, simple_loss=0.2196, pruned_loss=0.036, over 23429.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2345, pruned_loss=0.03747, over 4677184.46 frames. ], batch size: 285, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:44:52,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:44:53,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 04:44:55,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:44:56,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:59,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:45:01,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:45:03,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:45:06,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:45:07,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:45:07,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:45:08,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:45:08,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 04:45:10,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 04:45:10,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1529880.0, ans=0.2 2023-10-04 04:45:11,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 04:45:13,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:45:13,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 04:45:13,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:45:16,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1529880.0, ans=0.125 2023-10-04 04:45:21,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:45:21,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:45:21,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:45:25,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 04:45:25,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 04:45:25,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 04:45:29,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 04:45:32,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 04:45:32,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:45:34,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1530013.3333333333, ans=0.0 2023-10-04 04:45:36,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 04:45:37,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 04:45:37,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:45:39,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:45:42,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:45:44,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1530013.3333333333, ans=0.125 2023-10-04 04:45:45,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 04:45:47,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-10-04 04:45:49,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 04:45:49,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:45:49,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:45:50,481 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.023e+02 2.342e+02 2.796e+02 4.637e+02, threshold=4.684e+02, percent-clipped=0.0 2023-10-04 04:45:50,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 04:45:52,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1530080.0, ans=0.125 2023-10-04 04:45:53,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1530080.0, ans=0.125 2023-10-04 04:45:54,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:45:56,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:45:56,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:45:56,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:45:56,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:46:01,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:01,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 04:46:01,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:46:03,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 04:46:03,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 04:46:04,598 INFO [train.py:1046] (1/4) Epoch 44, batch 1100, loss[loss=0.1369, simple_loss=0.2245, pruned_loss=0.02469, over 15540.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.234, pruned_loss=0.03695, over 4683857.13 frames. ], batch size: 33, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:46:04,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:46:07,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:46:11,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:46:18,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:46:19,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 04:46:19,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:46:19,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 04:46:21,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:46:23,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:46:27,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:46:29,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:46:30,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 04:46:31,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 04:46:32,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:46:32,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:46:35,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:46:36,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 04:46:37,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.01 vs. limit=15.0 2023-10-04 04:46:41,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:46:44,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 04:46:45,055 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 04:46:46,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:48,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:49,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:46:50,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:46:52,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 04:46:53,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:46:53,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:46:53,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:46:53,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:53,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 04:47:00,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:47:00,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 04:47:01,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:47:04,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:47:07,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 04:47:07,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:47:09,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:10,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:11,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:47:13,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. 
Duration: 0.86 2023-10-04 04:47:13,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:47:14,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:47:15,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 04:47:16,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:47:16,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 04:47:18,680 INFO [train.py:1046] (1/4) Epoch 44, batch 1150, loss[loss=0.1669, simple_loss=0.2503, pruned_loss=0.04178, over 23443.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2344, pruned_loss=0.03737, over 4685888.24 frames. ], batch size: 106, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:47:18,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:47:18,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:47:20,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:47:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:47:25,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1530480.0, ans=0.125 2023-10-04 04:47:26,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:47:26,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1530480.0, ans=0.125 2023-10-04 04:47:27,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:29,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:47:29,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 04:47:30,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:47:33,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 04:47:34,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:34,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:47:38,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 04:47:40,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:47:45,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:45,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:47:46,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 04:47:46,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:47:46,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:47:48,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1530613.3333333333, ans=0.0 2023-10-04 04:47:49,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 04:47:51,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:53,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:48:01,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:48:08,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:48:08,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. 
Duration: 0.86 2023-10-04 04:48:08,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:09,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:11,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1530680.0, ans=0.2 2023-10-04 04:48:11,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1530680.0, ans=0.2 2023-10-04 04:48:14,050 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 04:48:16,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:17,278 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 1.987e+02 2.114e+02 2.395e+02 3.524e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-04 04:48:22,988 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 04:48:27,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:48:29,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:48:29,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:48:29,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 04:48:31,266 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.67 vs. limit=15.0 2023-10-04 04:48:31,941 INFO [train.py:1046] (1/4) Epoch 44, batch 1200, loss[loss=0.1594, simple_loss=0.2477, pruned_loss=0.03557, over 24524.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.0376, over 4698046.30 frames. ], batch size: 71, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:48:32,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:48:32,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1530813.3333333333, ans=0.1 2023-10-04 04:48:35,454 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.72 vs. limit=15.0 2023-10-04 04:48:37,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:48:37,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:48:39,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:48:39,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:48:39,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 04:48:41,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:48:44,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:48:45,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:48:45,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:46,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1530880.0, ans=0.125 2023-10-04 04:48:48,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1530880.0, ans=0.0 2023-10-04 04:48:49,323 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 04:48:51,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 04:48:52,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1530880.0, ans=0.2 2023-10-04 04:48:55,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:48:58,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:49:00,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:49:01,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:49:01,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 04:49:02,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:49:07,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1530946.6666666667, ans=0.125 2023-10-04 04:49:11,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:49:11,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:49:11,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 04:49:12,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:49:15,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 04:49:15,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1531013.3333333333, ans=0.125 2023-10-04 04:49:18,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 04:49:18,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:49:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:49:20,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:49:22,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:49:22,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:49:23,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:49:23,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:49:24,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 04:49:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:49:25,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:49:26,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 04:49:30,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:49:30,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:49:33,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:49:36,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:49:36,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 04:49:40,820 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 04:49:43,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:49:44,775 INFO [train.py:1046] (1/4) Epoch 44, batch 1250, loss[loss=0.1596, simple_loss=0.2374, pruned_loss=0.04095, over 23819.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2367, pruned_loss=0.03827, over 4707280.04 frames. ], batch size: 164, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:49:44,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:49:46,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:49:48,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:49:51,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-10-04 04:49:54,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1531146.6666666667, ans=0.125 2023-10-04 04:49:55,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:49:56,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:49:57,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 04:49:58,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:50:00,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:50:00,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1531213.3333333333, ans=0.05 2023-10-04 04:50:03,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:50:04,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:50:06,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:50:06,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:50:07,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:50:10,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:50:11,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:50:11,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:50:12,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:50:13,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:16,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:17,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:50:21,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 04:50:22,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:50:25,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:50:27,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. 
Duration: 0.85 2023-10-04 04:50:27,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:50:27,198 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 04:50:28,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:28,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:32,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:33,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:33,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:50:35,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 04:50:35,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 04:50:37,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 04:50:40,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:50:41,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. 
Duration: 0.88 2023-10-04 04:50:41,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:43,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.33 vs. limit=15.0 2023-10-04 04:50:45,652 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.263e+02 2.727e+02 4.243e+02, threshold=4.525e+02, percent-clipped=1.0 2023-10-04 04:50:45,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 04:50:45,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:50:49,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 04:50:49,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:50:50,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:50:50,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 04:50:50,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:50:52,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. 
Duration: 0.85 2023-10-04 04:50:54,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:50:56,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:50:57,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:51:00,133 INFO [train.py:1046] (1/4) Epoch 44, batch 1300, loss[loss=0.1325, simple_loss=0.2111, pruned_loss=0.02692, over 24339.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.237, pruned_loss=0.03834, over 4712413.73 frames. ], batch size: 61, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:51:00,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:51:03,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:51:04,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 04:51:07,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:51:10,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:51:12,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:51:14,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:51:14,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:51:16,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 04:51:19,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:51:20,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:51:22,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 04:51:23,229 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.09 vs. limit=22.5 2023-10-04 04:51:24,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:51:28,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:51:28,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:51:29,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:51:32,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 04:51:32,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:51:33,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:51:33,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 04:51:39,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:51:39,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:51:41,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 04:51:41,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:51:42,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:51:44,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:51:46,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 04:51:46,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:51:46,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-10-04 04:51:49,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:51:53,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:51:53,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:51:57,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 04:51:57,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1531680.0, ans=0.07 2023-10-04 04:51:58,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 04:51:58,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 04:52:02,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:52:05,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 04:52:06,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1531746.6666666667, ans=0.125 2023-10-04 04:52:07,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:52:14,027 INFO [train.py:1046] (1/4) Epoch 44, batch 1350, loss[loss=0.1607, simple_loss=0.2532, pruned_loss=0.03414, over 24546.00 frames. 
], tot_loss[loss=0.1561, simple_loss=0.2357, pruned_loss=0.0383, over 4701843.85 frames. ], batch size: 71, lr: 2.32e-03, grad_scale: 4.0 2023-10-04 04:52:14,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 04:52:16,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:52:18,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:21,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:52:22,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:52:24,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:52:24,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:52:27,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:52:28,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 04:52:30,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:52:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 04:52:33,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 04:52:35,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:52:36,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:52:36,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 04:52:39,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 04:52:40,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1531880.0, ans=0.2 2023-10-04 04:52:41,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 04:52:43,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:43,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 04:52:45,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1531946.6666666667, ans=0.125 2023-10-04 04:52:50,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.61 vs. limit=6.0 2023-10-04 04:52:55,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:53:04,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:53:04,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:04,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 04:53:04,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1532013.3333333333, ans=0.125 2023-10-04 04:53:08,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:10,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 04:53:10,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:53:10,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:53:11,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:53:13,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 04:53:14,017 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.07 vs. limit=15.0 2023-10-04 04:53:14,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 04:53:16,249 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.909e+02 2.111e+02 2.476e+02 3.345e+02, threshold=4.221e+02, percent-clipped=0.0 2023-10-04 04:53:17,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1532080.0, ans=0.0 2023-10-04 04:53:19,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 04:53:20,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 04:53:26,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1532080.0, ans=0.0 2023-10-04 04:53:27,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 04:53:28,409 INFO [train.py:1046] (1/4) Epoch 44, batch 1400, loss[loss=0.1594, simple_loss=0.2326, pruned_loss=0.04303, over 23604.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2346, pruned_loss=0.03785, over 4699194.46 frames. ], batch size: 232, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:53:28,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:32,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:53:32,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:53:37,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-10-04 04:53:38,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 04:53:46,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1532213.3333333333, ans=0.125 2023-10-04 04:53:49,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:53:51,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:53:52,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1532213.3333333333, ans=0.1 2023-10-04 04:53:54,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:53:54,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:53:57,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.14 vs. limit=22.5 2023-10-04 04:53:57,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:53:59,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 04:54:06,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 04:54:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:11,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 04:54:12,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:54:13,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:54:13,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-04 04:54:14,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:54:14,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:54:17,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:54:17,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:54:17,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:54:18,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. 
Duration: 0.83 2023-10-04 04:54:18,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1532346.6666666667, ans=0.2 2023-10-04 04:54:19,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:54:22,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1532346.6666666667, ans=0.125 2023-10-04 04:54:24,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:29,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:54:34,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 04:54:36,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:54:36,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:54:38,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 04:54:40,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:54:41,378 INFO [train.py:1046] (1/4) Epoch 44, batch 1450, loss[loss=0.1334, simple_loss=0.1863, pruned_loss=0.04028, over 19355.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2334, pruned_loss=0.03767, over 4699705.70 frames. 
], batch size: 388, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:54:41,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:54:46,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:54:49,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:54:49,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:49,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 04:54:53,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:54:53,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:54:55,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1532546.6666666667, ans=0.1 2023-10-04 04:54:56,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:54:56,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 04:54:57,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 04:54:57,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 04:54:59,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:59,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:54:59,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 04:55:03,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:55:03,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:55:04,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:55:04,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:04,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:55:05,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:07,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:10,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 04:55:10,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:55:12,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1532613.3333333333, ans=0.125 2023-10-04 04:55:13,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:55:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:14,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:14,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:55:14,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:16,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:19,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 04:55:22,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:55:23,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1532613.3333333333, ans=0.125 2023-10-04 04:55:25,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.whiten.whitening_limit, batch_count=1532680.0, ans=12.0 2023-10-04 04:55:26,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 04:55:26,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:55:28,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1532680.0, ans=0.2 2023-10-04 04:55:29,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:55:30,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:55:30,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 04:55:36,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:38,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 04:55:39,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 04:55:41,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 04:55:41,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1532746.6666666667, ans=0.125 2023-10-04 04:55:43,946 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.952e+02 2.155e+02 2.479e+02 3.753e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-04 04:55:44,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:55:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:55:46,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 04:55:49,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 04:55:49,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 04:55:49,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1532746.6666666667, ans=0.125 2023-10-04 04:55:50,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:52,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:55:56,246 INFO [train.py:1046] (1/4) Epoch 44, batch 1500, loss[loss=0.1548, simple_loss=0.2439, pruned_loss=0.03288, over 23713.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2334, pruned_loss=0.03723, over 4708697.56 frames. 
], batch size: 85, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:55:58,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1532813.3333333333, ans=0.0 2023-10-04 04:56:03,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 04:56:03,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:56:03,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:56:03,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:56:04,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:56:06,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:56:06,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1532813.3333333333, ans=0.125 2023-10-04 04:56:06,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1532813.3333333333, ans=0.125 2023-10-04 04:56:07,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 04:56:08,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 04:56:08,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:56:08,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:56:10,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:56:13,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:56:13,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:56:17,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:56:17,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 04:56:19,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:56:19,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:56:20,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1532880.0, ans=0.125 2023-10-04 04:56:21,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:56:21,737 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.94 vs. limit=15.0 2023-10-04 04:56:23,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 04:56:28,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 04:56:29,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:56:29,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 04:56:34,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:56:36,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:56:36,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1532946.6666666667, ans=0.125 2023-10-04 04:56:37,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:56:37,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 04:56:37,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1532946.6666666667, ans=0.0 2023-10-04 04:56:38,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 04:56:40,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:56:40,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:56:40,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 04:56:41,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:56:47,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:56:47,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 04:56:50,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:56:52,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 04:56:52,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1533013.3333333333, ans=0.1 2023-10-04 04:56:52,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1533013.3333333333, ans=0.0 2023-10-04 04:56:55,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 04:56:55,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:56:56,833 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 04:56:58,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:56:58,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:56:58,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1533080.0, ans=0.0 2023-10-04 04:56:59,648 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 04:56:59,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:57:02,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 04:57:04,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 04:57:07,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:57:07,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:08,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:57:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:08,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:57:11,064 INFO [train.py:1046] (1/4) Epoch 44, batch 1550, loss[loss=0.1644, simple_loss=0.2325, pruned_loss=0.04817, over 23771.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2347, pruned_loss=0.03759, over 4712175.12 frames. ], batch size: 150, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:57:11,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 04:57:11,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 04:57:12,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:57:12,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 04:57:12,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-10-04 04:57:15,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:57:16,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:18,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:57:18,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:57:20,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:21,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:24,674 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 04:57:24,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:57:24,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:57:26,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:57:28,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:57:28,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. 
Duration: 0.71 2023-10-04 04:57:31,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:57:31,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 04:57:32,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-10-04 04:57:32,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 04:57:32,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 04:57:32,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:57:34,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:57:38,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:57:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 04:57:40,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 04:57:42,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1533280.0, ans=0.0 2023-10-04 04:57:49,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 04:57:52,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:57:52,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:57:52,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:57:52,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 04:57:57,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1533346.6666666667, ans=0.1 2023-10-04 04:57:58,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:58:00,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:03,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:58:06,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:58:07,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:58:07,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 04:58:07,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 04:58:07,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1533346.6666666667, ans=0.125 2023-10-04 04:58:09,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:58:09,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:10,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 04:58:10,917 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 04:58:13,598 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.992e+02 2.204e+02 2.459e+02 3.860e+02, threshold=4.408e+02, percent-clipped=0.0 2023-10-04 04:58:13,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:19,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 04:58:23,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:58:24,644 INFO [train.py:1046] (1/4) Epoch 44, batch 1600, loss[loss=0.1411, simple_loss=0.2196, pruned_loss=0.03129, over 23336.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.03743, over 4709487.39 frames. ], batch size: 105, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 04:58:24,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 04:58:24,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 04:58:26,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:58:28,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:58:28,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:58:28,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:58:28,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:58:32,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1533480.0, ans=0.125 2023-10-04 04:58:33,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:33,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 04:58:33,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 04:58:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 04:58:37,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 04:58:39,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1533546.6666666667, ans=0.0 2023-10-04 04:58:39,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-10-04 04:58:40,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 04:58:42,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:58:43,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:58:44,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.06 vs. limit=15.0 2023-10-04 04:58:47,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:58:49,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 04:58:52,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:58:52,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 04:58:53,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 04:58:53,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 04:58:59,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 04:59:06,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:59:08,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 04:59:09,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:59:09,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:59:09,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:59:11,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1533680.0, ans=0.2 2023-10-04 04:59:12,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 04:59:14,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1533680.0, ans=0.125 2023-10-04 04:59:16,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 04:59:17,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1533680.0, ans=15.0 2023-10-04 04:59:19,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:59:19,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:19,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:19,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:59:21,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:59:22,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:59:23,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:59:27,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1533746.6666666667, ans=0.2 2023-10-04 04:59:28,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:30,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 04:59:30,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1533746.6666666667, ans=0.0 2023-10-04 04:59:32,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 04:59:32,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:59:34,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 04:59:34,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.03 vs. limit=22.5 2023-10-04 04:59:38,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:59:40,519 INFO [train.py:1046] (1/4) Epoch 44, batch 1650, loss[loss=0.1621, simple_loss=0.2473, pruned_loss=0.03848, over 24466.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03789, over 4712858.32 frames. ], batch size: 66, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 04:59:42,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:59:43,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:59:44,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 04:59:44,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-10-04 04:59:44,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 04:59:44,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 04:59:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:48,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:59:50,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:59:50,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:59:52,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:59:54,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 04:59:57,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:59:57,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:59:57,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:59:57,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 04:59:58,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 04:59:58,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 05:00:05,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:00:06,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:00:15,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 05:00:16,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1533946.6666666667, ans=0.125 2023-10-04 05:00:18,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:19,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 05:00:21,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:24,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:00:24,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:00:24,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 05:00:24,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1534013.3333333333, ans=0.125 2023-10-04 05:00:25,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:00:26,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:29,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:00:30,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:30,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:00:30,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:00:32,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:00:32,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:00:36,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:00:37,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. 
Duration: 0.56 2023-10-04 05:00:39,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:00:39,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 05:00:40,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 05:00:42,347 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.929e+02 2.067e+02 2.263e+02 2.881e+02, threshold=4.133e+02, percent-clipped=0.0 2023-10-04 05:00:42,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 05:00:42,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:00:42,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:00:42,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:42,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:42,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 05:00:47,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:50,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 05:00:50,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:51,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 05:00:54,166 INFO [train.py:1046] (1/4) Epoch 44, batch 1700, loss[loss=0.1326, simple_loss=0.2105, pruned_loss=0.02737, over 24423.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2352, pruned_loss=0.03753, over 4718832.81 frames. ], batch size: 58, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:00:57,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:57,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:00:57,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 05:00:58,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:00:58,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:00:58,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:00:59,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:00:59,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 05:00:59,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 05:01:03,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1534146.6666666667, ans=0.1 2023-10-04 05:01:05,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:01:06,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1534146.6666666667, ans=0.0 2023-10-04 05:01:11,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:01:12,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.37 vs. limit=15.0 2023-10-04 05:01:14,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:01:18,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:01:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:01:20,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:01:20,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 05:01:21,015 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.50 vs. limit=15.0 2023-10-04 05:01:21,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1534213.3333333333, ans=0.125 2023-10-04 05:01:22,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 05:01:25,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:01:25,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:01:27,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:01:27,850 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.68 vs. limit=10.0 2023-10-04 05:01:28,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1534280.0, ans=0.0 2023-10-04 05:01:30,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 05:01:30,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1534280.0, ans=0.0 2023-10-04 05:01:32,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-10-04 05:01:33,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:34,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 05:01:34,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:01:42,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:01:42,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:01:44,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:01:47,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:01:47,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 05:01:47,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:01:48,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:48,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 05:01:50,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:01:50,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:01:50,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:50,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:01:52,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:01:52,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:01:54,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:01:54,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:01:54,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:01:58,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:01:59,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 05:02:03,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:04,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:02:06,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 05:02:09,457 INFO [train.py:1046] (1/4) Epoch 44, batch 1750, loss[loss=0.1594, simple_loss=0.2445, pruned_loss=0.03713, over 24424.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2352, pruned_loss=0.03723, over 4731240.02 frames. ], batch size: 69, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:02:12,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:17,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:02:17,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:02:17,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 05:02:18,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:02:19,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:02:19,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 05:02:25,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:02:26,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 05:02:26,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:02:28,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:02:30,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:02:32,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 05:02:34,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:02:35,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 05:02:39,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1534613.3333333333, ans=0.125 2023-10-04 05:02:43,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:02:45,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:02:45,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:48,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:02:48,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:51,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:02:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:54,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:02:54,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:02:54,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1534680.0, ans=0.1 2023-10-04 05:02:55,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 05:02:58,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:03:01,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 05:03:02,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:03:04,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:05,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 05:03:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:03:09,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 05:03:11,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:03:12,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 2.076e+02 2.414e+02 2.954e+02 4.467e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-04 05:03:12,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1534746.6666666667, ans=0.0 2023-10-04 05:03:13,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:03:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:18,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:03:19,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:03:21,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 05:03:21,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:03:22,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 05:03:22,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:22,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:03:22,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:03:24,017 INFO [train.py:1046] (1/4) Epoch 44, batch 1800, loss[loss=0.1514, simple_loss=0.2278, pruned_loss=0.03747, over 24461.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2344, pruned_loss=0.03714, over 4710795.72 frames. ], batch size: 58, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:03:24,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:03:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:03:28,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:03:29,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:03:31,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:03:35,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:03:35,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:03:39,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:03:40,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1534880.0, ans=0.07 2023-10-04 05:03:41,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:41,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:41,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:03:41,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1534880.0, ans=0.1 2023-10-04 05:03:45,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:03:45,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 05:03:45,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:03:48,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:03:51,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 05:03:53,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-10-04 05:03:54,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 05:03:54,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:03:54,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:54,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:55,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:03:56,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1534946.6666666667, ans=0.0 2023-10-04 05:04:00,192 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 05:04:02,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:04:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:06,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 05:04:07,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 05:04:07,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 05:04:09,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:04:11,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:04:15,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 05:04:16,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1535013.3333333333, ans=0.1 2023-10-04 05:04:21,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:04:21,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1535080.0, ans=0.1 2023-10-04 05:04:22,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 05:04:22,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:04:22,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:04:24,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:04:24,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 05:04:28,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 05:04:28,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:04:28,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.12 vs. limit=10.0 2023-10-04 05:04:29,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 05:04:29,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:04:31,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:04:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:04:31,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:32,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:32,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:04:35,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:04:35,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:04:36,682 INFO [train.py:1046] (1/4) Epoch 44, batch 1850, loss[loss=0.1646, simple_loss=0.2409, pruned_loss=0.04414, over 22825.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03743, over 4715030.15 frames. ], batch size: 322, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:04:38,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:04:38,888 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:04:40,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:04:44,769 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-10-04 05:04:46,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-10-04 05:04:46,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:04:46,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 05:04:51,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 05:04:53,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. 
Duration: 0.93 2023-10-04 05:04:56,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1535213.3333333333, ans=0.0 2023-10-04 05:04:58,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:04:58,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1535213.3333333333, ans=0.09899494936611666 2023-10-04 05:04:59,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 05:04:59,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 05:05:08,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:05:10,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 05:05:11,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:05:13,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:05:17,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 05:05:17,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:17,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-04 05:05:19,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:05:22,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:05:23,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:05:26,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:05:26,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:26,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:05:26,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:05:29,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:05:31,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:05:33,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 05:05:34,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:05:37,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 05:05:37,960 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-10-04 05:05:38,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:05:38,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 05:05:38,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 05:05:39,920 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.019e+02 2.202e+02 2.668e+02 3.594e+02, threshold=4.403e+02, percent-clipped=0.0 2023-10-04 05:05:40,813 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.27 vs. limit=22.5 2023-10-04 05:05:41,955 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 05:05:43,845 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 05:05:45,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:05:45,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:05:45,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:05:46,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:05:46,586 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 05:05:46,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:05:46,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:48,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:05:49,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:05:51,017 INFO [train.py:1046] (1/4) Epoch 44, batch 1900, loss[loss=0.155, simple_loss=0.2303, pruned_loss=0.03979, over 23485.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.236, pruned_loss=0.03746, over 4714838.38 frames. ], batch size: 134, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:05:51,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:05:51,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 05:05:52,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:52,632 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 05:05:53,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 05:05:53,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:05:54,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1535480.0, ans=0.0 2023-10-04 05:05:59,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:06:00,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:06:02,355 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 05:06:03,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 05:06:04,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:06:05,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:06:05,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 05:06:05,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1535546.6666666667, ans=0.04949747468305833 2023-10-04 05:06:06,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 05:06:09,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-10-04 05:06:10,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:06:15,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 05:06:17,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 05:06:26,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 05:06:29,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 05:06:29,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:06:29,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1535613.3333333333, ans=0.05 2023-10-04 05:06:31,029 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 05:06:31,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 05:06:31,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 05:06:31,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 05:06:31,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:06:36,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 05:06:39,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:06:42,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1535680.0, ans=0.1 2023-10-04 05:06:43,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:06:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 05:06:45,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:06:49,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 05:06:49,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:06:57,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:06:57,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:06:57,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:06:58,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-10-04 05:06:58,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:07:00,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:07:00,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:07:01,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:07:02,900 INFO [train.py:1046] (1/4) Epoch 44, batch 1950, loss[loss=0.145, simple_loss=0.2241, pruned_loss=0.03289, over 18269.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.236, pruned_loss=0.03711, over 4726025.07 frames. ], batch size: 40, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:07:04,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:07:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:07:05,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:07:05,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:07:06,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 05:07:07,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:07:10,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:07:11,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:07:11,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:11,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:07:15,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 05:07:15,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 05:07:17,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:17,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:19,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:07:20,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:07:20,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:07:21,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1535880.0, ans=0.125 2023-10-04 05:07:22,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:07:25,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:07:25,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:07:25,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:07:25,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:28,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:31,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1535946.6666666667, ans=0.125 2023-10-04 05:07:32,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:07:32,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:07:32,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:07:32,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-10-04 05:07:32,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:07:32,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:07:33,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:36,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:39,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:07:43,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:07:46,857 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.72 vs. limit=6.0 2023-10-04 05:07:47,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:07:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:07:47,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 05:07:49,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:07:51,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.60 vs. limit=22.5 2023-10-04 05:07:53,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:07:53,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:07:53,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.00 vs. limit=6.0 2023-10-04 05:07:55,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:08:00,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:03,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:06,094 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.020e+02 2.252e+02 2.597e+02 3.985e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-04 05:08:06,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:08:10,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 05:08:10,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:08:10,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 05:08:10,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:08:11,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:08:11,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 05:08:12,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1536080.0, ans=0.0 2023-10-04 05:08:14,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:08:16,727 INFO [train.py:1046] (1/4) Epoch 44, batch 2000, loss[loss=0.1752, simple_loss=0.2606, pruned_loss=0.04488, over 24371.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2363, pruned_loss=0.03724, over 4730828.90 frames. ], batch size: 77, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:08:18,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:08:19,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:08:19,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:08:20,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:08:21,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1536146.6666666667, ans=0.0 2023-10-04 05:08:23,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:25,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1536146.6666666667, ans=0.0 2023-10-04 05:08:26,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 05:08:27,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:08:29,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:08:31,917 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.31 vs. limit=22.5 2023-10-04 05:08:32,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 05:08:32,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:08:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 05:08:36,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:08:38,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 05:08:39,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:40,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:42,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:42,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1536213.3333333333, ans=0.125 2023-10-04 05:08:43,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 05:08:43,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:08:46,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 05:08:46,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:08:48,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:08:49,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. 
Duration: 0.636375 2023-10-04 05:08:49,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:50,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:08:51,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:08:52,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 05:08:54,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 05:08:54,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:08:55,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:08:55,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1536280.0, ans=0.04949747468305833 2023-10-04 05:08:59,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:01,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:09:01,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:09:01,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:09:02,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1536346.6666666667, ans=0.125 2023-10-04 05:09:03,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:09:03,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:03,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:09:03,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:05,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:08,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:09:08,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1536346.6666666667, ans=0.125 2023-10-04 05:09:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 05:09:14,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:09:16,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:09:20,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:20,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:09:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:09:24,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:26,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:09:27,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:09:28,931 INFO [train.py:1046] (1/4) Epoch 44, batch 2050, loss[loss=0.1574, simple_loss=0.245, pruned_loss=0.03487, over 24684.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2357, pruned_loss=0.03723, over 4723950.81 frames. ], batch size: 73, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:09:29,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:30,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:31,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 05:09:31,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:32,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1536480.0, ans=0.125 2023-10-04 05:09:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:09:38,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:09:40,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:41,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:09:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 05:09:43,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:09:45,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:46,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:09:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:09:55,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:09:57,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 05:10:00,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:10:02,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 05:10:02,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:10:03,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1536613.3333333333, ans=0.0 2023-10-04 05:10:04,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:10:07,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:08,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:10:09,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:10:10,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:10:10,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:10:10,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 05:10:15,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:16,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:10:17,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1536680.0, ans=0.09899494936611666 2023-10-04 05:10:18,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:10:18,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:10:18,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1536680.0, ans=0.125 2023-10-04 05:10:18,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1536680.0, ans=0.2 2023-10-04 05:10:18,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1536680.0, ans=0.0 2023-10-04 05:10:19,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1536680.0, ans=0.0 2023-10-04 05:10:23,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:10:28,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:10:30,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-10-04 05:10:33,074 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.975e+02 2.184e+02 2.529e+02 3.912e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-04 05:10:34,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:10:35,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:10:37,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:10:38,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 05:10:40,791 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.53 vs. limit=15.0 2023-10-04 05:10:41,468 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 05:10:41,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:10:43,308 INFO [train.py:1046] (1/4) Epoch 44, batch 2100, loss[loss=0.1509, simple_loss=0.2248, pruned_loss=0.03853, over 18826.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2341, pruned_loss=0.03702, over 4703603.13 frames. ], batch size: 41, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:10:43,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:43,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 05:10:45,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:10:45,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 05:10:45,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 05:10:46,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:10:49,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:10:49,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1536813.3333333333, ans=0.0 2023-10-04 05:10:50,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:10:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:10:54,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:10:54,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 05:10:55,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:10:55,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-10-04 05:10:55,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 05:10:58,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:10:58,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:10:58,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 05:11:00,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 05:11:04,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 05:11:04,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:11:07,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:11:08,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:11:12,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:11:12,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 05:11:14,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:11:14,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 05:11:16,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 05:11:17,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:17,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 05:11:17,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 05:11:18,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1536946.6666666667, ans=0.125 2023-10-04 05:11:19,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 05:11:19,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:11:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:11:24,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:11:25,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:11:27,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:11:28,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:28,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 05:11:28,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:30,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:30,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 05:11:31,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 05:11:32,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 05:11:37,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:11:39,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:11:39,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 05:11:43,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.48 vs. 
limit=10.0 2023-10-04 05:11:45,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:47,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:11:49,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:11:49,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:11:49,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 05:11:49,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.54 vs. limit=15.0 2023-10-04 05:11:50,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:11:51,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:51,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:11:51,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:11:51,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:55,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-10-04 05:11:55,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 05:11:55,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:11:57,863 INFO [train.py:1046] (1/4) Epoch 44, batch 2150, loss[loss=0.1384, simple_loss=0.1955, pruned_loss=0.04062, over 19345.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2334, pruned_loss=0.03701, over 4698937.77 frames. ], batch size: 389, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:11:57,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:57,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:11:58,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:11:59,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:12:02,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 05:12:02,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1537146.6666666667, ans=0.0 2023-10-04 05:12:04,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:06,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:12:08,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:12:08,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:08,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:12:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:11,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:12:11,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:12:16,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:16,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 05:12:20,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:22,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:12:24,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:24,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:12:24,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:24,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:12:25,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:25,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:12:27,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:12:28,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 05:12:30,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:12:31,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:33,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:33,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:12:34,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:12:35,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:12:35,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:12:37,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:37,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 05:12:38,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:12:41,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:42,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:44,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:12:47,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:49,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:49,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. 
Duration: 0.71 2023-10-04 05:12:49,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1537346.6666666667, ans=0.2 2023-10-04 05:12:51,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 05:12:52,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:12:52,373 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 05:12:52,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:53,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:12:55,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 05:12:55,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:12:55,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 05:12:55,125 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 05:12:55,125 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 05:12:56,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 05:12:58,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:12:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:59,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:13:01,601 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.996e+02 2.244e+02 2.703e+02 4.049e+02, threshold=4.487e+02, percent-clipped=0.0 2023-10-04 05:13:01,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:01,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:13:02,451 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.10 vs. limit=6.0 2023-10-04 05:13:03,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:13:03,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:13:07,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1537413.3333333333, ans=0.125 2023-10-04 05:13:09,039 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:13:09,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1537413.3333333333, ans=0.0 2023-10-04 05:13:11,480 INFO [train.py:1046] (1/4) Epoch 44, batch 2200, loss[loss=0.1765, simple_loss=0.2465, pruned_loss=0.05325, over 23962.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2339, pruned_loss=0.03711, over 4702285.64 frames. ], batch size: 180, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:13:11,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:13:12,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 05:13:15,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:13:16,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1537480.0, ans=0.125 2023-10-04 05:13:20,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:20,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:13:21,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:13:22,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1537480.0, ans=0.07 2023-10-04 05:13:23,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:13:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:13:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:13:26,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 05:13:32,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 05:13:32,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:13:39,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 05:13:41,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:41,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:13:43,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:13:45,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 05:13:47,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 05:13:51,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:13:52,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:53,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 05:13:56,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:13:58,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:13:58,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:14:00,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:02,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 05:14:02,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:04,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 05:14:07,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:14:07,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:14:07,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:08,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:14:10,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:14:10,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:10,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:11,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1537746.6666666667, ans=0.0 2023-10-04 05:14:12,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:14:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:14:14,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:14:17,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:14:17,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:14:19,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:14:20,022 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 05:14:22,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:14:24,137 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 05:14:25,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:14:25,572 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 05:14:26,811 INFO [train.py:1046] (1/4) Epoch 44, batch 2250, loss[loss=0.1623, simple_loss=0.2518, pruned_loss=0.03644, over 24377.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.0374, over 4696348.57 frames. ], batch size: 74, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:14:26,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:14:28,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:14:28,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:14:30,406 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 05:14:33,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:14:34,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:14:38,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1537813.3333333333, ans=0.0 2023-10-04 05:14:39,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:14:39,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:14:42,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:14:42,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:14:43,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:14:45,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 05:14:46,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:46,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:14:47,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 05:14:49,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:14:49,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:14:51,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:14:57,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:14:57,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:14:57,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:14:57,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.07 vs. limit=10.0 2023-10-04 05:15:00,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 05:15:03,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:15:04,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:15:08,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:15:09,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 05:15:10,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:15:10,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:15:13,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:15:14,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:15:16,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1538013.3333333333, ans=0.125 2023-10-04 05:15:17,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:15:21,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:15:21,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=9.90 vs. limit=22.5 2023-10-04 05:15:24,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:15:24,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:15:25,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 05:15:28,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1538080.0, ans=0.2 2023-10-04 05:15:31,712 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 1.999e+02 2.130e+02 2.376e+02 3.367e+02, threshold=4.261e+02, percent-clipped=0.0 2023-10-04 05:15:33,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:15:36,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:15:36,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 05:15:36,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:36,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:15:39,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 05:15:40,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:15:40,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:42,002 INFO [train.py:1046] (1/4) Epoch 44, batch 2300, loss[loss=0.1621, simple_loss=0.251, pruned_loss=0.03658, over 24676.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.0377, over 4687717.33 frames. 
], batch size: 73, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:15:43,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1538146.6666666667, ans=0.2 2023-10-04 05:15:46,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:46,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:15:48,837 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 05:15:51,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:15:58,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1538213.3333333333, ans=0.125 2023-10-04 05:16:00,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:16:00,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:16:00,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:01,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:16:01,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-10-04 05:16:01,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1538213.3333333333, ans=0.0 2023-10-04 05:16:02,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:16:04,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:16:04,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1538213.3333333333, ans=0.1 2023-10-04 05:16:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:16:10,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:16:11,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:16:14,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:16:14,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1538280.0, ans=0.125 2023-10-04 05:16:18,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:16:18,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:16:21,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:16:22,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:16:27,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:16:28,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:16:29,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:16:29,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 05:16:33,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:16:33,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:33,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:16:33,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:16:33,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:16:35,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-04 05:16:35,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:16:35,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 05:16:35,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:16:35,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:36,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 05:16:42,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:16:45,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:16:45,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1538413.3333333333, ans=0.0 2023-10-04 05:16:50,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:16:50,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:16:50,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:16:52,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 05:16:52,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:16:54,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:16:54,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 05:16:55,466 INFO [train.py:1046] (1/4) Epoch 44, batch 2350, loss[loss=0.1674, simple_loss=0.2566, pruned_loss=0.03911, over 24264.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2362, pruned_loss=0.03799, over 4697017.75 frames. ], batch size: 74, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:17:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:17:01,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 05:17:07,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 05:17:09,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:17:13,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:13,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:13,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 05:17:13,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:17:13,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 05:17:17,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:17:22,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 05:17:25,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:17:28,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:17:28,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:17:30,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:17:31,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 05:17:32,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:17:34,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:17:34,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:17:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:17:37,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:17:40,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 05:17:40,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1538680.0, ans=0.0 2023-10-04 05:17:41,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:17:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:44,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:17:46,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 05:17:47,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:17:49,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 05:17:49,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:17:53,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-10-04 05:17:56,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 05:17:56,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:17:56,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:17:56,678 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 05:17:58,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 05:17:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 05:18:01,032 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.046e+02 2.383e+02 2.695e+02 3.864e+02, threshold=4.767e+02, percent-clipped=0.0 2023-10-04 05:18:03,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:18:05,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:18:09,844 INFO [train.py:1046] (1/4) Epoch 44, batch 2400, loss[loss=0.1426, simple_loss=0.2134, pruned_loss=0.0359, over 23477.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03775, over 4690318.73 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:18:11,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 05:18:14,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:18:14,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 05:18:14,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 05:18:21,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:18:21,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:18:24,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 05:18:24,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:18:25,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:25,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 05:18:28,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1538880.0, ans=0.125 2023-10-04 05:18:29,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 05:18:31,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1538880.0, ans=0.2 2023-10-04 05:18:33,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 05:18:37,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:18:38,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1538946.6666666667, ans=0.125 2023-10-04 05:18:43,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 05:18:44,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:18:46,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:50,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:18:52,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 05:18:52,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:18:59,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:01,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:19:05,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:06,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.71 vs. limit=10.0 2023-10-04 05:19:06,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:19:06,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:19:06,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:19:06,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:07,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:19:07,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:19:10,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:19:10,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:19:10,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. 
Duration: 0.74 2023-10-04 05:19:12,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 05:19:15,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:19:15,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:15,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 05:19:16,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 05:19:16,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 05:19:16,517 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 05:19:18,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 05:19:18,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1539080.0, ans=0.0 2023-10-04 05:19:19,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:19:21,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:21,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:19:21,140 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 05:19:22,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:23,833 INFO [train.py:1046] (1/4) Epoch 44, batch 2450, loss[loss=0.1338, simple_loss=0.2167, pruned_loss=0.02547, over 24685.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2336, pruned_loss=0.03717, over 4695286.22 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:19:23,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:19:27,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:19:27,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:19:29,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:29,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:19:31,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 05:19:37,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:19:37,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:19:38,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:19:38,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:19:40,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:19:40,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 05:19:44,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:48,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:19:48,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:19:51,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:19:51,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:19:53,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:19:53,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:56,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. 
Duration: 0.98 2023-10-04 05:19:56,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1539280.0, ans=0.125 2023-10-04 05:19:57,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:20:04,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:05,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:20:06,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:06,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:20:06,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:07,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1539346.6666666667, ans=0.1 2023-10-04 05:20:08,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:20:08,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 05:20:12,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 05:20:12,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:20:16,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:20:16,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:22,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:20:22,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 05:20:23,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:20:24,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:20:24,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 05:20:24,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:20:26,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:20:27,988 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.926e+02 2.194e+02 2.495e+02 3.736e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-04 05:20:29,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 05:20:32,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:32,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:20:36,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 05:20:37,281 INFO [train.py:1046] (1/4) Epoch 44, batch 2500, loss[loss=0.1435, simple_loss=0.2196, pruned_loss=0.03373, over 23432.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2335, pruned_loss=0.03715, over 4703815.69 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:20:37,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:20:44,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:20:52,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:20:52,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:54,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:20:54,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-10-04 05:20:54,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1539546.6666666667, ans=0.2 2023-10-04 05:20:57,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff3.min_abs, batch_count=1539546.6666666667, ans=0.2 2023-10-04 05:21:00,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:21:01,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:21:02,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:21:02,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:21:02,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 05:21:04,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:05,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:21:07,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 05:21:07,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:07,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-10-04 05:21:08,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:12,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:21:13,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:21:16,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:21:17,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 05:21:17,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:21:19,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:23,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:26,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:27,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1539680.0, ans=0.1 2023-10-04 05:21:28,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:21:36,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 05:21:39,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 05:21:39,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:21:39,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:21:41,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:21:41,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:21:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 05:21:43,700 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 05:21:43,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 05:21:45,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:47,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 05:21:47,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 05:21:49,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 05:21:49,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 05:21:50,653 INFO [train.py:1046] (1/4) Epoch 44, batch 2550, loss[loss=0.1621, simple_loss=0.2333, pruned_loss=0.04541, over 22943.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03714, over 4715247.96 frames. ], batch size: 322, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:21:50,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 05:21:53,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:21:56,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:21:57,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:21:58,881 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.26 vs. limit=22.5 2023-10-04 05:21:59,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:22:00,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 05:22:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:22:04,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. 
Duration: 0.8 2023-10-04 05:22:06,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:22:07,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:08,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1539880.0, ans=0.125 2023-10-04 05:22:10,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:22:10,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 05:22:12,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:22:12,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:22:12,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:22:15,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:22:15,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 05:22:15,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:22:15,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 05:22:15,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 05:22:26,868 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.98 vs. limit=12.0 2023-10-04 05:22:28,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:22:34,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:22:34,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:34,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:22:35,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:22:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:22:42,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1540013.3333333333, ans=0.125 2023-10-04 05:22:43,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:22:43,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 05:22:43,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:22:43,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:22:44,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:22:47,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:22:48,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:52,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:22:52,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 05:22:52,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:22:54,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:55,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 05:22:56,823 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.968e+02 2.128e+02 2.363e+02 3.523e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-04 05:22:56,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:22:58,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:04,900 INFO [train.py:1046] (1/4) Epoch 44, batch 2600, loss[loss=0.1458, simple_loss=0.2347, pruned_loss=0.02844, over 24495.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2346, pruned_loss=0.0375, over 4703152.46 frames. ], batch size: 66, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:23:05,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:23:05,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1540146.6666666667, ans=0.125 2023-10-04 05:23:06,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1540146.6666666667, ans=0.125 2023-10-04 05:23:07,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:09,676 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 05:23:12,987 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 05:23:13,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 05:23:13,039 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 05:23:14,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 05:23:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 05:23:17,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:23:17,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 05:23:18,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 05:23:19,833 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 05:23:20,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1540213.3333333333, ans=0.1 2023-10-04 05:23:21,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:23:23,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 05:23:23,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 05:23:24,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1540213.3333333333, ans=0.0 2023-10-04 05:23:25,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 05:23:26,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 05:23:26,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1540213.3333333333, ans=0.125 2023-10-04 05:23:29,405 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 05:23:29,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 05:23:36,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:23:38,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:38,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:23:38,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 05:23:40,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:23:44,861 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 05:23:45,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1540280.0, ans=0.0 2023-10-04 05:23:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:23:50,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:23:50,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 05:23:51,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:23:51,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:23:53,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 05:23:56,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:23:56,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:23:57,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:01,675 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 05:24:01,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:01,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:24:06,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 05:24:06,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:24:08,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 05:24:08,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:24:12,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:24:13,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:24:16,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1540413.3333333333, ans=0.125 2023-10-04 05:24:18,956 INFO [train.py:1046] (1/4) Epoch 44, batch 2650, loss[loss=0.1549, simple_loss=0.232, pruned_loss=0.0389, over 23931.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2352, pruned_loss=0.03742, over 4719847.32 frames. ], batch size: 195, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:24:19,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 05:24:20,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:21,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:24:27,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. 
Duration: 0.84 2023-10-04 05:24:27,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:24:28,808 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 05:24:28,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:24:30,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:33,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:24:34,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:24:36,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:37,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 05:24:37,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:24:37,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:24:40,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. 
Duration: 0.88 2023-10-04 05:24:42,715 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 05:24:44,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:24:47,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 05:24:47,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:24:47,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 05:24:51,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:24:51,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:24:51,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:24:51,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:24:56,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 05:24:56,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 05:24:59,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 05:25:04,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 05:25:04,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:25:04,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:25:04,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:25:05,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:25:08,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:25:08,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:25:10,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:25:11,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:25:11,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:25:11,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:25:13,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:25:15,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:16,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:25:17,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:25:20,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:21,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1540746.6666666667, ans=15.0 2023-10-04 05:25:23,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:25:23,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:23,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 05:25:24,460 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.007e+02 2.168e+02 2.484e+02 3.520e+02, threshold=4.337e+02, percent-clipped=0.0 2023-10-04 05:25:25,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 05:25:27,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:30,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:30,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:31,515 INFO [train.py:1046] (1/4) Epoch 44, batch 2700, loss[loss=0.1464, simple_loss=0.2209, pruned_loss=0.03596, over 23668.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2356, pruned_loss=0.03752, over 4716193.76 frames. ], batch size: 256, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:25:31,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:25:31,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:34,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:25:34,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 05:25:37,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:25:39,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 05:25:40,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:25:41,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:42,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1540813.3333333333, ans=0.0 2023-10-04 05:25:44,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:25:44,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:44,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:25:45,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:25:45,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 05:25:46,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.15 vs. limit=22.5 2023-10-04 05:25:46,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:25:48,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 05:25:49,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:25:49,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:52,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:25:52,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=12.0 2023-10-04 05:25:53,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 05:25:53,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:25:56,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:25:56,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:02,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:26:02,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1540946.6666666667, ans=0.125 2023-10-04 05:26:03,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:26:03,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 05:26:03,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:26:04,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-10-04 05:26:06,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:09,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:26:09,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:26:09,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:26:13,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:13,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:26:23,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:26:23,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:26:26,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 05:26:26,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:28,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:30,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:31,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:26:31,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:33,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:34,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:26:35,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=12.63 vs. limit=15.0 2023-10-04 05:26:35,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:26:37,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:37,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:26:39,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1541080.0, ans=0.125 2023-10-04 05:26:40,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 05:26:41,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:42,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1541080.0, ans=0.125 2023-10-04 05:26:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:26:44,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 05:26:44,380 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:26:45,393 INFO [train.py:1046] (1/4) Epoch 44, batch 2750, loss[loss=0.1598, simple_loss=0.2273, pruned_loss=0.04612, over 23733.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.235, pruned_loss=0.03765, over 4709595.47 frames. ], batch size: 164, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:26:46,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 05:26:46,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:49,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:26:49,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:52,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:52,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:26:53,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:56,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:26:56,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:26:56,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:26:56,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1541146.6666666667, ans=0.125 2023-10-04 05:26:58,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:58,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 05:26:58,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 05:26:58,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:27:02,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 05:27:03,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:27:05,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:05,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:27:06,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:27:06,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:27:06,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1541213.3333333333, ans=0.125 2023-10-04 05:27:08,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:27:08,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:09,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:27:14,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:27:14,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:27:16,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:27:16,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:16,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1541280.0, ans=0.125 2023-10-04 05:27:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:27:24,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:24,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:27:26,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:27:30,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:30,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:27:31,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 05:27:34,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1541346.6666666667, ans=0.1 2023-10-04 05:27:35,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:27:37,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:27:37,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 05:27:42,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:27:45,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 05:27:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:27:50,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:27:50,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 05:27:51,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:27:52,770 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.112e+02 2.441e+02 2.787e+02 5.885e+02, threshold=4.882e+02, percent-clipped=3.0 2023-10-04 05:27:54,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 05:27:54,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 05:27:54,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:27:58,456 INFO [train.py:1046] (1/4) Epoch 44, batch 2800, loss[loss=0.157, simple_loss=0.2499, pruned_loss=0.03207, over 24645.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2335, pruned_loss=0.03753, over 4699416.59 frames. ], batch size: 73, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:27:58,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 05:27:58,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:27:58,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:27:59,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 05:27:59,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:27:59,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:01,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:01,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. 
Duration: 0.708 2023-10-04 05:28:01,401 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 05:28:05,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:08,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:28:08,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:28:10,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:28:13,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 05:28:16,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 05:28:16,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 05:28:19,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:19,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:28:19,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:28:21,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1541546.6666666667, ans=0.1 2023-10-04 05:28:23,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:28:23,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:23,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:28:25,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:28:25,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1541546.6666666667, ans=0.2 2023-10-04 05:28:30,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:28:32,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:33,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.95 vs. limit=15.0 2023-10-04 05:28:34,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:36,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 05:28:36,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:28:41,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:28:41,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 05:28:43,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:28:43,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:28:43,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:28:48,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:28:48,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:51,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:28:51,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:28:53,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:53,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-04 05:28:54,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:28:54,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:28:56,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:56,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 05:28:56,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:28:58,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:28:58,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:00,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 05:29:01,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:01,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:29:01,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1541746.6666666667, ans=0.125 2023-10-04 05:29:02,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 05:29:04,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 05:29:12,645 INFO [train.py:1046] (1/4) Epoch 44, batch 2850, loss[loss=0.1511, simple_loss=0.2342, pruned_loss=0.03402, over 23347.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2329, pruned_loss=0.03714, over 4693009.60 frames. ], batch size: 105, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:29:12,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:29:12,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:29:12,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:29:14,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:29:17,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:29:17,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:29:18,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:29:18,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1541813.3333333333, ans=0.1 2023-10-04 05:29:20,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:29:20,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:29:21,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1541813.3333333333, ans=0.0 2023-10-04 05:29:22,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:29:22,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 05:29:27,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 05:29:27,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:29:30,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 05:29:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:33,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1541880.0, ans=0.1 2023-10-04 05:29:34,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 05:29:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-04 05:29:34,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1541880.0, ans=0.125 2023-10-04 05:29:35,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1541880.0, ans=0.2 2023-10-04 05:29:36,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:48,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:50,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:29:50,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:29:52,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:29:52,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:29:52,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:29:54,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:29:54,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. 
Duration: 0.69 2023-10-04 05:29:56,153 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:29:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:29:57,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:29:57,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:58,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:00,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:01,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:02,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:04,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:30:04,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1542013.3333333333, ans=0.2 2023-10-04 05:30:05,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:30:05,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 05:30:05,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1542013.3333333333, ans=0.125 2023-10-04 05:30:07,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:10,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:30:16,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:30:18,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 05:30:18,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 05:30:19,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:30:20,627 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.995e+02 2.371e+02 2.715e+02 4.777e+02, threshold=4.741e+02, percent-clipped=0.0 2023-10-04 05:30:20,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:20,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 05:30:20,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:30:22,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:30:22,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:30:22,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:30:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 05:30:22,780 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 05:30:22,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:30:24,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:26,683 INFO [train.py:1046] (1/4) Epoch 44, batch 2900, loss[loss=0.1677, simple_loss=0.2463, pruned_loss=0.04453, over 23718.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2331, pruned_loss=0.03704, over 4688336.62 frames. ], batch size: 85, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:30:29,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:30:29,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:30:29,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1542146.6666666667, ans=0.1 2023-10-04 05:30:30,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:30:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 05:30:35,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:35,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 05:30:36,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 05:30:37,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:30:37,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:30:40,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:40,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:30:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:30:46,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:48,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:30:48,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-04 05:30:49,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:30:49,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:52,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 05:30:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 05:30:56,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:56,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 05:30:56,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:30:58,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:30:58,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:30:59,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1542280.0, ans=0.0 2023-10-04 05:31:02,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:31:03,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:31:06,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:31:09,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:09,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.70 vs. limit=15.0 2023-10-04 05:31:10,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.38 vs. limit=15.0 2023-10-04 05:31:10,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 05:31:12,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 05:31:12,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:31:15,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:31:17,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 05:31:17,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1542346.6666666667, ans=0.125 2023-10-04 05:31:20,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 05:31:26,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:31:33,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:31:33,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:31:34,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 05:31:37,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:37,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 05:31:37,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:31:37,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:31:38,682 INFO [train.py:1046] (1/4) Epoch 44, batch 2950, loss[loss=0.1551, simple_loss=0.2415, pruned_loss=0.03432, over 24654.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2345, pruned_loss=0.03737, over 4693786.87 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:31:43,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:31:44,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. 
Duration: 0.71 2023-10-04 05:31:46,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:31:46,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:48,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:31:50,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:31:50,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1542480.0, ans=0.1 2023-10-04 05:31:51,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 05:31:52,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 05:31:52,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:31:52,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:32:00,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:32:01,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:32:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:32:04,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:32:05,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:32:05,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:32:07,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:32:08,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:32:08,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:32:11,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 05:32:15,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 05:32:15,760 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 05:32:15,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1542613.3333333333, ans=0.0 2023-10-04 05:32:17,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:32:19,582 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-10-04 05:32:21,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1542613.3333333333, ans=0.0 2023-10-04 05:32:22,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 05:32:22,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:32:23,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:32:23,958 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 05:32:23,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:32:25,544 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:32:26,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 05:32:26,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:32:26,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:32:30,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1542680.0, ans=0.2 2023-10-04 05:32:31,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:32:31,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:32:31,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:32,877 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 05:32:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:32:34,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 05:32:38,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:39,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:32:39,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 05:32:41,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:32:42,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 05:32:45,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:32:47,164 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.991e+02 2.260e+02 2.568e+02 3.516e+02, threshold=4.519e+02, percent-clipped=0.0 2023-10-04 05:32:47,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:32:47,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1542746.6666666667, ans=0.0 2023-10-04 05:32:48,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:32:48,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:48,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:32:51,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:32:52,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:32:52,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:32:52,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:32:53,910 INFO [train.py:1046] (1/4) Epoch 44, batch 3000, loss[loss=0.1682, simple_loss=0.2512, pruned_loss=0.0426, over 24615.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.03713, over 4715494.53 frames. 
], batch size: 68, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:32:53,910 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 05:33:05,947 INFO [train.py:1078] (1/4) Epoch 44, validation: loss=0.3969, simple_loss=0.2803, pruned_loss=0.2567, over 1125622.00 frames. 2023-10-04 05:33:05,948 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 05:33:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:33:07,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:33:07,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1542813.3333333333, ans=0.125 2023-10-04 05:33:08,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:33:10,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 05:33:11,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:33:13,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:33:13,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:33:16,730 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 05:33:18,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. 
Duration: 0.98 2023-10-04 05:33:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:33:19,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:33:21,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 05:33:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:33:27,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:33:33,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1542880.0, ans=0.125 2023-10-04 05:33:35,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1542946.6666666667, ans=0.0 2023-10-04 05:33:36,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:33:40,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 05:33:42,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:33:46,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:33:46,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:33:46,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:33:47,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1542946.6666666667, ans=0.07 2023-10-04 05:33:49,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:33:49,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 05:33:51,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1543013.3333333333, ans=0.0 2023-10-04 05:33:52,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 05:33:54,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:33:54,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:33:55,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:33:57,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:33:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:33:57,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:34:01,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:34:03,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:34:03,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:34:04,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:34:07,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 05:34:08,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:34:08,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:08,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:34:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:12,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:14,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 05:34:16,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-10-04 05:34:16,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:34:16,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 05:34:16,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:34:17,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 05:34:20,214 INFO [train.py:1046] (1/4) Epoch 44, batch 3050, loss[loss=0.1434, simple_loss=0.2213, pruned_loss=0.03269, over 24424.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2357, pruned_loss=0.03735, over 4725775.20 frames. ], batch size: 58, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:34:20,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:34:21,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:34:23,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 05:34:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 05:34:23,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:34:24,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:34:24,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:25,896 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.70 vs. limit=15.0 2023-10-04 05:34:26,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:34:26,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:26,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:34:27,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 05:34:28,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1543146.6666666667, ans=0.125 2023-10-04 05:34:30,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:34:31,075 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.67 vs. limit=22.5 2023-10-04 05:34:31,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:31,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 05:34:34,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:36,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 05:34:42,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 05:34:43,549 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 05:34:43,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:34:45,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:34:49,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:50,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:50,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:34:54,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:34:54,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:34:54,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:34:54,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:54,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:34:56,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:57,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:01,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:35:01,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 05:35:03,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:35:03,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:35:05,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:35:06,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:35:06,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:35:07,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 05:35:08,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=1543346.6666666667, ans=0.02 2023-10-04 05:35:09,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1543346.6666666667, ans=0.0 2023-10-04 05:35:12,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:35:13,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:18,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:20,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:35:20,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:35:20,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:35:21,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:35:21,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 05:35:21,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1543413.3333333333, ans=0.1 2023-10-04 05:35:22,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 05:35:25,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:35:26,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:27,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 05:35:28,607 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.065e+02 2.224e+02 2.528e+02 3.449e+02, threshold=4.449e+02, percent-clipped=0.0 2023-10-04 05:35:28,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:32,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:33,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:35:33,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1543480.0, ans=0.1 2023-10-04 05:35:35,036 INFO [train.py:1046] (1/4) Epoch 44, batch 3100, loss[loss=0.1665, simple_loss=0.2485, pruned_loss=0.04223, over 23418.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2361, pruned_loss=0.03767, over 4720828.87 frames. 
], batch size: 106, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:35:35,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1543480.0, ans=0.125 2023-10-04 05:35:36,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:35:39,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 05:35:40,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 05:35:42,564 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.68 vs. limit=12.0 2023-10-04 05:35:43,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 05:35:45,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:35:45,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1543480.0, ans=0.125 2023-10-04 05:35:47,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:35:48,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:51,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-10-04 05:35:56,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:00,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 05:36:04,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:36:04,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:04,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:36:06,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:36:06,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 05:36:09,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:36:09,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 05:36:09,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:36:10,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:36:10,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1543613.3333333333, ans=0.125 2023-10-04 05:36:13,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 05:36:14,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:36:18,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:36:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 05:36:18,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 05:36:20,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:20,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:22,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:36:23,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:23,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 05:36:23,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1543680.0, ans=0.07 2023-10-04 05:36:24,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:36:24,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:36:25,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:36:27,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:36:27,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:27,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 05:36:31,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:36:32,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 05:36:35,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:36:35,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 05:36:37,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 05:36:37,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:37,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 05:36:45,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 05:36:47,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:36:47,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:49,280 INFO [train.py:1046] (1/4) Epoch 44, batch 3150, loss[loss=0.1535, simple_loss=0.2369, pruned_loss=0.03509, over 24378.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.0374, over 4716066.47 frames. ], batch size: 77, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:36:50,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:36:50,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:36:53,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 05:36:54,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:36:54,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. 
Duration: 0.636375 2023-10-04 05:36:55,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 05:36:57,661 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-10-04 05:36:58,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:00,062 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 05:37:00,538 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.11 vs. limit=15.0 2023-10-04 05:37:03,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 05:37:03,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:37:04,742 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 05:37:04,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 05:37:06,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 05:37:06,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 05:37:06,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. 
Duration: 0.81 2023-10-04 05:37:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:06,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:37:07,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:10,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 05:37:11,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:37:11,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:37:11,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:37:14,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:37:19,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 05:37:20,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:37:22,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:37:23,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 05:37:23,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 05:37:27,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 05:37:28,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:37:28,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 05:37:28,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:37:30,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:37:30,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:37:31,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:37:31,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:37:34,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 05:37:34,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:37:34,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:37:36,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:37:36,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:37:37,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 05:37:37,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:37:39,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 05:37:39,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:40,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 05:37:41,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 05:37:43,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:37:43,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:37:43,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1544013.3333333333, ans=0.125 2023-10-04 05:37:44,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. 
Duration: 0.91 2023-10-04 05:37:44,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 05:37:46,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:37:47,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:37:49,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:49,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:37:55,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:37:55,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:58,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.021e+02 2.313e+02 2.490e+02 3.781e+02, threshold=4.625e+02, percent-clipped=0.0 2023-10-04 05:37:58,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 05:38:02,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:38:02,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 05:38:03,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1544146.6666666667, ans=0.0 2023-10-04 05:38:03,686 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.97 vs. limit=15.0 2023-10-04 05:38:04,440 INFO [train.py:1046] (1/4) Epoch 44, batch 3200, loss[loss=0.1454, simple_loss=0.2288, pruned_loss=0.03094, over 19914.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2355, pruned_loss=0.03713, over 4714692.00 frames. ], batch size: 43, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:38:05,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:38:06,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:38:06,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 05:38:09,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:38:14,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:38:18,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:38:23,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1544213.3333333333, ans=0.1 2023-10-04 05:38:27,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1544213.3333333333, ans=0.2 2023-10-04 05:38:28,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:38:36,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 05:38:38,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:38:38,803 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.15 vs. limit=10.0 2023-10-04 05:38:39,178 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.54 vs. limit=6.0 2023-10-04 05:38:40,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 05:38:42,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:38:46,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:38:46,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 05:38:46,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:38:48,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1544346.6666666667, ans=0.125 2023-10-04 05:38:49,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 05:38:51,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 05:38:52,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 05:38:55,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 05:39:00,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:39:00,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1544346.6666666667, ans=0.04949747468305833 2023-10-04 05:39:04,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:04,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:39:04,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:04,729 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. 
Duration: 0.896 2023-10-04 05:39:04,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:39:09,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:39:10,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 05:39:10,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 05:39:12,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 05:39:14,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 05:39:15,300 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.50 vs. limit=10.0 2023-10-04 05:39:16,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:39:17,954 INFO [train.py:1046] (1/4) Epoch 44, batch 3250, loss[loss=0.1617, simple_loss=0.2391, pruned_loss=0.04212, over 23872.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2357, pruned_loss=0.03721, over 4724442.60 frames. ], batch size: 179, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:39:19,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1544480.0, ans=0.0 2023-10-04 05:39:20,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 05:39:20,824 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 05:39:20,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:39:20,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:22,370 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 05:39:26,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:39:29,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:39:36,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:39:36,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 05:39:38,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:39:38,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:38,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:39:39,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 05:39:39,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:39:42,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:44,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:39:44,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:45,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:45,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:45,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:39:46,008 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.46 vs. limit=15.0 2023-10-04 05:39:48,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:39:50,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:39:51,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:39:51,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:52,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:53,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:39:53,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:39:59,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 05:40:01,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:40:01,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:40:02,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:04,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:40:10,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:40:12,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1544680.0, ans=0.125 2023-10-04 05:40:16,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 05:40:16,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:16,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 05:40:16,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:40:16,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:40:18,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:19,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 05:40:21,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 05:40:21,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:40:23,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:23,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:40:24,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 05:40:24,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 05:40:25,884 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.026e+02 2.220e+02 2.491e+02 3.845e+02, threshold=4.441e+02, percent-clipped=0.0 2023-10-04 05:40:27,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:40:27,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:40:30,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 05:40:30,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:31,334 INFO [train.py:1046] (1/4) Epoch 44, batch 3300, loss[loss=0.1601, simple_loss=0.2457, pruned_loss=0.03722, over 24486.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2367, pruned_loss=0.03768, over 4712262.50 frames. ], batch size: 66, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:40:33,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:40:33,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 05:40:34,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:40:34,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 05:40:36,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. 
Duration: 0.84 2023-10-04 05:40:37,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 05:40:37,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:41,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:40:42,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:40:42,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:43,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:40:44,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:40:47,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:49,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:40:51,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1544880.0, ans=0.125 2023-10-04 05:40:54,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. 
Duration: 0.79 2023-10-04 05:40:54,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1544880.0, ans=0.0 2023-10-04 05:40:55,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:40:55,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:57,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 05:41:00,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:00,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:41:00,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:41:00,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:02,167 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 05:41:05,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:41:05,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 05:41:08,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:08,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 05:41:08,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 05:41:10,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:10,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:41:13,035 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 05:41:14,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 05:41:14,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:41:17,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 05:41:19,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:41:23,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:41:23,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 05:41:26,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:41:26,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:41:26,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:41:26,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:41:27,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:41:27,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:41:30,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 05:41:31,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 05:41:33,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:41:34,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:41:34,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 05:41:36,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:41:36,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:38,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:41:38,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:38,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:41:39,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:42,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:41:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 05:41:44,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:45,382 INFO [train.py:1046] (1/4) Epoch 44, batch 3350, loss[loss=0.1472, simple_loss=0.2262, pruned_loss=0.03412, over 23263.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2371, pruned_loss=0.03806, over 4716273.45 frames. 
], batch size: 119, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:41:46,126 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.04 vs. limit=15.0 2023-10-04 05:41:46,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:41:46,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:41:48,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:41:49,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:49,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:53,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:41:54,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:56,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:41:58,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:01,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 05:42:01,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:42:01,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:42:03,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 05:42:06,890 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 05:42:06,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:42:08,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1545213.3333333333, ans=0.1 2023-10-04 05:42:09,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 05:42:09,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 05:42:11,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:42:11,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:42:12,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:12,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. 
Duration: 0.81 2023-10-04 05:42:13,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:13,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:42:15,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:17,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:17,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:19,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:42:21,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:23,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:25,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:27,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1545280.0, ans=0.1 2023-10-04 05:42:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:42:29,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:30,803 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.74 vs. limit=22.5 2023-10-04 05:42:31,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:32,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:35,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:37,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 05:42:37,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:42:39,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 05:42:39,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:42:39,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 05:42:40,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:42,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 05:42:42,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1545346.6666666667, ans=0.0 2023-10-04 05:42:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:47,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 05:42:48,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:42:50,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:42:51,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:42:52,934 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.103e+02 2.289e+02 2.774e+02 3.662e+02, threshold=4.578e+02, percent-clipped=0.0 2023-10-04 05:42:56,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:42:59,553 INFO [train.py:1046] (1/4) Epoch 44, batch 3400, loss[loss=0.1381, simple_loss=0.2149, pruned_loss=0.03067, over 24312.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03801, over 4713628.29 frames. ], batch size: 56, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:42:59,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 05:42:59,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 05:43:00,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:43:01,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:01,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.02 vs. limit=15.0 2023-10-04 05:43:02,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 05:43:02,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:43:02,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 05:43:02,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1545480.0, ans=0.125 2023-10-04 05:43:03,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:43:05,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:43:07,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:43:07,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:43:07,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. 
Duration: 0.69 2023-10-04 05:43:10,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 05:43:10,498 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 05:43:10,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:14,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:43:14,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:43:14,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:14,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1545546.6666666667, ans=0.125 2023-10-04 05:43:14,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1545546.6666666667, ans=0.125 2023-10-04 05:43:17,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:43:20,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1545546.6666666667, ans=0.2 2023-10-04 05:43:21,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:43:21,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1545546.6666666667, ans=0.04949747468305833 2023-10-04 05:43:22,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 05:43:27,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:43:29,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:31,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:31,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:43:37,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:43:40,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 05:43:41,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1545613.3333333333, ans=0.1 2023-10-04 05:43:43,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1545680.0, ans=0.125 2023-10-04 05:43:47,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:43:47,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:48,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 05:43:48,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:43:48,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:49,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:43:49,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:43:52,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:52,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1545680.0, ans=0.2 2023-10-04 05:43:56,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:43:56,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 05:43:56,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1545680.0, ans=0.125 2023-10-04 05:43:59,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=1545746.6666666667, ans=0.05 2023-10-04 05:44:02,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:44:03,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 05:44:09,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:44:13,864 INFO [train.py:1046] (1/4) Epoch 44, batch 3450, loss[loss=0.1621, simple_loss=0.2463, pruned_loss=0.03891, over 24388.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2366, pruned_loss=0.03795, over 4714771.68 frames. ], batch size: 77, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:44:15,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 05:44:18,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 05:44:18,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:44:20,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:44:20,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-10-04 05:44:22,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:44:25,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:44:29,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:44:29,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:44:31,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:44:33,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:44:34,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:44:41,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 05:44:45,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 05:44:45,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:44:46,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:44:48,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:44:53,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 05:44:53,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:44:56,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:44:58,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:44:58,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:44:59,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:45:01,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 05:45:01,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:45:01,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:45:05,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:45:06,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-10-04 05:45:06,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1546013.3333333333, ans=0.95 2023-10-04 05:45:12,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:45:16,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:45:18,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:19,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1546080.0, ans=0.125 2023-10-04 05:45:20,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:22,081 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.019e+02 2.264e+02 2.676e+02 4.035e+02, threshold=4.528e+02, percent-clipped=0.0 2023-10-04 05:45:23,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:23,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:45:23,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:45:25,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:45:28,337 INFO [train.py:1046] (1/4) Epoch 44, batch 3500, loss[loss=0.15, simple_loss=0.2381, pruned_loss=0.03091, over 24471.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2351, pruned_loss=0.03741, over 4708480.97 frames. ], batch size: 66, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:45:29,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:33,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:45:34,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 05:45:36,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:45:39,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 05:45:41,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:41,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 05:45:46,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:45:46,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:45:47,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 05:45:47,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:45:47,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:45:47,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:47,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:45:49,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 05:45:52,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:52,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:45:55,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:45:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:59,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1546280.0, ans=0.07 2023-10-04 05:46:00,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 05:46:00,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 05:46:04,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:46:05,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:46:06,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:08,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:46:08,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:46:10,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 05:46:13,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 05:46:13,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 05:46:14,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:46:16,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:17,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:46:17,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 05:46:19,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:46:20,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:46:24,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:46:25,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 05:46:25,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 05:46:25,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:46:27,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:46:28,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1546413.3333333333, ans=0.035 2023-10-04 05:46:29,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:46:31,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:33,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 05:46:34,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 05:46:36,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:46:38,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 05:46:41,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 05:46:43,393 INFO [train.py:1046] (1/4) Epoch 44, batch 3550, loss[loss=0.1748, simple_loss=0.2462, pruned_loss=0.05176, over 23754.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2341, pruned_loss=0.03725, over 4715038.39 frames. ], batch size: 164, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:46:43,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:43,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:46:43,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:46:44,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:46:47,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:46:56,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:46:56,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-04 05:46:59,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:47:01,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:47:03,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:03,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:47:03,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:47:06,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:47:06,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:47:08,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:47:08,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:47:08,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:47:16,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:47:16,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 05:47:17,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:47:17,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:47:19,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:47:19,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 05:47:19,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:19,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1546613.3333333333, ans=0.0 2023-10-04 05:47:21,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:22,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:47:28,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1546680.0, ans=0.125 2023-10-04 05:47:29,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:47:29,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:47:31,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:47:33,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 05:47:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:47:34,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 05:47:36,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:47:38,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:47:38,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:47:39,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1546680.0, ans=0.125 2023-10-04 05:47:40,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 05:47:42,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:47:47,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:47:49,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. 
Duration: 0.89 2023-10-04 05:47:49,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1546746.6666666667, ans=0.0 2023-10-04 05:47:50,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:47:54,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:55,575 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.057e+02 2.396e+02 2.886e+02 4.470e+02, threshold=4.792e+02, percent-clipped=0.0 2023-10-04 05:47:55,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 05:47:59,998 INFO [train.py:1046] (1/4) Epoch 44, batch 3600, loss[loss=0.1522, simple_loss=0.2282, pruned_loss=0.03807, over 19404.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2342, pruned_loss=0.03688, over 4708450.48 frames. ], batch size: 42, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:48:02,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 05:48:02,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:48:02,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 05:48:02,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1546813.3333333333, ans=0.125 2023-10-04 05:48:03,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1546813.3333333333, ans=0.0 2023-10-04 05:48:04,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:48:04,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:48:06,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:48:09,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:48:11,161 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.10 vs. limit=6.0 2023-10-04 05:48:11,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:11,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:48:13,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:48:15,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:48:15,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 05:48:17,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:48:19,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1546880.0, ans=0.125 2023-10-04 05:48:20,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:20,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1546880.0, ans=0.95 2023-10-04 05:48:20,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.70 vs. limit=15.0 2023-10-04 05:48:22,321 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.45 vs. limit=15.0 2023-10-04 05:48:23,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:48:25,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:48:27,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:48:27,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 05:48:27,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 05:48:28,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:48:31,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:31,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:48:34,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:48:34,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:48:36,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:48:36,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 05:48:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:48:46,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:48:48,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 05:48:52,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 05:48:58,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:01,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:04,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1547080.0, ans=0.125 2023-10-04 05:49:05,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:49:05,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:49:05,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 05:49:07,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 05:49:08,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 05:49:09,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:49:09,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:49:11,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 05:49:12,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 05:49:12,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:49:12,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:49:13,948 INFO [train.py:1046] (1/4) Epoch 44, batch 3650, loss[loss=0.1553, simple_loss=0.2326, pruned_loss=0.03901, over 23486.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.03724, over 4707228.61 frames. ], batch size: 285, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:49:14,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 05:49:14,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 05:49:17,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:19,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 05:49:19,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1547146.6666666667, ans=0.2 2023-10-04 05:49:23,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 05:49:23,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 05:49:25,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1547146.6666666667, ans=0.0 2023-10-04 05:49:26,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 05:49:28,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 05:49:31,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:49:31,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:49:31,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:49:34,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:49:35,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:49:35,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 05:49:37,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:49:37,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:49:37,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-10-04 05:49:38,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:49:38,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:49:38,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:49:41,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:49:44,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 05:49:45,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 05:49:46,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:49:49,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 05:49:50,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:49:50,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:49:53,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1547280.0, ans=0.0 2023-10-04 05:49:57,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 05:49:59,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:49:59,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1547346.6666666667, ans=0.0 2023-10-04 05:50:00,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:50:01,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:50:02,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:50:04,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:50:08,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:50:09,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:09,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:50:11,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 05:50:11,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1547346.6666666667, ans=0.125 2023-10-04 05:50:12,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:50:12,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:50:17,435 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 05:50:20,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:50:20,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:50:22,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:50:22,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:23,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:50:24,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:26,099 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 1.956e+02 2.107e+02 2.308e+02 3.469e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-04 05:50:26,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-10-04 05:50:26,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:27,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:50:29,001 INFO [train.py:1046] (1/4) Epoch 44, batch 3700, loss[loss=0.1918, simple_loss=0.259, pruned_loss=0.06224, over 19064.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2356, pruned_loss=0.03771, over 4696285.21 frames. ], batch size: 388, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:50:30,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:50:31,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:50:34,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:34,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 05:50:36,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:36,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 05:50:37,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:50:39,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 05:50:42,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:50:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:50:43,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:50:44,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:46,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:50:48,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:50:48,260 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 05:50:56,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:50:57,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:50:58,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:51:00,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 05:51:00,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 05:51:00,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1547613.3333333333, ans=0.0 2023-10-04 05:51:03,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:04,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 05:51:06,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:07,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:51:10,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:11,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:51:13,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 05:51:13,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1547680.0, ans=0.125 2023-10-04 05:51:16,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:51:16,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 05:51:17,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 05:51:17,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 05:51:22,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:51:23,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:51:25,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:51:26,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 05:51:27,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1547746.6666666667, ans=0.125 2023-10-04 05:51:28,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:51:28,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:51:28,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:51:29,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:51:31,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:51:33,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. 
Duration: 0.93 2023-10-04 05:51:33,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 05:51:34,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:51:34,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:35,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:51:37,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:51:39,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:40,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:51:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:51:43,118 INFO [train.py:1046] (1/4) Epoch 44, batch 3750, loss[loss=0.1493, simple_loss=0.2319, pruned_loss=0.03335, over 23397.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2368, pruned_loss=0.03786, over 4710690.65 frames. ], batch size: 119, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:51:43,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1547813.3333333333, ans=0.125 2023-10-04 05:51:44,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. 
Duration: 0.89 2023-10-04 05:51:44,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1547813.3333333333, ans=0.0 2023-10-04 05:51:45,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 05:51:50,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:51:50,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 05:51:51,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:51:53,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:53,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:56,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:51:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:52:00,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.41 vs. limit=22.5 2023-10-04 05:52:02,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:52:03,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 05:52:06,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:52:11,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:52:11,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 05:52:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:52:12,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:52:12,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:52:15,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 05:52:19,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 05:52:22,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:52:22,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:52:25,037 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.36 vs. limit=15.0 2023-10-04 05:52:25,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:52:29,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:52:30,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:52:34,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 05:52:37,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:52:38,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:52:40,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:52:43,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:52:45,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:52:48,172 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-10-04 05:52:48,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:52:50,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 05:52:51,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:52:54,520 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.028e+02 2.230e+02 2.527e+02 3.718e+02, threshold=4.460e+02, percent-clipped=0.0 2023-10-04 05:52:54,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:52:57,960 INFO [train.py:1046] (1/4) Epoch 44, batch 3800, loss[loss=0.1603, simple_loss=0.2385, pruned_loss=0.04099, over 14551.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2365, pruned_loss=0.0379, over 4709454.58 frames. ], batch size: 31, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:53:02,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:53:06,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:06,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:53:08,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 05:53:09,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:53:10,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:12,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 05:53:14,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 05:53:14,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:15,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:53:17,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:53:17,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:53:18,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:18,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 05:53:21,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 05:53:21,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:53:24,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:26,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1548280.0, ans=0.2 2023-10-04 05:53:27,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 05:53:28,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:53:30,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:53:30,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:31,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:31,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:34,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:53:34,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 05:53:37,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:53:39,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1548280.0, ans=0.125 2023-10-04 05:53:44,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:53:48,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:53:50,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. 
Duration: 0.75 2023-10-04 05:53:54,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 05:53:55,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:55,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:53:56,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:58,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 05:54:00,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1548413.3333333333, ans=0.125 2023-10-04 05:54:02,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 05:54:02,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 05:54:03,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:04,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:54:09,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:54:11,776 INFO [train.py:1046] (1/4) Epoch 44, batch 3850, loss[loss=0.1519, simple_loss=0.2316, pruned_loss=0.03606, over 23434.00 frames. 
], tot_loss[loss=0.1557, simple_loss=0.236, pruned_loss=0.0377, over 4709611.42 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:54:11,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:54:12,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1548480.0, ans=0.125 2023-10-04 05:54:17,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:54:17,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 05:54:18,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:54:20,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:23,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:54:26,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:54:26,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1548546.6666666667, ans=0.125 2023-10-04 05:54:27,300 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.43 vs. limit=15.0 2023-10-04 05:54:28,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 05:54:29,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 05:54:35,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:36,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:39,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:54:39,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:54:43,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:43,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:54:43,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:54:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:54:44,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:54:45,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:54:47,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 05:54:47,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:54:47,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 05:54:48,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 05:54:48,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:54:48,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:52,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:54:53,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:54,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 05:54:55,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 05:54:57,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:54:59,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 05:55:00,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 05:55:05,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:06,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:55:12,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:12,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 05:55:15,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 05:55:16,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:16,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:19,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:55:19,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:55:19,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:21,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:21,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 05:55:21,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 05:55:21,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:55:21,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1548746.6666666667, ans=0.07 2023-10-04 05:55:22,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.962e+02 2.115e+02 2.398e+02 3.315e+02, threshold=4.231e+02, percent-clipped=0.0 2023-10-04 05:55:23,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 05:55:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:24,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:25,734 INFO [train.py:1046] (1/4) Epoch 44, batch 3900, loss[loss=0.1491, simple_loss=0.2283, pruned_loss=0.03494, over 24576.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2343, pruned_loss=0.03716, over 4706480.10 frames. ], batch size: 60, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:55:25,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:55:25,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:27,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 05:55:28,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:28,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:30,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:55:30,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 05:55:31,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:31,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1548813.3333333333, ans=0.125 2023-10-04 05:55:33,726 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.79 vs. limit=10.0 2023-10-04 05:55:36,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:55:36,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:55:36,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:55:37,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 05:55:39,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:55:40,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:40,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:55:42,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 05:55:42,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:55:43,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 05:55:45,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:45,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 05:55:47,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 05:55:50,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:55:52,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:55:52,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 05:55:53,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:55:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:55:58,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:56:01,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:56:01,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:56:03,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:56:09,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:56:09,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:56:11,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1549013.3333333333, ans=0.1 2023-10-04 05:56:15,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:56:16,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 05:56:25,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:56:30,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:56:30,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 05:56:30,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 05:56:31,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:56:33,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 05:56:34,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:56:36,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 05:56:41,070 INFO [train.py:1046] (1/4) Epoch 44, batch 3950, loss[loss=0.1404, simple_loss=0.2178, pruned_loss=0.03143, over 14246.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03668, over 4712674.18 frames. ], batch size: 30, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:56:41,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 05:56:41,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1549146.6666666667, ans=0.0 2023-10-04 05:56:43,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 05:56:43,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:56:45,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:56:48,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:56:52,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 05:56:53,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:56:53,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 05:56:54,043 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 05:56:55,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:56:58,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:56:58,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 05:56:58,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:57:03,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 05:57:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:57:06,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:57:06,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:57:06,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1549213.3333333333, ans=0.125 2023-10-04 05:57:07,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:57:07,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:57:12,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1549280.0, ans=0.125 2023-10-04 05:57:17,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:57:17,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:57:21,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-10-04 05:57:28,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 05:57:28,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 05:57:29,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:57:30,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:57:36,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:57:36,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:57:36,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:57:36,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:57:38,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 05:57:43,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:57:43,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1549413.3333333333, ans=0.125 2023-10-04 05:57:44,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 05:57:49,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 05:57:53,309 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.935e+02 2.171e+02 2.462e+02 3.454e+02, threshold=4.341e+02, percent-clipped=0.0 2023-10-04 05:57:54,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1549480.0, ans=0.125 2023-10-04 05:57:56,213 INFO [train.py:1046] (1/4) Epoch 44, batch 4000, loss[loss=0.1619, simple_loss=0.2373, pruned_loss=0.0433, over 23746.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.235, pruned_loss=0.03686, over 4722084.91 frames. ], batch size: 164, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:57:58,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:06,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:10,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1549546.6666666667, ans=0.1 2023-10-04 05:58:11,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:11,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:58:13,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:13,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. 
Duration: 0.93 2023-10-04 05:58:14,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:58:15,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 05:58:15,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:58:15,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 05:58:17,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:20,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:58:20,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:58:20,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:58:20,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:58:20,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:58:23,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:58:23,360 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. 
Duration: 0.896 2023-10-04 05:58:24,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:58:26,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:27,412 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 05:58:27,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1549613.3333333333, ans=0.0 2023-10-04 05:58:28,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:58:28,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:58:29,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1549613.3333333333, ans=0.0 2023-10-04 05:58:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 05:58:35,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:58:38,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1549613.3333333333, ans=0.125 2023-10-04 05:58:39,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:58:39,315 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. 
Duration: 0.768 2023-10-04 05:58:42,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:58:42,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 05:58:42,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:58:43,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:43,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:58:45,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:58:45,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:58:45,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:58:48,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 05:58:48,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:51,326 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 05:58:55,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 05:58:56,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 05:58:58,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:58:58,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1549746.6666666667, ans=0.025 2023-10-04 05:58:59,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:59,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:59:01,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:07,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:59:10,864 INFO [train.py:1046] (1/4) Epoch 44, batch 4050, loss[loss=0.1731, simple_loss=0.2418, pruned_loss=0.05221, over 23809.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2353, pruned_loss=0.03664, over 4736016.97 frames. ], batch size: 179, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:59:10,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 05:59:12,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 05:59:15,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 05:59:15,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:16,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:59:17,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:59:19,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:59:22,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:59:25,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:59:25,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:59:28,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:59:29,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:59:31,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1549880.0, ans=0.125 2023-10-04 05:59:32,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 05:59:32,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1549880.0, ans=0.0 2023-10-04 05:59:33,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:59:35,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1549880.0, ans=0.0 2023-10-04 05:59:37,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 05:59:39,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 05:59:39,785 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 05:59:42,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:59:47,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 05:59:47,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:59:51,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:54,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:56,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 05:59:56,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:59,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:00:03,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 06:00:04,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:00:05,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:00:07,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 06:00:10,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:00:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 06:00:16,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:00:16,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:00:20,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-10-04 06:00:21,023 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.954e+02 2.228e+02 2.538e+02 4.386e+02, threshold=4.457e+02, percent-clipped=1.0 2023-10-04 06:00:21,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 06:00:21,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:23,767 INFO [train.py:1046] (1/4) Epoch 44, batch 4100, loss[loss=0.1995, simple_loss=0.2748, pruned_loss=0.06206, over 19246.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2365, pruned_loss=0.03711, over 4738654.50 frames. ], batch size: 388, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:00:23,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:00:23,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:24,164 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:00:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:00:28,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1550146.6666666667, ans=0.125 2023-10-04 06:00:32,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 06:00:32,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. 
Duration: 0.9 2023-10-04 06:00:34,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 06:00:35,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 06:00:35,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:35,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:35,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:36,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:00:37,013 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 06:00:40,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:00:41,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:00:42,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:43,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:00:48,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 06:00:49,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:00:49,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:00:49,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 06:00:50,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:50,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:00:51,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1550213.3333333333, ans=0.0 2023-10-04 06:00:52,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:00:52,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:00:53,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 06:00:55,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:00:57,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 06:00:58,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:00:59,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:00:59,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 06:01:02,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:01:03,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:01:03,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:01:05,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 06:01:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:01:06,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:01:08,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 06:01:08,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=15.0 2023-10-04 06:01:09,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:01:09,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 06:01:12,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:01:17,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:17,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1550346.6666666667, ans=0.0 2023-10-04 06:01:19,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:01:20,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:01:22,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1550413.3333333333, ans=0.1 2023-10-04 06:01:28,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:01:28,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:01:30,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:01:33,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:01:37,888 INFO [train.py:1046] (1/4) Epoch 44, batch 4150, loss[loss=0.153, simple_loss=0.2225, pruned_loss=0.04178, over 23729.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2364, pruned_loss=0.03705, over 4730178.48 frames. 
], batch size: 164, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:01:38,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:01:39,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:01:40,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:01:40,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:01:44,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 06:01:45,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:45,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 06:01:45,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 06:01:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 06:01:45,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1550480.0, ans=0.125 2023-10-04 06:01:47,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:53,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 06:01:53,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:01:57,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:01:59,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:01:59,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1550546.6666666667, ans=0.125 2023-10-04 06:02:00,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:02:02,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:02:02,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:02:02,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:02:06,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:02:09,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:02:10,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. 
Duration: 0.99 2023-10-04 06:02:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 06:02:13,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:02:15,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 06:02:15,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:02:15,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:02:18,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:18,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:02:23,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 06:02:27,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:02:27,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:02:28,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 06:02:28,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 06:02:30,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 06:02:30,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:02:32,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:02:34,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:36,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 06:02:36,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:02:36,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:02:37,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:02:40,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 06:02:40,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:40,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:02:40,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-04 06:02:42,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 06:02:42,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:02:43,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 06:02:44,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1550746.6666666667, ans=0.2 2023-10-04 06:02:45,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:02:47,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:48,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 06:02:48,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:02:49,483 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.733e+02 2.108e+02 2.384e+02 3.215e+02 5.967e+02, threshold=4.767e+02, percent-clipped=2.0 2023-10-04 06:02:52,739 INFO [train.py:1046] (1/4) Epoch 44, batch 4200, loss[loss=0.1525, simple_loss=0.2408, pruned_loss=0.03205, over 24321.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2353, pruned_loss=0.03689, over 4730025.70 frames. ], batch size: 74, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:02:52,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 06:02:54,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 06:02:54,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1550813.3333333333, ans=0.1 2023-10-04 06:02:55,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:02:57,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:02:58,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:02:59,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:02:59,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:03:03,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 06:03:05,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 06:03:05,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:06,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1550880.0, ans=0.125 2023-10-04 06:03:08,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 06:03:10,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:03:13,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 06:03:14,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:03:16,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:16,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 06:03:16,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:03:17,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1550880.0, ans=0.125 2023-10-04 06:03:18,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:18,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:03:18,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:03:18,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1550880.0, ans=0.0 2023-10-04 06:03:20,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 06:03:23,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 06:03:23,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:26,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:03:26,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:03:27,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1550946.6666666667, ans=0.0 2023-10-04 06:03:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:03:28,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:03:32,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:03:32,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 06:03:32,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:03:33,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:03:39,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 06:03:39,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:03:46,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:03:48,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 06:03:50,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:03:54,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:03:55,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:03:58,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 06:04:03,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:04:06,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:04:06,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:04:07,446 INFO [train.py:1046] (1/4) Epoch 44, batch 4250, loss[loss=0.158, simple_loss=0.248, pruned_loss=0.03396, over 24326.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.234, pruned_loss=0.03678, over 4719032.10 frames. 
], batch size: 74, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:04:08,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:12,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:04:14,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 06:04:14,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:04:17,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:17,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1551146.6666666667, ans=0.125 2023-10-04 06:04:20,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:04:24,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:24,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:24,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1551213.3333333333, ans=0.1 2023-10-04 06:04:25,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 06:04:25,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:04:27,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:28,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:28,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:31,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:04:33,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:04:34,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 06:04:39,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 06:04:39,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:40,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:04:42,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:43,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 06:04:43,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:43,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:48,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:04:49,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:04:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:04:54,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:04:54,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 06:04:55,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:04:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 06:04:56,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:04:58,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:04:59,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:04:59,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:05:02,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 06:05:04,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:05:05,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:05:07,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:05:09,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:05:10,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:05:11,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1551413.3333333333, ans=0.0 2023-10-04 06:05:12,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:05:13,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:05:14,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:05:16,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:05:16,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 06:05:18,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:05:19,446 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 1.987e+02 2.246e+02 2.555e+02 4.795e+02, threshold=4.492e+02, percent-clipped=1.0 2023-10-04 06:05:22,554 INFO [train.py:1046] (1/4) Epoch 44, batch 4300, loss[loss=0.1723, simple_loss=0.256, pruned_loss=0.04426, over 23959.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2339, pruned_loss=0.03676, over 4715328.95 frames. ], batch size: 86, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:05:22,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:05:23,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:05:26,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:05:26,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1551480.0, ans=0.0 2023-10-04 06:05:33,235 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.65 vs. limit=15.0 2023-10-04 06:05:34,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:05:34,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 06:05:35,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:05:37,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:05:37,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:05:38,903 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 06:05:40,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:05:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:05:42,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.63 vs. limit=15.0 2023-10-04 06:05:44,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 06:05:44,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:05:44,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 06:05:49,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-04 06:05:51,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:05:53,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:05:53,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:05:54,103 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:05:55,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:05:56,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:05:58,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:05:58,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 06:05:59,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 06:06:00,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:06:03,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:03,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. 
Duration: 0.6666875 2023-10-04 06:06:03,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:03,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1551613.3333333333, ans=0.125 2023-10-04 06:06:04,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:06:04,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 06:06:04,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 06:06:06,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 06:06:08,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:06:08,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 06:06:09,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 06:06:15,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:06:16,711 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 06:06:16,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 06:06:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:18,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:06:19,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 06:06:21,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:06:21,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:21,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:06:21,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:06:23,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:06:24,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:06:26,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:27,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:27,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 06:06:27,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1551746.6666666667, ans=0.2 2023-10-04 06:06:34,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 06:06:34,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:06:35,918 INFO [train.py:1046] (1/4) Epoch 44, batch 4350, loss[loss=0.1582, simple_loss=0.2366, pruned_loss=0.0399, over 23281.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.03709, over 4720351.50 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:06:38,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:06:40,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:44,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:06:44,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:06:50,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:06:53,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:57,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 06:06:57,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:06:59,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:07:00,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:07:00,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:07:06,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 06:07:08,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:07:08,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:14,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:16,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 06:07:17,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.52 vs. limit=15.0 2023-10-04 06:07:18,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:20,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-04 06:07:24,421 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 06:07:26,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:07:26,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:07:27,697 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 06:07:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 06:07:27,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:07:29,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:07:30,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:07:30,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:07:31,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:07:31,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:07:34,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. 
Duration: 0.94 2023-10-04 06:07:34,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:34,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:36,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 06:07:37,508 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 06:07:37,522 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 06:07:37,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 06:07:41,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:07:41,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:07:41,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:07:43,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:07:44,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. 
Duration: 0.82 2023-10-04 06:07:46,795 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 06:07:46,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:47,973 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.987e+02 2.204e+02 2.500e+02 5.176e+02, threshold=4.408e+02, percent-clipped=1.0 2023-10-04 06:07:50,785 INFO [train.py:1046] (1/4) Epoch 44, batch 4400, loss[loss=0.1616, simple_loss=0.2499, pruned_loss=0.03664, over 23412.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2363, pruned_loss=0.03727, over 4735593.61 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:07:52,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:07:52,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:54,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:55,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 06:07:56,288 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.40 vs. limit=10.0 2023-10-04 06:07:56,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 06:07:56,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. 
Duration: 0.8 2023-10-04 06:07:58,155 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 06:07:59,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:07:59,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:08:02,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 06:08:05,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:05,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:05,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 06:08:09,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:09,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 06:08:09,443 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 06:08:14,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 06:08:14,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 06:08:15,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. 
Duration: 0.96 2023-10-04 06:08:15,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:15,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:08:16,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:08:16,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:08:18,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 06:08:18,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 06:08:18,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1552280.0, ans=0.0 2023-10-04 06:08:19,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:21,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:08:21,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:22,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:24,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:08:24,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 06:08:26,230 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 06:08:27,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1552280.0, ans=0.0 2023-10-04 06:08:29,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:34,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1552346.6666666667, ans=0.0 2023-10-04 06:08:35,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:08:38,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 06:08:39,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1552346.6666666667, ans=0.125 2023-10-04 06:08:40,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:08:43,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:08:45,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:08:46,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. 
Duration: 0.84 2023-10-04 06:08:46,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:08:46,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:08:46,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:08:48,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:08:52,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 06:08:55,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 06:08:57,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 06:08:57,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:57,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 06:08:57,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1552413.3333333333, ans=0.125 2023-10-04 06:08:59,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:09:01,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 06:09:03,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 06:09:05,155 INFO [train.py:1046] (1/4) Epoch 44, batch 4450, loss[loss=0.1698, simple_loss=0.2482, pruned_loss=0.04574, over 23391.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.237, pruned_loss=0.03735, over 4737899.96 frames. ], batch size: 285, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:09:06,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:09:09,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:09,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:09:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:15,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1552480.0, ans=0.2 2023-10-04 06:09:16,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:09:18,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:19,015 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.92 vs. limit=15.0 2023-10-04 06:09:20,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 06:09:23,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:09:25,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:09:25,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 06:09:25,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:09:25,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:27,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:09:27,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:09:29,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:09:35,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:36,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:38,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:09:39,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:09:41,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:09:45,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 06:09:47,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 06:09:47,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 06:09:47,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:09:50,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:50,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 06:09:50,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1552680.0, ans=0.1 2023-10-04 06:09:53,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:09:56,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:57,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 06:09:57,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 06:09:57,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:09:57,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:09:57,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:59,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:10:01,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:10:01,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 06:10:03,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:10:06,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:10:08,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:10:09,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 06:10:09,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1552746.6666666667, ans=0.05 2023-10-04 06:10:10,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:10:12,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:10:14,822 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.948e+02 2.234e+02 2.482e+02 3.571e+02, threshold=4.467e+02, percent-clipped=0.0 2023-10-04 06:10:14,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 06:10:16,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:10:18,169 INFO [train.py:1046] (1/4) Epoch 44, batch 4500, loss[loss=0.1497, simple_loss=0.2227, pruned_loss=0.0383, over 23796.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2368, pruned_loss=0.03735, over 4735232.83 frames. ], batch size: 195, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:10:22,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:10:22,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 06:10:22,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 06:10:25,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:10:27,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1552813.3333333333, ans=0.1 2023-10-04 06:10:30,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:10:30,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:10:31,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:10:31,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:10:32,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:10:33,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:10:37,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1552880.0, ans=0.0 2023-10-04 06:10:45,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1552880.0, ans=0.025 2023-10-04 06:10:46,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:10:46,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:10:47,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:10:47,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1552946.6666666667, ans=0.125 2023-10-04 06:10:49,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:10:49,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:10:55,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:11:00,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:11:04,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:11:05,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:11:05,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 06:11:07,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:07,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 06:11:09,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:09,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:11:13,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:11:13,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 06:11:13,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:11:13,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:13,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1553013.3333333333, ans=0.125 2023-10-04 06:11:15,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.43 vs. limit=15.0 2023-10-04 06:11:17,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:11:18,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:11:20,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:11:23,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:11:23,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:11:24,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 06:11:26,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 06:11:26,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 06:11:28,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1553080.0, ans=0.125 2023-10-04 06:11:30,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 06:11:32,273 INFO [train.py:1046] (1/4) Epoch 44, batch 4550, loss[loss=0.142, simple_loss=0.2047, pruned_loss=0.03961, over 22769.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2357, pruned_loss=0.03754, over 4716705.81 frames. ], batch size: 322, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:11:32,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 06:11:33,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:11:36,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:11:36,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:11:40,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:11:45,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:11:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:48,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:11:48,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:11:48,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:49,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:11:51,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:11:55,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:11:58,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 06:11:58,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-10-04 06:11:59,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:12:00,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 06:12:03,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 06:12:04,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:12:07,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 06:12:08,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1553280.0, ans=0.125 2023-10-04 06:12:09,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:12:12,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:12,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:12,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.whiten.whitening_limit, batch_count=1553280.0, ans=12.0 2023-10-04 06:12:13,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:12:15,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. 
Duration: 0.77 2023-10-04 06:12:16,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.64 vs. limit=15.0 2023-10-04 06:12:17,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:12:19,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:21,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:12:22,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:12:22,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 06:12:24,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 06:12:24,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:12:26,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 06:12:26,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 06:12:27,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:12:27,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:12:27,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:12:30,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:30,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:12:32,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:12:33,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 06:12:34,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:12:34,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 06:12:34,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 06:12:34,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:12:36,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 06:12:37,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.99 vs. 
limit=6.0 2023-10-04 06:12:37,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:12:37,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:12:39,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1553413.3333333333, ans=0.0 2023-10-04 06:12:40,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:12:40,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:40,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:12:42,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:12:43,831 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.968e+02 2.214e+02 2.646e+02 3.245e+02, threshold=4.428e+02, percent-clipped=0.0 2023-10-04 06:12:43,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:12:45,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:12:46,540 INFO [train.py:1046] (1/4) Epoch 44, batch 4600, loss[loss=0.1561, simple_loss=0.2303, pruned_loss=0.04095, over 23389.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2332, pruned_loss=0.03708, over 4697414.64 frames. 
], batch size: 105, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:12:46,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:12:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:12:50,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:12:50,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:12:52,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 06:12:53,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:12:54,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1553480.0, ans=0.0 2023-10-04 06:12:57,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:12:57,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1553480.0, ans=0.125 2023-10-04 06:12:58,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:01,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:13:04,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1553546.6666666667, ans=0.125 2023-10-04 06:13:07,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 06:13:08,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:12,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:14,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1553613.3333333333, ans=0.05 2023-10-04 06:13:15,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:13:15,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:18,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 06:13:18,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:13:20,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:13:26,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:13:26,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:13:28,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:13:32,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 06:13:32,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:13:36,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:38,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:13:40,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:40,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 06:13:40,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:40,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1553680.0, ans=0.125 2023-10-04 06:13:42,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 06:13:42,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:13:42,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:43,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:45,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:46,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 06:13:48,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 06:13:48,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 06:13:48,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:13:49,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:13:49,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:13:51,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:13:54,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1553746.6666666667, ans=0.125 2023-10-04 06:14:00,586 INFO [train.py:1046] (1/4) Epoch 44, batch 4650, loss[loss=0.1628, simple_loss=0.2508, pruned_loss=0.03745, over 23981.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2335, pruned_loss=0.03694, over 4698112.29 frames. ], batch size: 80, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:14:00,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:14:02,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:14:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:14:03,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:14:03,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:14:03,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:14:04,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:14:07,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 06:14:10,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 06:14:12,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 06:14:12,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:14:13,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 06:14:13,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:14:13,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 06:14:14,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 06:14:14,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:16,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:14:19,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:14:20,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:20,447 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 06:14:24,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:14:24,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1553880.0, ans=0.0 2023-10-04 06:14:25,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 06:14:27,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:27,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:14:28,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 06:14:30,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:14:33,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:14:33,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1553946.6666666667, ans=0.125 2023-10-04 06:14:37,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:14:41,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1553946.6666666667, ans=0.125 2023-10-04 06:14:43,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:45,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:14:45,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:46,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:14:47,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 06:14:49,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 06:14:49,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 06:14:49,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 06:14:51,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:14:55,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1554013.3333333333, ans=0.125 2023-10-04 06:15:00,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:15:00,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:00,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 06:15:01,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 06:15:03,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:03,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:15:04,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:15:07,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:15:07,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:07,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:15:11,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:11,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:15:12,920 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.072e+02 2.400e+02 3.094e+02 4.124e+02, threshold=4.800e+02, percent-clipped=0.0 2023-10-04 06:15:12,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:15:13,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. 
Duration: 0.411125 2023-10-04 06:15:14,307 INFO [train.py:1046] (1/4) Epoch 44, batch 4700, loss[loss=0.1465, simple_loss=0.2194, pruned_loss=0.03685, over 23505.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2336, pruned_loss=0.0368, over 4719240.87 frames. ], batch size: 256, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:15:14,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:15:14,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 06:15:22,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:22,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:22,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:15:24,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:26,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:15:29,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=1554213.3333333333, ans=0.05 2023-10-04 06:15:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 06:15:31,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-10-04 06:15:33,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:35,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:15:36,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:15:37,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:43,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:15:44,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:15:45,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1554280.0, ans=0.125 2023-10-04 06:15:46,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:49,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1554280.0, ans=0.0 2023-10-04 06:15:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 06:15:54,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:15:58,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:16:00,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 06:16:02,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.65 vs. limit=15.0 2023-10-04 06:16:03,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:08,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:16:08,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 06:16:08,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1554346.6666666667, ans=0.1 2023-10-04 06:16:09,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:09,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:11,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:16:11,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:16:11,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 06:16:13,856 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. 
Duration: 0.768 2023-10-04 06:16:15,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:15,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:15,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:15,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 06:16:15,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1554413.3333333333, ans=0.1 2023-10-04 06:16:17,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:20,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 06:16:20,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1554413.3333333333, ans=0.125 2023-10-04 06:16:22,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:16:25,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:27,074 INFO [train.py:1046] (1/4) Epoch 44, batch 4750, loss[loss=0.1466, simple_loss=0.234, pruned_loss=0.02954, over 24639.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2351, pruned_loss=0.03738, over 4713303.66 frames. 
], batch size: 68, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:16:28,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:28,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1554480.0, ans=0.125 2023-10-04 06:16:29,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:16:30,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 06:16:30,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:16:34,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 06:16:34,743 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-10-04 06:16:37,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:16:37,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:38,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:16:41,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 06:16:46,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 06:16:47,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 06:16:48,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:16:50,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:50,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:50,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:52,146 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 06:16:52,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 06:16:55,820 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.90 vs. limit=6.0 2023-10-04 06:16:57,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 06:16:59,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:02,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:04,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 06:17:04,921 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 06:17:04,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:17:08,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:17:08,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1554613.3333333333, ans=0.0 2023-10-04 06:17:10,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1554680.0, ans=0.07 2023-10-04 06:17:10,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1554680.0, ans=0.125 2023-10-04 06:17:11,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:17:12,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 06:17:12,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 06:17:12,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:17:12,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:17:12,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:17:14,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:17:14,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 06:17:15,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 06:17:19,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:17:22,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:17:22,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 06:17:23,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:17:25,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:17:27,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:17:27,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:28,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:17:31,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 06:17:32,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 06:17:33,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 06:17:34,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 06:17:35,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:17:36,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:17:38,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 06:17:40,116 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 2.005e+02 2.194e+02 2.471e+02 3.662e+02, threshold=4.389e+02, percent-clipped=0.0 2023-10-04 06:17:41,389 INFO [train.py:1046] (1/4) Epoch 44, batch 4800, loss[loss=0.1726, simple_loss=0.2548, pruned_loss=0.04523, over 23931.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2361, pruned_loss=0.03785, over 4704761.40 frames. ], batch size: 80, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:17:42,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:42,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:17:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 06:17:48,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:48,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:50,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 06:17:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:17:51,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:17:54,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:17:58,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=6.0 2023-10-04 06:17:59,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:00,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:00,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:18:02,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:18:02,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 06:18:02,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:03,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:06,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:08,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:10,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:10,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:18:11,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:18:13,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:14,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1554946.6666666667, ans=0.125 2023-10-04 06:18:15,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. 
Duration: 0.85 2023-10-04 06:18:15,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 06:18:15,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:18:15,350 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:18:16,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:18:16,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:18:16,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:18:20,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:18:20,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:18:25,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:18:27,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:18:29,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1555013.3333333333, ans=0.125 2023-10-04 06:18:30,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:18:31,173 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.77 vs. limit=15.0 2023-10-04 06:18:34,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 06:18:34,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:35,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:18:37,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:37,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1555013.3333333333, ans=0.125 2023-10-04 06:18:40,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:18:41,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 06:18:41,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:43,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:18:44,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:18:45,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:18:50,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:18:50,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:50,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:50,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 06:18:51,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-10-04 06:18:53,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 06:18:54,256 INFO [train.py:1046] (1/4) Epoch 44, batch 4850, loss[loss=0.1457, simple_loss=0.2105, pruned_loss=0.04042, over 23468.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.03773, over 4710606.97 frames. 
], batch size: 285, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:18:54,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:54,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:54,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:18:54,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:56,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1555146.6666666667, ans=0.05 2023-10-04 06:18:57,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:19:04,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 06:19:05,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:19:09,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:19:11,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:19:11,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:19:12,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1555213.3333333333, ans=0.125 2023-10-04 06:19:15,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:19:15,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:19:15,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1555213.3333333333, ans=0.1 2023-10-04 06:19:17,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:19:17,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 06:19:21,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:19:22,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:19:23,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:19:25,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:19:25,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 06:19:27,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 06:19:27,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:33,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:33,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 06:19:34,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 06:19:35,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:19:42,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:19:43,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 06:19:43,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:19:43,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:19:46,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:19:48,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 06:19:48,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:19:48,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1555346.6666666667, ans=0.125 2023-10-04 06:19:49,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 06:19:50,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1555346.6666666667, ans=0.1 2023-10-04 06:19:51,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:19:52,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:19:52,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 06:20:00,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:03,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1555413.3333333333, ans=0.0 2023-10-04 06:20:05,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:20:05,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 06:20:08,909 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.027e+02 2.287e+02 2.709e+02 3.937e+02, threshold=4.574e+02, percent-clipped=0.0 2023-10-04 06:20:08,940 INFO [train.py:1046] (1/4) Epoch 44, batch 4900, loss[loss=0.1386, simple_loss=0.2043, pruned_loss=0.03648, over 23745.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2342, pruned_loss=0.03766, over 4699854.34 frames. ], batch size: 232, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:20:09,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1555480.0, ans=0.0 2023-10-04 06:20:11,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 06:20:11,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:20:17,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:19,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:20:19,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:20:21,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 06:20:26,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 06:20:27,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. 
Duration: 0.93 2023-10-04 06:20:29,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 06:20:29,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:20:30,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:20:30,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:20:30,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:30,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:20:30,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1555546.6666666667, ans=0.0 2023-10-04 06:20:32,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 06:20:34,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 06:20:35,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:20:35,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:20:37,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 06:20:39,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:20:40,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:41,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:41,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 06:20:43,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:20:45,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:45,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 06:20:45,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 06:20:49,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 06:20:50,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:20:52,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:20:52,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 06:20:53,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:53,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 06:20:54,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:20:54,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 06:20:57,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:58,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:21:00,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:21:03,266 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.92 vs. limit=5.0 2023-10-04 06:21:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 06:21:05,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:21:05,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 06:21:05,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-10-04 06:21:07,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1555746.6666666667, ans=0.2 2023-10-04 06:21:12,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:21:14,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:21:16,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 06:21:16,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:21:16,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:21:18,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:21:21,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:21:21,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:21:22,803 INFO [train.py:1046] (1/4) Epoch 44, batch 4950, loss[loss=0.1378, simple_loss=0.2249, pruned_loss=0.02541, over 24286.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2333, pruned_loss=0.03719, over 4703636.29 frames. ], batch size: 61, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:21:22,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 06:21:22,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 06:21:23,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:21:25,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:21:25,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:21:26,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1555813.3333333333, ans=0.0 2023-10-04 06:21:26,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.13 vs. limit=15.0 2023-10-04 06:21:28,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 06:21:28,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 06:21:28,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:21:28,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 06:21:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:29,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 06:21:29,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:21:31,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:21:33,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:21:35,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:21:36,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:21:37,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:21:38,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.86 vs. limit=22.5 2023-10-04 06:21:40,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:40,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:21:44,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:21:50,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:21:50,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:21:52,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:52,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:21:53,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:21:54,585 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.63 vs. limit=12.0 2023-10-04 06:21:55,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 06:21:56,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 06:21:59,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:02,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:22:02,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:22:02,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1555946.6666666667, ans=0.0 2023-10-04 06:22:03,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 06:22:03,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:22:04,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1555946.6666666667, ans=0.1 2023-10-04 06:22:05,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:22:07,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:22:10,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:22:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:22:13,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:22:13,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:14,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 06:22:14,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:22:14,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1556013.3333333333, ans=0.125 2023-10-04 06:22:16,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 06:22:16,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1556013.3333333333, ans=0.1 2023-10-04 06:22:20,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:22:23,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:22:23,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:22:23,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:23,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:22:23,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:22:26,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:22:26,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:22:26,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:22:27,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-10-04 06:22:32,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:22:35,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 06:22:35,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 06:22:37,369 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.058e+02 2.273e+02 2.535e+02 3.965e+02, threshold=4.546e+02, percent-clipped=0.0 2023-10-04 06:22:37,396 INFO [train.py:1046] (1/4) Epoch 44, batch 5000, loss[loss=0.1431, simple_loss=0.2331, pruned_loss=0.02659, over 24644.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2332, pruned_loss=0.03692, over 4709507.89 frames. ], batch size: 68, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:22:42,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:42,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:22:45,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 06:22:45,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 06:22:48,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:22:48,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-04 06:22:49,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:22:49,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:22:51,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 06:22:51,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:22:53,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:22:53,929 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.99 vs. limit=15.0 2023-10-04 06:22:54,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 06:22:54,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:22:54,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:22:57,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 06:22:57,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 06:22:57,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 06:22:58,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 06:22:58,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:22:58,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:22:59,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:22:59,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 06:22:59,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 06:23:01,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 06:23:01,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:23:02,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:03,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.84 vs. limit=15.0 2023-10-04 06:23:04,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 06:23:04,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 06:23:06,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:07,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:23:07,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 06:23:09,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 06:23:10,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:23:11,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:23:15,181 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 06:23:17,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.12 vs. limit=15.0 2023-10-04 06:23:17,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:23:19,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:19,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:22,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-10-04 06:23:22,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:23:22,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:23:22,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:23:24,599 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.24 vs. limit=6.0 2023-10-04 06:23:25,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 06:23:25,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:23:28,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:23:29,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:23:34,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 06:23:38,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:48,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:23:50,159 INFO [train.py:1046] (1/4) Epoch 44, batch 5050, loss[loss=0.156, simple_loss=0.2391, pruned_loss=0.03649, over 24490.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2341, pruned_loss=0.03711, over 4708495.26 frames. ], batch size: 66, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:23:50,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:50,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:23:50,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:23:50,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:23:51,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1556480.0, ans=15.0 2023-10-04 06:23:51,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:23:51,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:57,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:57,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 06:23:57,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 06:24:00,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:24:01,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:24:01,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 06:24:03,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:24:03,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:24:06,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:24:08,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:24:08,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:24:19,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 06:24:19,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:24:20,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:24:20,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-04 06:24:21,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:24:23,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:23,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:24:24,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:24:24,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 06:24:24,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 06:24:26,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:27,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:24:27,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1556613.3333333333, ans=0.125 2023-10-04 06:24:31,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:32,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. 
Duration: 0.87 2023-10-04 06:24:32,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1556613.3333333333, ans=0.1 2023-10-04 06:24:33,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:24:36,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 06:24:38,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:24:38,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:24:39,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:24:40,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:24:42,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:24:43,005 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.69 vs. limit=15.0 2023-10-04 06:24:44,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:24:46,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:24:46,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:24:46,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:24:47,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 06:24:47,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1556680.0, ans=0.125 2023-10-04 06:24:48,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:24:48,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:24:51,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:24:51,871 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 06:24:51,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:24:53,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:24:53,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:24:53,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-10-04 06:24:56,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:24:56,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 06:24:56,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:00,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:25:00,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:00,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 06:25:01,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 06:25:05,120 INFO [train.py:1046] (1/4) Epoch 44, batch 5100, loss[loss=0.1578, simple_loss=0.2343, pruned_loss=0.04068, over 23791.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03774, over 4690369.98 frames. ], batch size: 179, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 06:25:05,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:05,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:25:06,395 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.012e+02 2.210e+02 2.512e+02 3.231e+02, threshold=4.420e+02, percent-clipped=0.0 2023-10-04 06:25:06,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:25:06,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1556813.3333333333, ans=0.0 2023-10-04 06:25:09,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 06:25:11,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:25:14,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 06:25:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 06:25:17,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:18,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:25:21,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:25:21,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 06:25:21,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. 
Duration: 0.76 2023-10-04 06:25:27,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:25:27,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:25:29,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:33,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 06:25:33,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:36,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:36,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 06:25:37,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:39,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:39,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 06:25:41,942 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 06:25:43,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:25:43,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 06:25:43,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 06:25:43,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1556946.6666666667, ans=0.125 2023-10-04 06:25:44,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1556946.6666666667, ans=0.035 2023-10-04 06:25:44,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1556946.6666666667, ans=0.125 2023-10-04 06:25:47,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:55,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:25:59,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 06:25:59,561 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 06:25:59,568 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 06:26:02,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 06:26:02,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:26:03,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 06:26:07,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 06:26:08,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:26:10,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:26:11,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 06:26:12,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:26:14,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 06:26:19,213 INFO [train.py:1046] (1/4) Epoch 44, batch 5150, loss[loss=0.157, simple_loss=0.2489, pruned_loss=0.03256, over 24478.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.236, pruned_loss=0.03793, over 4695727.13 frames. ], batch size: 69, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 06:26:19,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:26:19,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:26:19,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 06:26:19,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:26:19,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:26:19,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1557146.6666666667, ans=0.0 2023-10-04 06:26:20,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:26:20,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 06:26:20,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 06:26:22,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 06:26:22,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:26:22,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 06:26:23,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:26:24,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 06:26:26,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:26:27,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:26:27,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1557146.6666666667, ans=0.2 2023-10-04 06:26:33,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:26:33,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 06:26:34,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:26:34,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:26:38,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:26:38,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:26:38,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:26:38,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:26:38,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:26:38,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. 
Duration: 0.84 2023-10-04 06:26:41,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:26:41,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:26:44,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:26:44,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 06:26:46,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:26:49,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1557280.0, ans=0.0 2023-10-04 06:26:53,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:26:54,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 06:26:57,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:03,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:27:04,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1557346.6666666667, ans=0.125 2023-10-04 06:27:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:27:08,733 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.44 vs. limit=15.0 2023-10-04 06:27:09,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:11,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:27:12,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 06:27:15,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:27:16,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:27:16,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:27:21,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:21,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:27:22,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 06:27:27,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:27:27,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 06:27:30,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:27:30,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:27:30,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1557413.3333333333, ans=0.125 2023-10-04 06:27:31,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:27:31,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:27:31,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:27:31,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:27:31,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1557480.0, ans=0.0 2023-10-04 06:27:32,817 INFO [train.py:1046] (1/4) Epoch 44, batch 5200, loss[loss=0.1619, simple_loss=0.2294, pruned_loss=0.04715, over 23633.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2364, pruned_loss=0.03811, over 4703012.85 frames. ], batch size: 232, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:27:33,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 06:27:34,666 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.022e+02 2.221e+02 2.671e+02 4.836e+02, threshold=4.441e+02, percent-clipped=1.0 2023-10-04 06:27:36,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:27:38,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:42,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 06:27:43,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:27:43,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:27:43,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1557480.0, ans=0.2 2023-10-04 06:27:45,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1557480.0, ans=10.0 2023-10-04 06:27:46,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:49,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:27:49,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:27:50,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. 
Duration: 0.49 2023-10-04 06:27:52,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:27:52,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:55,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 06:27:56,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1557546.6666666667, ans=0.125 2023-10-04 06:27:57,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:27:59,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:28:01,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 06:28:01,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 06:28:04,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 06:28:04,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:28:04,431 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 06:28:04,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:28:06,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:07,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:28:07,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 06:28:07,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:28:11,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:28:13,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 06:28:14,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 06:28:14,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 06:28:18,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 06:28:20,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:28:25,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:28:25,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:28:27,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 06:28:27,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:28:28,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 06:28:28,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:28,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:28:31,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:28:32,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:28:36,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:28:38,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:28:38,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:42,404 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:28:43,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:28:43,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 06:28:44,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=12.0 2023-10-04 06:28:45,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:28:45,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:28:46,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:46,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:28:47,800 INFO [train.py:1046] (1/4) Epoch 44, batch 5250, loss[loss=0.1581, simple_loss=0.2488, pruned_loss=0.03369, over 24342.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2352, pruned_loss=0.03776, over 4701946.50 frames. ], batch size: 74, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:28:47,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:28:48,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1557813.3333333333, ans=0.0 2023-10-04 06:28:50,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 06:28:50,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1557813.3333333333, ans=0.0 2023-10-04 06:28:53,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:28:53,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:28:56,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:29:00,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:29:01,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-10-04 06:29:02,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:29:03,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:29:05,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:29:08,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 06:29:08,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:29:10,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:29:32,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1558013.3333333333, ans=0.0 2023-10-04 06:29:53,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1558080.0, ans=0.125 2023-10-04 06:29:54,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=15.0 2023-10-04 06:29:56,289 INFO [train.py:1046] (1/4) Epoch 44, batch 5300, loss[loss=0.1332, simple_loss=0.2173, pruned_loss=0.02452, over 21951.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2337, pruned_loss=0.03713, over 4697555.36 frames. ], batch size: 48, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:29:57,452 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.098e+02 2.407e+02 2.784e+02 3.746e+02, threshold=4.815e+02, percent-clipped=0.0 2023-10-04 06:30:00,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1558146.6666666667, ans=0.125 2023-10-04 06:30:03,430 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-10-04 06:30:04,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1558146.6666666667, ans=0.0 2023-10-04 06:30:10,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 06:30:10,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 06:30:10,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 06:30:10,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:10,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:10,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:10,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:10,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:10,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:10,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:10,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:30:11,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:30:11,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. 
Duration: 0.9 2023-10-04 06:30:11,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 06:30:11,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 06:30:11,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:30:11,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 06:30:11,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 06:30:11,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:12,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:12,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:30:12,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:30:12,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:30:12,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:30:12,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 06:30:12,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:12,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:30:12,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:12,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:30:12,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:13,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:30:13,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 06:30:13,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:30:13,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:13,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 06:30:13,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 06:30:14,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 06:30:14,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:14,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 06:30:14,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 06:30:14,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:30:15,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:30:15,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:30:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 06:30:15,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 06:30:15,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:30:15,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:15,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 06:30:15,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. 
Duration: 0.96 2023-10-04 06:30:15,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 06:30:15,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:30:22,558 INFO [train.py:1046] (1/4) Epoch 45, batch 0, loss[loss=0.1608, simple_loss=0.245, pruned_loss=0.03834, over 24414.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.245, pruned_loss=0.03834, over 24414.00 frames. ], batch size: 77, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:30:22,558 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 06:30:34,479 INFO [train.py:1078] (1/4) Epoch 45, validation: loss=0.3306, simple_loss=0.275, pruned_loss=0.1931, over 1125622.00 frames. 2023-10-04 06:30:34,480 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 06:30:35,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 06:30:37,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:30:38,008 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.09 vs. limit=22.5 2023-10-04 06:30:38,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:30:42,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:42,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 06:30:42,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:44,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 06:30:45,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 06:30:46,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:48,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:51,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:51,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:52,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:30:53,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-10-04 06:30:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:30:56,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 06:30:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 06:31:05,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:31:05,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:31:07,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 06:31:08,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.32 vs. limit=15.0 2023-10-04 06:31:12,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:31:12,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:31:13,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:31:16,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:31:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:31:22,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1558426.6666666667, ans=0.125 2023-10-04 06:31:25,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-04 06:31:26,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1558426.6666666667, ans=0.0 2023-10-04 06:31:29,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 06:31:29,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:31:29,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:30,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:31:30,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:31:34,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 06:31:37,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:37,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:42,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:31:46,106 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 06:31:46,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 06:31:47,521 INFO [train.py:1046] (1/4) Epoch 45, batch 50, loss[loss=0.1347, simple_loss=0.2104, pruned_loss=0.02956, over 24351.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2362, pruned_loss=0.03714, over 1073485.94 frames. ], batch size: 56, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:31:50,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:31:53,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:31:53,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 06:31:53,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:31:53,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:31:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:31:56,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:31:57,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:32:00,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 06:32:00,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:32:02,609 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.32 vs. limit=22.5 2023-10-04 06:32:08,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:32:08,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 06:32:10,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 06:32:10,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1558626.6666666667, ans=0.5 2023-10-04 06:32:12,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:32:13,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:32:13,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:14,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:32:15,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1558626.6666666667, ans=0.05 2023-10-04 06:32:16,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 06:32:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:32:16,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:21,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1558693.3333333333, ans=0.035 2023-10-04 06:32:25,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:32:27,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:32:27,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:32:28,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1558693.3333333333, ans=0.125 2023-10-04 06:32:29,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 06:32:30,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:32:31,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.23 vs. limit=15.0 2023-10-04 06:32:32,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 06:32:32,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 06:32:32,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:32:33,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 06:32:38,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1558760.0, ans=0.125 2023-10-04 06:32:41,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:32:41,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:32:43,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:32:44,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:32:44,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:32:45,937 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.033e+02 2.243e+02 2.613e+02 6.562e+02, threshold=4.487e+02, percent-clipped=2.0 2023-10-04 06:32:46,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 06:32:47,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. 
Duration: 0.56 2023-10-04 06:32:47,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:32:48,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:32:48,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:32:50,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:32:50,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 06:32:50,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 06:32:51,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 06:32:53,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:32:53,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:32:53,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 06:32:53,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 06:32:53,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:32:54,330 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.21 vs. limit=15.0 2023-10-04 06:32:54,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:32:57,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:32:57,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:32:58,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1558826.6666666667, ans=0.0 2023-10-04 06:33:00,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:33:02,061 INFO [train.py:1046] (1/4) Epoch 45, batch 100, loss[loss=0.1534, simple_loss=0.2344, pruned_loss=0.03623, over 24337.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2366, pruned_loss=0.03722, over 1886255.03 frames. ], batch size: 61, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:33:02,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:33:04,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:33:08,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 06:33:08,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:33:13,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:33:13,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:33:13,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:33:13,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:33:13,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:33:14,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 06:33:16,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:33:16,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:33:17,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:33:20,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 06:33:21,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:33:21,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:33:21,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1558960.0, ans=0.1 2023-10-04 06:33:23,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:33:26,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:33:30,294 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 06:33:30,317 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 06:33:31,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:33:31,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:33:33,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1559026.6666666667, ans=0.09899494936611666 2023-10-04 06:33:35,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:33:37,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:38,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:33:42,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:42,643 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.64 vs. limit=10.0 2023-10-04 06:33:44,044 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 06:33:46,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 06:33:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:33:52,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:33:53,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:58,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:33:58,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:34:00,072 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:34:01,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:34:02,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 06:34:02,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1559160.0, ans=0.07 2023-10-04 06:34:03,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:04,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:05,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:34:05,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:05,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 06:34:06,784 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 06:34:06,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:08,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:34:08,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:08,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:08,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-04 06:34:08,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:34:08,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:34:08,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:10,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:11,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:13,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:34:13,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:34:16,643 INFO [train.py:1046] (1/4) Epoch 45, batch 150, loss[loss=0.161, simple_loss=0.2351, pruned_loss=0.04349, over 23488.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2369, pruned_loss=0.03737, over 2521562.38 frames. ], batch size: 285, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:34:16,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:19,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:34:19,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:34:19,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:23,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:23,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:25,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1559226.6666666667, ans=0.125 2023-10-04 06:34:26,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:34:28,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:31,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 06:34:31,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 06:34:31,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 06:34:32,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:34:32,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:34:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 06:34:34,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:34,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:34,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:35,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:36,849 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 06:34:38,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:45,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:46,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:34:49,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 06:34:52,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:34:52,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:53,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 06:34:54,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:34:56,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:56,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:34:58,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:59,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 06:35:03,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:05,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:05,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:35:05,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:35:06,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:08,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-04 06:35:11,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1559426.6666666667, ans=0.125 2023-10-04 06:35:12,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:35:14,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:35:16,219 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.999e+02 2.185e+02 2.526e+02 4.113e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-04 06:35:16,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:35:17,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:35:17,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 06:35:18,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:35:18,990 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 06:35:21,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:35:26,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:35:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 06:35:28,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 06:35:28,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:35:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:30,714 INFO [train.py:1046] (1/4) Epoch 45, batch 200, loss[loss=0.1429, simple_loss=0.2235, pruned_loss=0.03113, over 23428.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2375, pruned_loss=0.03787, over 3007991.99 frames. ], batch size: 119, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:35:32,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 06:35:33,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:35:36,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:37,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:40,809 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.36 vs. limit=15.0 2023-10-04 06:35:43,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:35:43,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 06:35:43,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:46,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-10-04 06:36:00,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:36:02,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:36:02,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:36:02,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1559693.3333333333, ans=0.125 2023-10-04 06:36:02,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1559693.3333333333, ans=0.125 2023-10-04 06:36:03,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:36:05,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 06:36:05,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:36:08,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:36:08,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:36:09,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:36:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:36:11,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 06:36:12,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:36:12,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:13,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.64 vs. limit=15.0 2023-10-04 06:36:17,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:36:22,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:36:24,705 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.72 vs. limit=15.0 2023-10-04 06:36:28,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:36:29,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:36:31,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1559826.6666666667, ans=0.125 2023-10-04 06:36:35,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:37,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 06:36:37,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:37,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:36:37,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:36:39,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:36:39,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 06:36:40,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:36:40,979 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-04 06:36:42,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1559893.3333333333, ans=0.125 2023-10-04 06:36:43,594 INFO [train.py:1046] (1/4) Epoch 45, batch 250, loss[loss=0.15, simple_loss=0.2212, pruned_loss=0.03942, over 23569.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2361, pruned_loss=0.03721, over 3394982.07 frames. ], batch size: 232, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:36:43,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:45,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:36:46,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:48,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:50,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:36:50,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:52,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:36:53,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1559893.3333333333, ans=0.0 2023-10-04 06:36:56,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 06:36:58,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1559960.0, ans=0.125 2023-10-04 06:37:04,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:37:06,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:37:07,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:37:09,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1559960.0, ans=0.0 2023-10-04 06:37:14,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:37:14,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:37:14,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1560026.6666666667, ans=0.2 2023-10-04 06:37:16,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:37:16,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:37:18,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:37:18,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 06:37:19,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:37:22,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:37:24,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 06:37:25,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:37:25,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:37:27,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:37:27,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:37:27,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:37:28,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:37:28,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:37:29,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:37:31,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 06:37:31,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:37:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:37:39,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:37:41,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:37:42,790 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.020e+02 2.227e+02 2.604e+02 5.544e+02, threshold=4.454e+02, percent-clipped=1.0 2023-10-04 06:37:47,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:37:50,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:37:50,455 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:37:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 06:37:55,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:37:55,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:37:58,152 INFO [train.py:1046] (1/4) Epoch 45, batch 300, loss[loss=0.1568, simple_loss=0.2381, pruned_loss=0.03778, over 23531.00 frames. 
], tot_loss[loss=0.1537, simple_loss=0.2337, pruned_loss=0.0368, over 3691666.06 frames. ], batch size: 93, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:37:58,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 06:37:58,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:37:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:37:58,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 06:38:03,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:38:03,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:38:07,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:38:08,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1560226.6666666667, ans=0.125 2023-10-04 06:38:09,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 06:38:10,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:38:11,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 06:38:11,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 06:38:11,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:38:14,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:38:15,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1560293.3333333333, ans=0.2 2023-10-04 06:38:19,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:38:21,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 06:38:24,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 06:38:24,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:28,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:38:30,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:30,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 06:38:30,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 06:38:31,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1560360.0, ans=0.125 2023-10-04 06:38:32,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:38:35,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:38:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:38:40,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 06:38:40,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 06:38:40,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:38:43,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:44,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 06:38:46,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:38:49,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:38:52,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:38:52,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 06:38:55,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:56,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:38:58,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:39:00,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:39:02,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 06:39:02,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:39:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:03,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 06:39:04,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:39:05,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1560493.3333333333, ans=0.125 2023-10-04 06:39:06,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:39:06,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:39:07,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:09,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:11,742 INFO [train.py:1046] (1/4) Epoch 45, batch 350, loss[loss=0.1543, simple_loss=0.2408, pruned_loss=0.03395, over 24085.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2321, pruned_loss=0.03684, over 3914021.52 frames. ], batch size: 80, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:39:11,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:39:11,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 06:39:16,020 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:18,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1560560.0, ans=0.125 2023-10-04 06:39:20,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:39:23,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:24,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:39:27,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 06:39:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:39:29,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 06:39:32,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:32,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 06:39:32,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:36,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 06:39:38,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:39:38,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:39,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:39:42,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:39:42,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:39:42,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:39:42,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:42,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1560693.3333333333, ans=0.07 2023-10-04 06:39:43,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:39:44,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:39:44,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:49,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1560693.3333333333, ans=0.0 2023-10-04 06:39:50,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1560693.3333333333, ans=0.07 2023-10-04 06:39:51,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:39:51,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:39:53,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 06:39:54,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:58,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1560760.0, ans=0.0 2023-10-04 06:39:59,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 06:39:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:40:01,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1560760.0, ans=0.125 2023-10-04 06:40:04,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:04,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:04,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:40:05,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 06:40:06,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1560760.0, ans=0.125 2023-10-04 06:40:07,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:09,863 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. 
Duration: 0.976 2023-10-04 06:40:09,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 06:40:09,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:10,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1560826.6666666667, ans=0.125 2023-10-04 06:40:11,196 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.951e+02 2.094e+02 2.424e+02 4.061e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-04 06:40:12,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:40:12,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 06:40:15,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:17,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:40:19,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:20,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:20,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:40:22,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:26,053 INFO [train.py:1046] (1/4) Epoch 45, batch 400, loss[loss=0.149, simple_loss=0.224, pruned_loss=0.03701, over 23404.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2321, pruned_loss=0.0368, over 4084336.93 frames. ], batch size: 285, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:40:26,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:40:28,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:40:30,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 06:40:30,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:30,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:31,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1560893.3333333333, ans=0.2 2023-10-04 06:40:33,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:40:33,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:36,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:40:38,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:39,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 06:40:41,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 06:40:41,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:42,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 06:40:42,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:45,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:40:45,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:45,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 06:40:45,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:40:47,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:47,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 06:40:47,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:50,630 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 06:40:50,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 06:40:56,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:57,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:57,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 06:40:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 06:41:03,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:41:05,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:11,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 06:41:13,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:41:13,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. 
Duration: 0.87 2023-10-04 06:41:16,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:41:18,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:41:19,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 06:41:20,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-10-04 06:41:21,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:41:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:41:24,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:41:27,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:27,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 06:41:29,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=1561160.0, ans=0.02 2023-10-04 06:41:30,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:41:30,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. 
Duration: 0.67 2023-10-04 06:41:32,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1561160.0, ans=0.025 2023-10-04 06:41:33,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:41:33,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:41:34,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 06:41:37,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:41:38,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:41:39,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:41:40,721 INFO [train.py:1046] (1/4) Epoch 45, batch 450, loss[loss=0.1428, simple_loss=0.2194, pruned_loss=0.03311, over 23492.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2323, pruned_loss=0.03664, over 4220608.81 frames. ], batch size: 134, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:41:40,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 06:41:40,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:41:40,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:41:42,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:41:42,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 06:41:42,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:41:43,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:41:46,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:41:55,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:55,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:41:57,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 06:41:59,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 06:42:03,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:42:06,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:42:06,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:42:09,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:42:10,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:42:13,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 06:42:13,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 06:42:16,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 06:42:16,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:42:17,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:19,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:42:22,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 06:42:22,286 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 06:42:22,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:42:24,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 06:42:25,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:42:28,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:42:28,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:42:30,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 06:42:30,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 06:42:32,242 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.07 vs. limit=12.0 2023-10-04 06:42:33,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:42:34,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:42:35,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:42:37,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 06:42:39,789 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.888e+02 2.086e+02 2.350e+02 3.841e+02, threshold=4.172e+02, percent-clipped=0.0 2023-10-04 06:42:41,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 06:42:42,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 06:42:42,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 06:42:44,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:42:48,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:42:51,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:42:53,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:42:53,292 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 06:42:54,431 INFO [train.py:1046] (1/4) Epoch 45, batch 500, loss[loss=0.1612, simple_loss=0.2367, pruned_loss=0.04286, over 23672.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2337, pruned_loss=0.03706, over 4304963.36 frames. ], batch size: 232, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:42:56,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:57,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:42:57,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:42:59,242 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 06:43:00,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 06:43:00,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:43:02,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:43:05,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:43:06,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:43:08,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.79 vs. limit=6.0 2023-10-04 06:43:09,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:43:09,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:43:10,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:19,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:19,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 06:43:21,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:43:22,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 06:43:22,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:43:27,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:43:29,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:43:29,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:43:29,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:30,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 06:43:33,413 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 06:43:34,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:43:38,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:43:38,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:39,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:39,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:43:42,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 06:43:43,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:43:45,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:43:46,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1561760.0, ans=0.2 2023-10-04 06:43:47,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:43:51,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:57,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:43:59,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 06:43:59,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:43:59,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:44:03,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 06:44:03,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:44:05,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:44:09,062 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.16 vs. limit=15.0 2023-10-04 06:44:09,784 INFO [train.py:1046] (1/4) Epoch 45, batch 550, loss[loss=0.1545, simple_loss=0.2488, pruned_loss=0.03011, over 24295.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2348, pruned_loss=0.03731, over 4397630.90 frames. ], batch size: 74, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:44:09,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 06:44:12,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 06:44:12,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:44:12,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 06:44:12,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:44:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:44:14,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:14,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:44:15,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:44:16,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1561893.3333333333, ans=0.125 2023-10-04 06:44:19,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:44:19,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 06:44:20,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:44:26,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:26,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 06:44:26,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1561960.0, ans=0.2 2023-10-04 06:44:28,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:44:30,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:30,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1561960.0, ans=0.125 2023-10-04 06:44:34,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 06:44:35,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 06:44:35,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1561960.0, ans=0.125 2023-10-04 06:44:35,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.18 vs. limit=15.0 2023-10-04 06:44:37,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:44:42,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:44:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:44:44,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 06:44:44,688 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.46 vs. limit=22.5 2023-10-04 06:44:48,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:48,038 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 06:44:48,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:49,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 06:44:52,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:44:52,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:44:52,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:44:54,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:55,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 06:44:57,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 06:44:58,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:44:58,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:44:58,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:44:58,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:45:01,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:45:02,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:45:03,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1562093.3333333333, ans=0.1 2023-10-04 06:45:06,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:45:07,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:07,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:45:09,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 06:45:10,267 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.015e+02 2.214e+02 2.569e+02 3.684e+02, threshold=4.428e+02, percent-clipped=0.0 2023-10-04 06:45:10,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:45:12,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:45:12,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:15,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:45:15,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 06:45:20,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 06:45:24,043 INFO [train.py:1046] (1/4) Epoch 45, batch 600, loss[loss=0.1513, simple_loss=0.2278, pruned_loss=0.0374, over 23466.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03714, over 4476431.70 frames. ], batch size: 134, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:45:25,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 06:45:28,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:45:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 06:45:28,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:45:34,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:45:36,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:45:36,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1562226.6666666667, ans=0.025 2023-10-04 06:45:37,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 06:45:40,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:45:40,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:45:43,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:44,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 06:45:44,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:45:45,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1562293.3333333333, ans=0.125 2023-10-04 06:45:51,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-10-04 06:45:55,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:45:55,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:55,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:45:56,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1562360.0, ans=0.1 2023-10-04 06:46:02,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:46:02,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:46:02,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:08,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:46:12,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1562426.6666666667, ans=0.125 2023-10-04 06:46:13,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:13,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 06:46:13,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:46:14,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.86 vs. limit=10.0 2023-10-04 06:46:20,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 06:46:23,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1562493.3333333333, ans=0.125 2023-10-04 06:46:24,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:46:24,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:46:30,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 06:46:30,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:46:31,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1562493.3333333333, ans=0.0 2023-10-04 06:46:33,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 06:46:33,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 06:46:34,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:46:37,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.58 vs. limit=15.0 2023-10-04 06:46:39,053 INFO [train.py:1046] (1/4) Epoch 45, batch 650, loss[loss=0.1538, simple_loss=0.2288, pruned_loss=0.03937, over 23758.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2332, pruned_loss=0.03696, over 4519697.52 frames. ], batch size: 179, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:46:39,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 06:46:41,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:46:41,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1562560.0, ans=0.125 2023-10-04 06:46:42,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:46:45,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:46:45,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.40 vs. limit=22.5 2023-10-04 06:46:46,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:46:48,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. 
Duration: 0.72 2023-10-04 06:46:49,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:52,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:46:52,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:46:53,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1562626.6666666667, ans=0.125 2023-10-04 06:46:57,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:46:59,134 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.82 vs. limit=22.5 2023-10-04 06:47:01,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 06:47:01,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:47:02,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:47:05,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1562626.6666666667, ans=0.125 2023-10-04 06:47:06,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 06:47:06,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 06:47:09,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:09,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:09,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:47:11,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:12,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:47:14,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:47:14,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1562693.3333333333, ans=0.1 2023-10-04 06:47:15,398 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 06:47:15,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:47:15,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1562693.3333333333, ans=0.125 2023-10-04 06:47:19,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:19,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:47:19,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:20,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:47:22,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 06:47:22,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:47:23,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:47:24,076 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:47:25,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:47:25,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:47:26,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 06:47:28,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 06:47:29,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 06:47:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:29,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:47:29,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:47:29,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:47:32,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:47:39,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:39,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:47:40,357 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.090e+02 2.308e+02 2.598e+02 4.000e+02, threshold=4.617e+02, percent-clipped=0.0 2023-10-04 06:47:40,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:44,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:47:44,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 06:47:45,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:52,606 INFO [train.py:1046] (1/4) Epoch 45, batch 700, loss[loss=0.1593, simple_loss=0.2374, pruned_loss=0.0406, over 23373.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2328, pruned_loss=0.03661, over 4570682.29 frames. ], batch size: 93, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:47:52,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:47:52,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:47:52,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:47:53,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:47:58,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 06:48:00,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 06:48:01,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 06:48:01,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:48:03,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:48:04,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 06:48:09,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:48:11,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:48:12,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:13,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:48:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:48:16,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:17,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 06:48:19,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:48:19,960 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=10.17 vs. 
limit=10.0 2023-10-04 06:48:20,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 06:48:22,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 06:48:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:48:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:48:28,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:48:31,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1563026.6666666667, ans=0.125 2023-10-04 06:48:32,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:48:32,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 06:48:34,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1563026.6666666667, ans=0.125 2023-10-04 06:48:38,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:48:38,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:48:40,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-10-04 06:48:42,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:48:43,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:48:43,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1563093.3333333333, ans=0.125 2023-10-04 06:48:45,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:48:50,800 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.53 vs. limit=15.0 2023-10-04 06:48:51,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:48:51,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 06:48:54,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 06:48:54,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 06:48:57,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:00,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:49:03,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 06:49:06,724 INFO [train.py:1046] (1/4) Epoch 45, batch 750, loss[loss=0.1457, simple_loss=0.2262, pruned_loss=0.03258, over 23385.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2322, pruned_loss=0.03661, over 4595946.38 frames. ], batch size: 119, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:49:08,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 06:49:08,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 06:49:08,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 06:49:10,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 06:49:10,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 06:49:10,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:49:11,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1563226.6666666667, ans=0.0 2023-10-04 06:49:12,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. 
Duration: 0.89 2023-10-04 06:49:13,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:13,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:49:16,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:18,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:49:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:49:18,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:22,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:49:22,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:49:23,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:49:25,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:25,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:49:26,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-10-04 06:49:26,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:49:28,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:49:30,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:49:31,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:49:33,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 06:49:33,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:49:36,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.10 vs. limit=5.0 2023-10-04 06:49:36,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 06:49:36,467 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 06:49:36,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 06:49:36,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:49:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 06:49:38,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:49:42,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:49:44,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:49:44,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:49:45,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:46,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:46,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 06:49:48,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:49:49,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 06:49:49,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:49:53,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:49:53,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. 
Duration: 0.92 2023-10-04 06:49:53,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:49:59,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:59,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:50:00,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1563426.6666666667, ans=0.1 2023-10-04 06:50:01,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:02,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1563426.6666666667, ans=0.125 2023-10-04 06:50:04,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:50:07,901 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 2.004e+02 2.219e+02 2.508e+02 4.389e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 06:50:08,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 06:50:09,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:50:10,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 06:50:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:12,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:13,016 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.29 vs. limit=15.0 2023-10-04 06:50:14,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:14,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:50:15,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1563493.3333333333, ans=0.0 2023-10-04 06:50:16,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1563493.3333333333, ans=0.125 2023-10-04 06:50:20,456 INFO [train.py:1046] (1/4) Epoch 45, batch 800, loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03712, over 24291.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2336, pruned_loss=0.03698, over 4622264.46 frames. ], batch size: 61, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:50:21,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:21,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:50:23,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:50:23,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:23,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1563560.0, ans=0.025 2023-10-04 06:50:24,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:24,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:27,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:32,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:32,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:50:36,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 06:50:38,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:39,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 06:50:39,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:50:40,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:50:40,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 06:50:41,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:41,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 06:50:44,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:45,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:46,200 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.77 vs. limit=12.0 2023-10-04 06:50:48,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:48,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:50:49,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:51,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:50:54,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:50:55,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:50:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 06:50:56,756 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 06:50:58,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 06:50:58,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:50:58,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:59,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:59,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:51:00,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1563693.3333333333, ans=0.125 2023-10-04 06:51:03,393 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 06:51:04,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. 
Duration: 0.4 2023-10-04 06:51:06,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:51:08,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:51:12,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:51:12,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1563760.0, ans=0.0 2023-10-04 06:51:16,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:51:18,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 06:51:18,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:51:22,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 06:51:28,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:51:31,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:51:31,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-10-04 06:51:31,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1563826.6666666667, ans=0.2 2023-10-04 06:51:32,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:51:32,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:51:34,461 INFO [train.py:1046] (1/4) Epoch 45, batch 850, loss[loss=0.1563, simple_loss=0.2307, pruned_loss=0.04096, over 23711.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2348, pruned_loss=0.03728, over 4649214.74 frames. ], batch size: 149, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:51:34,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 06:51:34,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:51:37,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:51:39,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:51:41,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1563893.3333333333, ans=0.0 2023-10-04 06:51:42,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:51:42,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:51:43,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 06:51:45,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 06:51:45,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 06:51:45,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:51:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:51:46,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.20 vs. limit=15.0 2023-10-04 06:51:47,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1563893.3333333333, ans=0.0 2023-10-04 06:51:48,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:51:48,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:51:48,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:51:52,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:51:53,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 06:51:53,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 06:51:53,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1563960.0, ans=0.125 2023-10-04 06:51:56,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 06:51:59,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1563960.0, ans=0.1 2023-10-04 06:52:00,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:52:00,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 06:52:03,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 06:52:07,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 06:52:09,024 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 06:52:09,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:52:09,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:52:09,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-04 06:52:11,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:13,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:13,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 06:52:16,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:52:16,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:52:17,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:52:19,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:52:20,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:52:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:52:21,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 06:52:24,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1564093.3333333333, ans=0.1 2023-10-04 06:52:26,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 06:52:26,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:52:26,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:52:26,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:52:27,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:52:30,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:31,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:52:33,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:52:33,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:52:33,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:52:34,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. 
limit=6.0 2023-10-04 06:52:34,988 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.097e+02 2.346e+02 2.719e+02 4.087e+02, threshold=4.692e+02, percent-clipped=0.0 2023-10-04 06:52:41,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:52:43,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:52:43,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 06:52:44,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:52:44,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:52:46,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 06:52:48,995 INFO [train.py:1046] (1/4) Epoch 45, batch 900, loss[loss=0.1337, simple_loss=0.2187, pruned_loss=0.02438, over 24559.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2353, pruned_loss=0.03716, over 4672531.09 frames. ], batch size: 60, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 06:52:51,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:52:54,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:52:54,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-10-04 06:52:57,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:52:57,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 06:52:58,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:53:00,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:53:01,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:53:01,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:53:12,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:12,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:53:14,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:53:16,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:20,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. 
Duration: 0.47 2023-10-04 06:53:23,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:53:25,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:53:27,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:53:27,388 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 06:53:28,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 06:53:34,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:53:34,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:53:35,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:53:39,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:39,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:53:42,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 06:53:42,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:53:44,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 06:53:45,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:53:47,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:48,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:53:48,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:53:51,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 06:53:51,750 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 06:53:53,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 06:53:54,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 06:53:56,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1564493.3333333333, ans=0.05 2023-10-04 06:53:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:59,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. 
Duration: 0.68 2023-10-04 06:54:03,120 INFO [train.py:1046] (1/4) Epoch 45, batch 950, loss[loss=0.155, simple_loss=0.2422, pruned_loss=0.03391, over 24541.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.236, pruned_loss=0.03757, over 4667537.58 frames. ], batch size: 71, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 06:54:07,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:09,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:09,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:10,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:54:12,447 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 06:54:16,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:16,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:54:17,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:17,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:54:17,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. 
Duration: 0.94 2023-10-04 06:54:18,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.51 vs. limit=15.0 2023-10-04 06:54:21,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 06:54:21,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:22,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1564626.6666666667, ans=0.125 2023-10-04 06:54:23,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 06:54:25,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:54:28,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:28,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:54:28,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:54:29,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 06:54:30,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:54:32,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 06:54:33,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:54:40,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:54:40,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:43,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1564693.3333333333, ans=0.125 2023-10-04 06:54:44,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 06:54:44,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 06:54:44,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:54:46,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:54:47,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:47,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:54:49,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1564760.0, ans=0.07 2023-10-04 06:54:52,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-10-04 06:54:52,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1564760.0, ans=0.125 2023-10-04 06:54:53,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:54:53,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1564760.0, ans=0.125 2023-10-04 06:54:55,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1564760.0, ans=0.125 2023-10-04 06:54:56,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:54:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:57,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 06:54:57,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:57,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:54:58,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 06:55:01,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 06:55:02,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1564826.6666666667, ans=0.0 2023-10-04 06:55:04,548 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.998e+02 2.203e+02 2.459e+02 3.275e+02, threshold=4.407e+02, percent-clipped=0.0 2023-10-04 06:55:04,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:55:10,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:55:10,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1564826.6666666667, ans=0.125 2023-10-04 06:55:12,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 06:55:12,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 06:55:16,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:55:17,331 INFO [train.py:1046] (1/4) Epoch 45, batch 1000, loss[loss=0.14, simple_loss=0.2214, pruned_loss=0.02926, over 24364.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03737, over 4687495.45 frames. ], batch size: 61, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 06:55:19,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1564893.3333333333, ans=0.0 2023-10-04 06:55:20,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-10-04 06:55:20,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:55:25,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:55:26,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 06:55:26,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 06:55:30,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:55:30,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:55:32,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:55:34,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 06:55:37,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 06:55:39,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 06:55:40,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:55:44,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-10-04 06:55:45,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 06:55:45,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 06:55:47,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:55:47,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:55:57,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:55:59,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:55:59,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:00,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:56:00,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 06:56:00,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:56:01,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:56:01,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 06:56:02,006 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 06:56:02,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1565093.3333333333, ans=0.125 2023-10-04 06:56:03,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 06:56:05,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 06:56:07,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 06:56:10,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:56:12,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1565093.3333333333, ans=0.07 2023-10-04 06:56:16,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:18,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:56:18,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:18,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:56:20,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-10-04 06:56:21,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:56:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 06:56:23,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 06:56:24,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:56:24,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:56:28,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:56:29,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:56:29,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1565160.0, ans=0.2 2023-10-04 06:56:30,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:56:32,347 INFO [train.py:1046] (1/4) Epoch 45, batch 1050, loss[loss=0.1487, simple_loss=0.2254, pruned_loss=0.03595, over 18965.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2343, pruned_loss=0.03739, over 4686083.08 frames. ], batch size: 41, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:56:33,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 06:56:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:56:35,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1565226.6666666667, ans=0.1 2023-10-04 06:56:36,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:56:37,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:40,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:56:42,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:56:44,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:56:47,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:56:47,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:56:47,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:56:49,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 06:56:50,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 06:56:50,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:56:50,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 06:56:51,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:56:51,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 06:56:53,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 06:56:59,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:57:00,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:57:00,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:57:00,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1565360.0, ans=0.0 2023-10-04 06:57:01,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 06:57:01,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. 
Duration: 0.6 2023-10-04 06:57:02,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:57:06,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 06:57:08,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 06:57:08,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:12,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1565360.0, ans=0.125 2023-10-04 06:57:13,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 06:57:15,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 06:57:16,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:57:16,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:57:19,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.72 vs. limit=15.0 2023-10-04 06:57:20,505 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:57:21,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 06:57:24,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 06:57:25,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 06:57:27,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 06:57:27,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:57:27,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:57:29,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1565426.6666666667, ans=0.07 2023-10-04 06:57:30,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 06:57:33,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:57:34,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:57:34,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:57:34,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 06:57:36,081 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.013e+02 2.292e+02 2.767e+02 4.836e+02, threshold=4.583e+02, percent-clipped=4.0 2023-10-04 06:57:36,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:39,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:39,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 06:57:40,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:57:40,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 06:57:40,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 06:57:41,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:57:43,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.05 vs. limit=22.5 2023-10-04 06:57:46,716 INFO [train.py:1046] (1/4) Epoch 45, batch 1100, loss[loss=0.1521, simple_loss=0.229, pruned_loss=0.03766, over 22731.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2331, pruned_loss=0.03704, over 4677914.71 frames. ], batch size: 322, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:57:46,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 06:57:51,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:57:55,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:57:57,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:57:57,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:57:58,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 06:58:00,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:02,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:58:04,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:58:06,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:58:06,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 06:58:08,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 06:58:09,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 06:58:09,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:58:09,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1565626.6666666667, ans=0.125 2023-10-04 06:58:11,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:58:12,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:58:17,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:58:20,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 06:58:22,095 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 06:58:22,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:22,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1565693.3333333333, ans=0.2 2023-10-04 06:58:23,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1565693.3333333333, ans=0.0 2023-10-04 06:58:24,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:26,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 06:58:26,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:58:28,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 06:58:28,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:58:28,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:58:28,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:58:29,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:29,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 06:58:33,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1565760.0, ans=0.125 2023-10-04 06:58:36,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:58:36,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 06:58:37,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:58:40,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 06:58:42,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1565760.0, ans=0.0 2023-10-04 06:58:43,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 06:58:43,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:58:44,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1565826.6666666667, ans=0.1 2023-10-04 06:58:46,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:48,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:58:50,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:50,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 06:58:51,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:58:51,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:53,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 06:58:55,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 06:58:55,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 06:58:56,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:58:56,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:58:56,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:59:00,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.95 vs. limit=22.5 2023-10-04 06:59:00,779 INFO [train.py:1046] (1/4) Epoch 45, batch 1150, loss[loss=0.1576, simple_loss=0.2432, pruned_loss=0.03603, over 23382.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2338, pruned_loss=0.03714, over 4687399.98 frames. ], batch size: 93, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:59:02,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:02,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1565893.3333333333, ans=0.0 2023-10-04 06:59:04,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:59:06,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:59:06,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 06:59:06,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 06:59:06,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:59:06,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1565893.3333333333, ans=0.125 2023-10-04 06:59:09,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 06:59:11,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:11,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:59:11,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1565893.3333333333, ans=0.125 2023-10-04 06:59:16,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 06:59:21,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:59:24,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:25,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:25,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. 
Duration: 0.54 2023-10-04 06:59:25,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:59:27,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:59:30,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 06:59:33,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:59:34,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:59:42,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1566026.6666666667, ans=0.0 2023-10-04 06:59:42,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-10-04 06:59:43,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:46,524 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.81 vs. limit=10.0 2023-10-04 06:59:46,576 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.83 vs. 
limit=15.0 2023-10-04 06:59:47,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1566093.3333333333, ans=0.2 2023-10-04 06:59:48,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:48,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 06:59:50,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:59:50,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:59:50,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1566093.3333333333, ans=0.125 2023-10-04 06:59:55,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1566093.3333333333, ans=0.125 2023-10-04 06:59:57,246 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 06:59:58,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:03,662 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 07:00:04,888 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.046e+02 2.266e+02 2.583e+02 3.861e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 07:00:07,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:00:08,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:00:10,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:00:10,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:00:11,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:00:13,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1566226.6666666667, ans=0.125 2023-10-04 07:00:14,502 INFO [train.py:1046] (1/4) Epoch 45, batch 1200, loss[loss=0.1537, simple_loss=0.2316, pruned_loss=0.03796, over 24690.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2346, pruned_loss=0.03717, over 4700328.65 frames. ], batch size: 65, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:00:16,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:00:16,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:00:18,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:00:18,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:18,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 07:00:22,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:00:24,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:00:25,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:00:25,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 07:00:31,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 07:00:33,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:00:34,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:00:37,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:00:39,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:00:39,065 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 07:00:40,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:00:46,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:00:46,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:00:46,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1566360.0, ans=0.125 2023-10-04 07:00:47,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 07:00:48,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:00:52,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 07:00:57,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 07:00:57,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:57,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:57,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1566360.0, ans=0.125 2023-10-04 07:00:59,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 07:00:59,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1566426.6666666667, ans=0.2 2023-10-04 07:01:00,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:01:01,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:01:01,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:01:03,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:01:03,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 07:01:04,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:01:05,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:01:05,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:01:07,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:01:07,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 07:01:08,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1566426.6666666667, ans=0.125 2023-10-04 07:01:12,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:01:14,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:01:17,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 07:01:19,942 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 07:01:20,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1566493.3333333333, ans=0.125 2023-10-04 07:01:21,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:01:23,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:01:24,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:01:26,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:01:28,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 07:01:30,154 INFO [train.py:1046] (1/4) Epoch 45, batch 1250, loss[loss=0.1398, simple_loss=0.2183, pruned_loss=0.0306, over 23498.00 frames. 
], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.0373, over 4701977.23 frames. ], batch size: 134, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:01:32,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:01:33,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:01:34,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 07:01:37,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:01:38,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:01:41,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:01:42,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:01:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:01:44,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:01:47,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:01:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 07:01:51,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:01:51,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:01:52,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:01:54,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:01:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:57,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:02:02,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 07:02:03,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:02:05,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:02:06,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 07:02:06,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:02:06,756 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. 
Duration: 0.541 2023-10-04 07:02:06,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:08,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:10,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.74 vs. limit=22.5 2023-10-04 07:02:10,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:02:13,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:02:13,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:02:16,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 07:02:16,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 07:02:16,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1566760.0, ans=0.0 2023-10-04 07:02:17,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 07:02:20,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:02:20,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. 
Duration: 0.88 2023-10-04 07:02:20,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:20,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1566760.0, ans=0.125 2023-10-04 07:02:26,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 07:02:26,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:02:28,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 07:02:28,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:02:29,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:02:29,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:02:30,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:02:32,845 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.067e+02 2.229e+02 2.579e+02 4.151e+02, threshold=4.458e+02, percent-clipped=0.0 2023-10-04 07:02:32,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 07:02:34,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:02:36,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:02:37,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:02:40,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:02:43,075 INFO [train.py:1046] (1/4) Epoch 45, batch 1300, loss[loss=0.1757, simple_loss=0.2473, pruned_loss=0.05209, over 19919.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03739, over 4685577.56 frames. ], batch size: 389, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:02:43,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:02:43,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 07:02:46,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:02:47,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:02:47,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:02:48,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 07:02:50,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1566893.3333333333, ans=0.125 2023-10-04 07:02:51,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:02:52,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 07:02:53,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1566893.3333333333, ans=0.125 2023-10-04 07:02:55,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1566893.3333333333, ans=0.0 2023-10-04 07:02:56,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:02:59,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:03:00,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 07:03:04,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:03:09,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:11,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:03:14,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:03:14,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:14,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:03:15,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:03:15,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 07:03:19,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:03:19,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:03:21,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 07:03:22,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:03:25,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:03:28,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:03:28,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. 
Duration: 0.96 2023-10-04 07:03:28,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:03:28,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 07:03:31,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:03:36,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:03:36,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:03:41,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 07:03:42,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 07:03:43,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 07:03:45,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1567160.0, ans=0.125 2023-10-04 07:03:46,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:03:49,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 07:03:50,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:03:50,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1567160.0, ans=0.0 2023-10-04 07:03:56,819 INFO [train.py:1046] (1/4) Epoch 45, batch 1350, loss[loss=0.1557, simple_loss=0.2419, pruned_loss=0.03471, over 24457.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03707, over 4692301.09 frames. ], batch size: 66, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:03:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 07:04:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:02,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:04,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:04:06,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:07,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:04:07,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:04:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:04:12,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. 
Duration: 0.79 2023-10-04 07:04:12,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1567293.3333333333, ans=0.0 2023-10-04 07:04:15,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:04:15,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:04:19,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 07:04:20,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:04:21,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:04:21,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 07:04:23,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1567293.3333333333, ans=0.05 2023-10-04 07:04:24,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 07:04:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 07:04:28,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:28,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-04 07:04:39,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:47,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:04:48,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 07:04:48,438 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.60 vs. limit=15.0 2023-10-04 07:04:50,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:04:52,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 07:04:52,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:04:53,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:55,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:04:57,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1567493.3333333333, ans=0.07 2023-10-04 07:04:58,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-10-04 07:04:58,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:05:01,350 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.135e+02 2.425e+02 2.910e+02 3.812e+02, threshold=4.850e+02, percent-clipped=0.0 2023-10-04 07:05:02,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 07:05:04,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 07:05:11,837 INFO [train.py:1046] (1/4) Epoch 45, batch 1400, loss[loss=0.1458, simple_loss=0.2254, pruned_loss=0.03308, over 23608.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2323, pruned_loss=0.03684, over 4699308.84 frames. ], batch size: 149, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:05:11,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 07:05:13,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:05:16,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:05:16,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:05:20,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 07:05:23,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. 
Duration: 0.86 2023-10-04 07:05:29,795 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.66 vs. limit=22.5 2023-10-04 07:05:33,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:05:37,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:05:38,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:05:38,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:05:42,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:05:44,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 07:05:53,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:05:53,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:05:57,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 07:05:57,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:05:57,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 07:05:59,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:06:00,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:06:00,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:06:00,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:06:02,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:06:03,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 07:06:03,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:06:09,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:12,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:06:18,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 07:06:20,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 07:06:21,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:06:22,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 07:06:24,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:25,403 INFO [train.py:1046] (1/4) Epoch 45, batch 1450, loss[loss=0.1524, simple_loss=0.2475, pruned_loss=0.02861, over 24286.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2325, pruned_loss=0.03652, over 4710453.69 frames. ], batch size: 74, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:06:25,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:06:27,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:06:29,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:06:29,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:29,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 07:06:36,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:37,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:06:38,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:06:38,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 07:06:40,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:06:40,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 07:06:41,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 07:06:43,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:06:44,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:06:45,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 07:06:45,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:47,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:06:47,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:06:50,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:52,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:06:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:06:55,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1568026.6666666667, ans=0.125 2023-10-04 07:06:56,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:56,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:57,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:06:59,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:59,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:06:59,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1568026.6666666667, ans=0.125 2023-10-04 07:07:03,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 07:07:07,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:07:11,204 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 07:07:11,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1568093.3333333333, ans=0.0 2023-10-04 07:07:11,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1568093.3333333333, ans=0.125 2023-10-04 07:07:12,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:07:14,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:07:14,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:15,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 07:07:19,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:20,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. 
Duration: 0.69 2023-10-04 07:07:20,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 07:07:20,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:23,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:07:23,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:07:25,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 07:07:26,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1568160.0, ans=0.0 2023-10-04 07:07:27,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1568160.0, ans=0.125 2023-10-04 07:07:28,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 07:07:30,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.783e+02 2.030e+02 2.333e+02 2.690e+02 5.278e+02, threshold=4.666e+02, percent-clipped=1.0 2023-10-04 07:07:30,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 07:07:32,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:33,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 07:07:37,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1568160.0, ans=0.05 2023-10-04 07:07:39,667 INFO [train.py:1046] (1/4) Epoch 45, batch 1500, loss[loss=0.1601, simple_loss=0.2471, pruned_loss=0.03658, over 24442.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2333, pruned_loss=0.03675, over 4719223.35 frames. ], batch size: 69, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:07:42,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 07:07:42,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:07:42,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:07:43,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:43,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:07:45,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:07:46,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 07:07:48,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:07:49,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 07:07:49,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:07:50,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:07:52,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:07:54,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:07:58,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:07:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 07:07:59,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:08:00,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:08:01,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:08:03,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 07:08:08,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. 
Duration: 0.76 2023-10-04 07:08:08,899 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:08:09,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:08:10,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 07:08:12,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:08:15,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:08:16,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:08:16,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:08:16,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 07:08:18,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:08:18,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:08:18,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 07:08:18,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:08:24,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:08:24,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 07:08:24,518 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:08:30,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:08:30,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:08:35,109 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 07:08:35,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:35,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.29 vs. limit=15.0 2023-10-04 07:08:36,472 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 07:08:36,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:08:38,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:08:38,517 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. 
Duration: 0.896 2023-10-04 07:08:39,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:08:42,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 07:08:43,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:46,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:08:47,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:48,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:08:48,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1568493.3333333333, ans=0.0 2023-10-04 07:08:49,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:49,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:08:50,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 07:08:52,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 07:08:52,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 07:08:53,555 INFO [train.py:1046] (1/4) Epoch 45, batch 1550, loss[loss=0.1628, simple_loss=0.2475, pruned_loss=0.03904, over 24046.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2351, pruned_loss=0.03757, over 4701225.14 frames. ], batch size: 80, lr: 2.26e-03, grad_scale: 4.0 2023-10-04 07:08:53,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 07:08:53,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 07:08:55,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:08:58,814 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:08:58,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:08:58,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:09:00,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:06,247 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 07:09:06,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:09:07,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:09:07,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:09:10,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:09:10,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 07:09:12,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:09:12,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 07:09:13,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 07:09:13,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 07:09:13,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:15,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:15,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1568626.6666666667, ans=0.125 2023-10-04 07:09:20,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:09:23,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 07:09:23,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 07:09:29,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:34,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:09:34,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:09:34,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:09:35,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 07:09:40,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:09:41,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:44,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:09:45,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1568760.0, ans=0.0 2023-10-04 07:09:47,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 07:09:47,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:49,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 07:09:49,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:09:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:09:50,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:51,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 07:09:51,988 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 07:09:55,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:59,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 07:10:00,710 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.950e+02 2.179e+02 2.499e+02 3.800e+02, threshold=4.358e+02, percent-clipped=0.0 2023-10-04 07:10:05,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:10:06,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:10:08,091 INFO [train.py:1046] (1/4) Epoch 45, batch 1600, loss[loss=0.1495, simple_loss=0.2417, pruned_loss=0.02862, over 24441.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2358, pruned_loss=0.03751, over 4720139.79 frames. ], batch size: 69, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:10:08,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 07:10:08,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:10:09,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:10:09,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:10:09,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:10:11,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:10:11,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1568893.3333333333, ans=0.125 2023-10-04 07:10:12,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:10:14,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 07:10:15,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. 
Duration: 0.92 2023-10-04 07:10:17,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 07:10:17,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1568893.3333333333, ans=0.0 2023-10-04 07:10:18,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:10:20,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 07:10:21,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:10:24,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:10:25,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1568960.0, ans=0.015 2023-10-04 07:10:28,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:10:28,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1568960.0, ans=0.1 2023-10-04 07:10:33,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 07:10:35,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:10:35,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. 
Duration: 0.83 2023-10-04 07:10:36,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:10:36,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 07:10:38,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1569026.6666666667, ans=0.125 2023-10-04 07:10:42,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 07:10:50,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:50,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 07:10:50,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1569026.6666666667, ans=0.125 2023-10-04 07:10:51,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:51,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1569093.3333333333, ans=0.0 2023-10-04 07:10:52,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:10:52,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:10:54,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-10-04 07:10:57,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1569093.3333333333, ans=0.125 2023-10-04 07:10:58,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:11:01,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:11:01,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:03,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:04,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:11:06,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:11:07,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:11:09,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:11:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:16,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 07:11:19,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 07:11:19,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:11:19,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 07:11:22,721 INFO [train.py:1046] (1/4) Epoch 45, batch 1650, loss[loss=0.1544, simple_loss=0.2277, pruned_loss=0.04056, over 23923.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03726, over 4724474.21 frames. ], batch size: 195, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:11:24,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:11:24,373 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:11:24,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1569226.6666666667, ans=0.125 2023-10-04 07:11:26,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:11:28,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:11:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 07:11:28,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-10-04 07:11:28,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 07:11:28,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 07:11:32,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:33,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:11:33,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:11:34,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:11:36,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:11:38,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 07:11:40,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:11:40,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:11:40,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:11:40,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 07:11:41,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 07:11:41,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 07:11:47,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:11:50,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:11:56,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 07:11:58,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:11:58,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1569360.0, ans=0.1 2023-10-04 07:12:00,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 07:12:03,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:06,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:12:06,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:12:07,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:12:08,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:12:08,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:11,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:11,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:13,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:12:14,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:12:14,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:12:16,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:12:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:12:20,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 07:12:21,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:12:21,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. 
Duration: 0.76 2023-10-04 07:12:23,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 07:12:23,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 07:12:23,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:12:23,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:12:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:25,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:25,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 07:12:27,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:29,079 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.045e+02 2.430e+02 3.009e+02 4.606e+02, threshold=4.861e+02, percent-clipped=4.0 2023-10-04 07:12:30,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:12:30,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:12:30,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1569493.3333333333, ans=0.125 2023-10-04 07:12:31,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.97 vs. limit=12.0 2023-10-04 07:12:32,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 07:12:36,953 INFO [train.py:1046] (1/4) Epoch 45, batch 1700, loss[loss=0.1546, simple_loss=0.2428, pruned_loss=0.03321, over 24640.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.03712, over 4714027.92 frames. ], batch size: 68, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:12:37,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:37,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:12:38,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 07:12:38,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:12:38,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:12:38,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:42,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 07:12:42,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:12:42,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 07:12:44,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:12:50,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1569626.6666666667, ans=0.0 2023-10-04 07:12:52,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:56,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:13:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:13:01,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:13:01,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:13:01,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1569626.6666666667, ans=0.125 2023-10-04 07:13:03,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 07:13:04,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 07:13:08,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:13:09,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:09,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:13:12,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:13:13,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 07:13:13,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 07:13:15,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:15,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 07:13:17,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:13:23,636 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.74 vs. limit=15.0 2023-10-04 07:13:24,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:13:25,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:27,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:13:30,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:13:30,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 07:13:30,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:13:31,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:31,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 07:13:31,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:13:31,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:13:32,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:32,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:13:36,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:13:36,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:13:36,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.49 vs. limit=15.0 2023-10-04 07:13:37,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:37,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:13:37,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:37,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1569826.6666666667, ans=0.125 2023-10-04 07:13:38,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.97 vs. limit=6.0 2023-10-04 07:13:40,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:13:41,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 07:13:43,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:44,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:13:47,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 07:13:50,347 INFO [train.py:1046] (1/4) Epoch 45, batch 1750, loss[loss=0.1624, simple_loss=0.2302, pruned_loss=0.04732, over 23805.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2338, pruned_loss=0.03696, over 4710730.75 frames. ], batch size: 212, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:13:51,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:52,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1569893.3333333333, ans=0.1 2023-10-04 07:13:53,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:13:53,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:13:55,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 07:13:55,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:55,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1569893.3333333333, ans=0.1 2023-10-04 07:13:59,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:13:59,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:14:02,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1569893.3333333333, ans=0.125 2023-10-04 07:14:03,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 07:14:07,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:14:10,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 07:14:10,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:14:11,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:14:13,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:14:14,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 07:14:16,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:14:16,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 07:14:24,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:14:26,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:14:26,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:14:30,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:30,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:14:31,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:14:35,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:35,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1570093.3333333333, ans=0.125 2023-10-04 07:14:38,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:14:38,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:14:39,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 07:14:41,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:14:43,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 07:14:45,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 07:14:47,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:14:47,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1570093.3333333333, ans=0.125 2023-10-04 07:14:48,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:14:51,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:14:51,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 07:14:52,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:54,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:14:56,584 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.970e+02 2.244e+02 2.815e+02 4.858e+02, threshold=4.488e+02, percent-clipped=0.0 2023-10-04 07:14:56,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1570160.0, ans=0.125 2023-10-04 07:14:58,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:14:58,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1570160.0, ans=0.125 2023-10-04 07:15:00,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:15:02,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:15:02,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 07:15:04,204 INFO [train.py:1046] (1/4) Epoch 45, batch 1800, loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03696, over 24347.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.233, pruned_loss=0.03673, over 4713370.75 frames. ], batch size: 61, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:15:04,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:15:06,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:15:06,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:06,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:15:06,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:15:08,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 07:15:09,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:15:10,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:15:12,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:15:12,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1570226.6666666667, ans=0.0 2023-10-04 07:15:13,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:15:16,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:15:17,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.21 vs. limit=12.0 2023-10-04 07:15:19,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:15:19,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1570293.3333333333, ans=0.2 2023-10-04 07:15:21,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:15:23,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 07:15:23,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:15:28,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:15:28,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 07:15:29,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:31,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 07:15:37,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 07:15:38,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 07:15:38,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:15:40,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:40,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:15:41,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:15:42,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1570360.0, ans=0.125 2023-10-04 07:15:48,329 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 07:15:49,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:15:50,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:52,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 07:15:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 07:15:52,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:15:55,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:15:55,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:15:57,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 07:16:05,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 07:16:05,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 07:16:06,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:16:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:16:07,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:16:08,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 07:16:09,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:16:11,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:16:14,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 07:16:14,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:16:17,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:16:17,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:16:17,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:16:18,864 INFO [train.py:1046] (1/4) Epoch 45, batch 1850, loss[loss=0.1573, simple_loss=0.2377, pruned_loss=0.03842, over 23933.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2333, pruned_loss=0.03681, over 4713176.56 frames. ], batch size: 196, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:16:18,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:16:20,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:16:20,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:16:21,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:16:23,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1570560.0, ans=0.125 2023-10-04 07:16:23,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1570560.0, ans=0.125 2023-10-04 07:16:24,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:16:24,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:16:24,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1570560.0, ans=0.125 2023-10-04 07:16:32,156 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.00 vs. 
limit=15.0 2023-10-04 07:16:33,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:16:33,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 07:16:35,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 07:16:39,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 07:16:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:16:44,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 07:16:44,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 07:16:51,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:16:53,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 07:16:55,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:16:56,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:17:01,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-10-04 07:17:01,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:03,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:17:04,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:17:06,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:17:09,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:17:13,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:17:13,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:13,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:17:13,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:15,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:17:17,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:17:21,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-10-04 07:17:21,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:17:24,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:17:25,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:17:25,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 07:17:25,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 07:17:27,196 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.009e+02 2.134e+02 2.480e+02 4.687e+02, threshold=4.268e+02, percent-clipped=1.0 2023-10-04 07:17:27,393 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 07:17:28,728 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 07:17:30,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:17:30,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:17:30,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:17:30,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:17:31,519 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 07:17:31,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:17:32,795 INFO [train.py:1046] (1/4) Epoch 45, batch 1900, loss[loss=0.1455, simple_loss=0.2278, pruned_loss=0.03157, over 23428.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2335, pruned_loss=0.03655, over 4709159.30 frames. ], batch size: 106, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:17:32,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:34,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:17:34,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:17:35,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:17:35,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 07:17:37,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:37,688 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 07:17:37,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 07:17:39,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:44,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:44,886 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.13 vs. limit=6.0 2023-10-04 07:17:45,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:17:47,355 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 07:17:47,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 07:17:50,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:17:50,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:17:50,624 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 07:17:51,898 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 07:17:54,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 07:17:56,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:18:00,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 07:18:01,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 07:18:03,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=15.0 2023-10-04 07:18:12,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 07:18:13,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 07:18:13,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:18:13,988 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 07:18:13,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 07:18:15,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 07:18:15,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1571026.6666666667, ans=0.0 2023-10-04 07:18:17,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 07:18:17,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:18:20,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 07:18:23,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:18:26,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:18:26,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 07:18:27,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:18:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 07:18:30,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:18:36,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:18:36,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:18:37,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:18:37,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:18:39,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 07:18:39,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:18:41,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:18:44,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:18:44,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:18:46,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:18:46,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:18:46,704 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.35 vs. limit=15.0 2023-10-04 07:18:47,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:18:47,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1571226.6666666667, ans=0.125 2023-10-04 07:18:48,848 INFO [train.py:1046] (1/4) Epoch 45, batch 1950, loss[loss=0.1629, simple_loss=0.257, pruned_loss=0.03439, over 24398.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2348, pruned_loss=0.03705, over 4712338.29 frames. ], batch size: 77, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:18:48,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:18:51,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:18:53,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:18:53,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:18:53,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:18:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 07:18:56,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1571226.6666666667, ans=0.0 2023-10-04 07:18:57,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 07:18:59,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:00,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:01,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:19:01,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:03,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:19:03,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:19:04,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:19:04,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:19:06,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:19:06,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:12,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:16,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:19:16,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:16,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:19:16,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 07:19:16,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:19:16,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:19:18,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:21,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:22,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:19:24,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:19:27,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:19:27,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:19:29,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 07:19:29,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:19:33,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:19:34,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:19:35,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:19:43,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:19:44,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:47,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:52,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:55,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:19:55,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:55,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.34 vs. limit=22.5 2023-10-04 07:19:56,527 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.043e+02 2.250e+02 2.599e+02 4.071e+02, threshold=4.500e+02, percent-clipped=0.0 2023-10-04 07:19:56,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 07:19:56,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:19:58,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:59,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. 
Duration: 0.75 2023-10-04 07:20:00,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:20:02,319 INFO [train.py:1046] (1/4) Epoch 45, batch 2000, loss[loss=0.1577, simple_loss=0.2534, pruned_loss=0.031, over 24339.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2352, pruned_loss=0.03706, over 4723459.79 frames. ], batch size: 74, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:20:05,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:20:06,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:20:07,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:20:08,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:20:10,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:11,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 07:20:13,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:20:18,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:20:19,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. 
Duration: 0.73 2023-10-04 07:20:19,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:20:19,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:20:22,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:20:23,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1571626.6666666667, ans=0.1 2023-10-04 07:20:26,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 07:20:27,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:28,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:28,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:30,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 07:20:32,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:20:33,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 07:20:33,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 07:20:36,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:20:37,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:20:37,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:37,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:20:38,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:20:39,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 07:20:43,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 07:20:43,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:20:43,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:20:45,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-10-04 07:20:47,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:48,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:20:49,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:20:49,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:20:50,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:20:52,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:52,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:20:52,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:52,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:55,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:20:57,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 07:21:01,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:21:03,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:21:03,792 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.93 vs. limit=15.0 2023-10-04 07:21:07,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:07,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:21:11,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:12,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:21:12,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:15,608 INFO [train.py:1046] (1/4) Epoch 45, batch 2050, loss[loss=0.1537, simple_loss=0.2354, pruned_loss=0.036, over 24307.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03712, over 4702562.00 frames. ], batch size: 61, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:21:15,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:21:15,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:21:18,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:18,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:21:19,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1571893.3333333333, ans=0.125 2023-10-04 07:21:20,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:21:21,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:26,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:21:26,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1571893.3333333333, ans=0.0 2023-10-04 07:21:27,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:21:27,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:29,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:21:32,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 07:21:32,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:21:33,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:21:33,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:21:33,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1571960.0, ans=0.125 2023-10-04 07:21:42,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:21:42,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:43,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 07:21:47,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:47,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 07:21:47,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:21:50,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:21:52,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:21:54,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:21:56,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:21:56,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1572026.6666666667, ans=0.1 2023-10-04 07:21:57,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:21:59,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:21:59,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:22:02,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:03,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:22:06,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:22:08,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:22:12,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:22:16,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:22:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-10-04 07:22:20,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1572160.0, ans=0.2 2023-10-04 07:22:23,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:22:24,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:22:25,688 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 2.015e+02 2.281e+02 2.581e+02 4.368e+02, threshold=4.563e+02, percent-clipped=0.0 2023-10-04 07:22:27,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:22:29,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 07:22:30,491 INFO [train.py:1046] (1/4) Epoch 45, batch 2100, loss[loss=0.1578, simple_loss=0.2322, pruned_loss=0.04166, over 23421.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03677, over 4701996.53 frames. ], batch size: 119, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:22:31,938 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 07:22:31,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:22:33,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:33,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 07:22:34,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:22:34,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 07:22:34,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 07:22:37,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:22:40,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:22:40,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:22:41,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1572226.6666666667, ans=0.125 2023-10-04 07:22:42,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:22:43,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:22:43,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 07:22:44,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:22:46,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-10-04 07:22:46,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 07:22:47,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:22:47,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:22:47,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 07:22:47,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 07:22:50,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1572293.3333333333, ans=0.2 2023-10-04 07:22:52,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 07:22:52,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:22:56,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:22:56,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:59,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:23:01,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. 
Duration: 0.69 2023-10-04 07:23:01,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:01,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 07:23:04,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 07:23:04,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:04,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 07:23:04,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 07:23:04,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 07:23:07,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:23:08,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:23:09,740 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.76 vs. limit=15.0 2023-10-04 07:23:11,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:23:11,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 07:23:13,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:16,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:16,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 07:23:16,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:16,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:16,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:16,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 07:23:18,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 07:23:19,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 07:23:23,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:23:25,445 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:23:26,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-10-04 07:23:32,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:33,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:23:33,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:23:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:23:34,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 07:23:35,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:23:37,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:37,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:23:38,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:23:38,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:40,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-10-04 07:23:40,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1572493.3333333333, ans=0.125 2023-10-04 07:23:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 07:23:42,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:23:44,318 INFO [train.py:1046] (1/4) Epoch 45, batch 2150, loss[loss=0.1574, simple_loss=0.251, pruned_loss=0.0319, over 24631.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2332, pruned_loss=0.03618, over 4711189.26 frames. ], batch size: 68, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:23:46,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:46,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:23:46,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:23:47,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:23:48,685 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.88 vs. limit=10.0 2023-10-04 07:23:53,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 07:23:55,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:23:56,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:57,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:23:57,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:23:58,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:23:58,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1572626.6666666667, ans=0.125 2023-10-04 07:24:01,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:01,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:24:01,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:24:07,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:07,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 07:24:11,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:24:12,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:24:14,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:14,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:15,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:15,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:24:15,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:24:15,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:24:17,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:24:18,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 07:24:20,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:24:20,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:21,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:24:22,598 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.03 vs. limit=12.0 2023-10-04 07:24:23,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:24:26,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:24:27,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:27,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:24:27,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1572760.0, ans=0.125 2023-10-04 07:24:29,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:29,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 07:24:29,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:24:31,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:33,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:33,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:24:35,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:24:35,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:36,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:36,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 07:24:38,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 07:24:38,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:24:38,254 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 07:24:38,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:38,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:24:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 07:24:39,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:24:39,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-10-04 07:24:39,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1572760.0, ans=0.1 2023-10-04 07:24:40,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 07:24:40,931 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 07:24:40,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 07:24:42,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:43,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:24:43,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:24:43,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:45,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:24:47,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:47,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:24:48,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1572826.6666666667, ans=0.125 2023-10-04 07:24:51,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1572826.6666666667, ans=0.2 2023-10-04 07:24:54,369 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.986e+02 2.257e+02 2.566e+02 4.341e+02, threshold=4.515e+02, percent-clipped=0.0 2023-10-04 07:24:55,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:24:55,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 07:24:57,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1572893.3333333333, ans=0.125 2023-10-04 07:24:58,662 INFO [train.py:1046] (1/4) Epoch 45, batch 2200, loss[loss=0.1403, simple_loss=0.2294, pruned_loss=0.02561, over 24347.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2331, pruned_loss=0.03623, over 4718007.36 frames. ], batch size: 61, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:25:00,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:25:00,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.66 vs. limit=15.0 2023-10-04 07:25:06,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:25:06,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:25:06,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:08,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:25:09,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:25:10,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:25:10,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 07:25:12,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1572960.0, ans=0.125 2023-10-04 07:25:14,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 07:25:18,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:25:18,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1572960.0, ans=0.125 2023-10-04 07:25:18,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1572960.0, ans=0.125 2023-10-04 07:25:22,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. 
Duration: 0.77 2023-10-04 07:25:26,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:26,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:25:26,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1572960.0, ans=0.0 2023-10-04 07:25:27,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:25:30,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:25:31,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 07:25:36,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:25:38,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:38,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 07:25:39,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1573026.6666666667, ans=0.125 2023-10-04 07:25:40,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:25:42,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:25:43,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:25:45,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:47,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 07:25:48,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1573093.3333333333, ans=0.125 2023-10-04 07:25:49,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:52,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 07:25:55,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:55,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:25:55,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:57,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:25:57,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:25:58,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:25:58,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:58,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:25:58,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:26:00,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:26:04,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:26:04,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:26:06,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:26:08,786 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 07:26:08,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:26:10,902 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 07:26:12,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:26:12,231 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-10-04 07:26:13,495 INFO [train.py:1046] (1/4) Epoch 45, batch 2250, loss[loss=0.1836, simple_loss=0.2567, pruned_loss=0.05525, over 23781.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.0372, over 4709219.93 frames. ], batch size: 164, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:26:13,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:26:14,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:26:16,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:26:17,750 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 07:26:17,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:26:18,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1573226.6666666667, ans=0.125 2023-10-04 07:26:20,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:26:26,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 07:26:26,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1573293.3333333333, ans=0.1 2023-10-04 07:26:26,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1573293.3333333333, ans=0.0 2023-10-04 07:26:28,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:26:30,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:32,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:26:32,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:26:38,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 07:26:38,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:26:39,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:26:41,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1573293.3333333333, ans=0.125 2023-10-04 07:26:42,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. 
Duration: 0.71 2023-10-04 07:26:42,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:26:44,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:45,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:26:48,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:26:48,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1573360.0, ans=0.2 2023-10-04 07:26:50,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:26:50,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:26:52,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 07:26:54,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:57,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:27:01,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:27:02,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:27:03,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:03,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:27:05,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:27:07,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:27:07,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1573426.6666666667, ans=0.0 2023-10-04 07:27:12,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:27:13,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:27:19,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:27:19,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:27:19,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1573493.3333333333, ans=0.2 2023-10-04 07:27:20,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 07:27:25,020 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.969e+02 2.340e+02 2.701e+02 4.212e+02, threshold=4.680e+02, percent-clipped=0.0 2023-10-04 07:27:25,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:27:26,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:27:26,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 07:27:26,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:28,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:27:29,710 INFO [train.py:1046] (1/4) Epoch 45, batch 2300, loss[loss=0.1685, simple_loss=0.2375, pruned_loss=0.04977, over 23683.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2345, pruned_loss=0.03698, over 4725292.91 frames. ], batch size: 232, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:27:29,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 07:27:33,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:27:34,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:39,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:27:39,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:27:43,152 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 07:27:43,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:51,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:27:51,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:27:52,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:27:53,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:53,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 07:27:53,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:27:55,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:27:56,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:27:59,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 07:28:01,753 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.11 vs. limit=15.0 2023-10-04 07:28:02,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:28:03,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-10-04 07:28:06,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:28:08,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1573693.3333333333, ans=0.0 2023-10-04 07:28:09,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:28:09,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:28:13,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:28:15,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:28:19,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:28:21,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 07:28:21,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:28:21,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 07:28:25,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:28:25,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:28:26,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:28:28,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:28:28,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:28:28,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 07:28:28,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:28:28,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 07:28:29,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:28:29,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:28:29,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 07:28:36,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:28:40,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:28:42,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.60 vs. limit=10.0 2023-10-04 07:28:43,545 INFO [train.py:1046] (1/4) Epoch 45, batch 2350, loss[loss=0.1527, simple_loss=0.2355, pruned_loss=0.03491, over 23535.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2355, pruned_loss=0.03716, over 4732770.49 frames. ], batch size: 120, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:28:43,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:28:43,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:28:44,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.40 vs. limit=12.0 2023-10-04 07:28:45,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:28:46,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:28:46,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 07:28:47,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:28:47,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 07:28:53,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:28:53,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 07:28:59,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 07:29:02,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:29:03,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1573960.0, ans=0.2 2023-10-04 07:29:04,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:04,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:04,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:29:06,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:29:06,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-10-04 07:29:09,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:29:13,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 07:29:15,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:29:15,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1574026.6666666667, ans=0.1 2023-10-04 07:29:18,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:29:18,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:29:19,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:29:21,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 07:29:22,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:29:23,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:29:23,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:29:25,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:29:28,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:29:30,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 07:29:31,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:29:31,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1574093.3333333333, ans=0.0 2023-10-04 07:29:33,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:33,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:29:36,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 07:29:37,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:29:39,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 07:29:39,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:29:43,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.71 vs. limit=10.0 2023-10-04 07:29:45,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-10-04 07:29:48,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1574160.0, ans=0.125 2023-10-04 07:29:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 07:29:49,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:29:49,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 07:29:50,595 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 07:29:50,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 07:29:52,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 07:29:53,380 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.034e+02 2.312e+02 2.597e+02 4.365e+02, threshold=4.623e+02, percent-clipped=0.0 2023-10-04 07:29:55,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:29:58,610 INFO [train.py:1046] (1/4) Epoch 45, batch 2400, loss[loss=0.145, simple_loss=0.2252, pruned_loss=0.03238, over 21533.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2359, pruned_loss=0.03705, over 4735718.67 frames. ], batch size: 47, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:29:58,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 07:30:01,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:30:03,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:30:04,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 07:30:04,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 07:30:12,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:30:12,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:30:15,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 07:30:16,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:30:17,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:17,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 07:30:22,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 07:30:23,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1574293.3333333333, ans=0.125 2023-10-04 07:30:24,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 07:30:29,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:30:29,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1574360.0, ans=0.0 2023-10-04 07:30:33,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 07:30:34,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:30:37,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:40,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1574426.6666666667, ans=0.0 2023-10-04 07:30:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:30:42,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 07:30:42,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:30:49,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:30:49,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1574426.6666666667, ans=0.2 2023-10-04 07:30:52,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:30:53,034 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.17 vs. limit=22.5 2023-10-04 07:30:55,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:30:55,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:30:55,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:30:55,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:30:55,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:30:57,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:30:57,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 07:30:57,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1574493.3333333333, ans=0.025 2023-10-04 07:31:01,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:31:03,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:31:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 07:31:03,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 07:31:03,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1574493.3333333333, ans=0.125 2023-10-04 07:31:05,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:31:05,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:31:07,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 07:31:07,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 07:31:08,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 07:31:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. 
Duration: 0.966 2023-10-04 07:31:10,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 07:31:11,743 INFO [train.py:1046] (1/4) Epoch 45, batch 2450, loss[loss=0.1394, simple_loss=0.2246, pruned_loss=0.02703, over 24290.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.03698, over 4741864.46 frames. ], batch size: 61, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:31:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:31:11,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:11,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:31:13,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 07:31:15,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:15,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:31:19,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:31:19,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:31:22,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:31:22,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:31:22,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 07:31:26,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:31:28,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:29,557 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.33 vs. limit=6.0 2023-10-04 07:31:31,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:31:32,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:31:32,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:31:33,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 07:31:36,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:37,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 07:31:39,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:31:44,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:31:44,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:31:45,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:31:46,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:48,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 07:31:49,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:31:52,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1574693.3333333333, ans=0.0 2023-10-04 07:31:55,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:31:57,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:57,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:31:58,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 07:31:58,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:32:00,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:32:02,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 07:32:04,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:32:04,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:32:07,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:32:07,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:32:13,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:32:13,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 07:32:14,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1574826.6666666667, ans=0.5 2023-10-04 07:32:15,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:32:15,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:32:15,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 07:32:17,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:32:18,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:32:21,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:32:22,524 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.042e+02 2.331e+02 2.732e+02 3.935e+02, threshold=4.662e+02, percent-clipped=0.0 2023-10-04 07:32:24,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:32:24,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:32:26,611 INFO [train.py:1046] (1/4) Epoch 45, batch 2500, loss[loss=0.1561, simple_loss=0.236, pruned_loss=0.03811, over 21945.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03655, over 4727336.36 frames. ], batch size: 48, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:32:28,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 07:32:28,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 07:32:28,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1574893.3333333333, ans=0.125 2023-10-04 07:32:36,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:32:42,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.55 vs. limit=15.0 2023-10-04 07:32:44,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:32:45,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:32:46,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:32:46,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 07:32:52,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:32:52,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:32:53,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:32:53,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 07:32:54,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 07:32:54,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:32:56,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:32:56,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 07:32:56,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:32:58,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 07:32:58,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:00,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:33:02,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:33:05,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:33:05,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 07:33:07,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:33:08,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:10,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1575093.3333333333, ans=0.125 2023-10-04 07:33:11,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:15,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1575093.3333333333, ans=0.125 2023-10-04 07:33:16,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:20,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:33:24,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:33:26,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 07:33:26,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:33:26,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:33:28,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 07:33:28,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:33:29,480 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 07:33:29,480 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 07:33:29,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 07:33:31,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1575160.0, ans=0.1 2023-10-04 07:33:32,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:33,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 07:33:33,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 07:33:35,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:33:35,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 07:33:37,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1575160.0, ans=0.2 2023-10-04 07:33:38,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. 
Duration: 0.98 2023-10-04 07:33:40,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:33:41,301 INFO [train.py:1046] (1/4) Epoch 45, batch 2550, loss[loss=0.1553, simple_loss=0.2376, pruned_loss=0.03654, over 23471.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.0369, over 4713991.52 frames. ], batch size: 106, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:33:43,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:33:44,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:33:47,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:33:48,153 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-10-04 07:33:48,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 07:33:48,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:33:49,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1575226.6666666667, ans=0.0 2023-10-04 07:33:49,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1575226.6666666667, ans=0.0 2023-10-04 07:33:53,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. 
Duration: 0.8 2023-10-04 07:33:53,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:33:53,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1575226.6666666667, ans=0.125 2023-10-04 07:33:56,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:58,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:33:58,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 07:33:59,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:33:59,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:33:59,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:34:00,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:34:02,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 07:34:02,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:34:02,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:34:02,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 07:34:16,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:34:20,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1575360.0, ans=0.1 2023-10-04 07:34:21,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:34:21,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:21,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:34:23,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:34:26,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1575426.6666666667, ans=0.125 2023-10-04 07:34:29,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:34:29,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1575426.6666666667, ans=0.0 2023-10-04 07:34:30,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 07:34:31,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:34:31,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:34:32,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:34:32,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:34:35,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:34:35,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:36,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1575426.6666666667, ans=0.125 2023-10-04 07:34:37,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1575426.6666666667, ans=0.2 2023-10-04 07:34:39,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:34:40,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 07:34:40,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:34:40,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:42,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:34:42,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:34:44,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:34:50,977 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.015e+02 2.178e+02 2.479e+02 3.759e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-04 07:34:51,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:34:52,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:34:55,042 INFO [train.py:1046] (1/4) Epoch 45, batch 2600, loss[loss=0.1519, simple_loss=0.2321, pruned_loss=0.03584, over 23805.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.03704, over 4719902.71 frames. ], batch size: 179, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:34:55,205 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 07:34:55,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1575560.0, ans=0.2 2023-10-04 07:34:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. 
Duration: 0.896 2023-10-04 07:34:58,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:35:00,058 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 07:35:00,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 07:35:00,169 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 07:35:04,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:35:04,331 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 07:35:05,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 07:35:05,771 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 07:35:08,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:35:11,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 07:35:11,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 07:35:13,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:35:14,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-10-04 07:35:16,556 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 07:35:16,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 07:35:24,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:35:24,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:35:25,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:35:25,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 07:35:27,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:35:31,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 07:35:37,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:35:38,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:35:38,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 07:35:38,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:35:38,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:35:40,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 07:35:45,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:35:45,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:35:46,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:35:50,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 07:35:50,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:35:51,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:35:59,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:35:59,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:35:59,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 07:36:00,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:36:00,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.33 vs. limit=15.0 2023-10-04 07:36:01,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:36:03,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:36:04,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1575826.6666666667, ans=0.09899494936611666 2023-10-04 07:36:08,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 07:36:10,052 INFO [train.py:1046] (1/4) Epoch 45, batch 2650, loss[loss=0.1529, simple_loss=0.2427, pruned_loss=0.03156, over 24471.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2356, pruned_loss=0.03685, over 4733219.71 frames. ], batch size: 69, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:36:10,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:10,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1575893.3333333333, ans=0.1 2023-10-04 07:36:10,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1575893.3333333333, ans=10.0 2023-10-04 07:36:11,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 07:36:16,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 07:36:16,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:18,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:36:20,236 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 07:36:20,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:36:23,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:25,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:36:27,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:36:29,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:36:31,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 07:36:31,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:36:31,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 07:36:33,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 07:36:36,029 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 07:36:37,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:36:40,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 07:36:40,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:36:40,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 07:36:45,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:45,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:36:45,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:45,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:36:45,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1576026.6666666667, ans=0.125 2023-10-04 07:36:49,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. 
Duration: 0.44 2023-10-04 07:36:49,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 07:36:52,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1576026.6666666667, ans=0.2 2023-10-04 07:36:53,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:36:53,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1576093.3333333333, ans=0.1 2023-10-04 07:36:57,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 07:36:57,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:58,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:36:58,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:36:59,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:36:59,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:37:01,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:37:03,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:37:04,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:37:05,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:37:05,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:37:06,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.22 vs. limit=6.0 2023-10-04 07:37:07,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:07,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:37:07,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:08,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:37:10,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:37:14,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:14,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 07:37:15,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:15,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 07:37:15,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1576160.0, ans=0.0 2023-10-04 07:37:17,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.28 vs. limit=22.5 2023-10-04 07:37:19,696 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.024e+02 2.233e+02 2.497e+02 3.556e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-04 07:37:19,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:37:21,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:22,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:24,510 INFO [train.py:1046] (1/4) Epoch 45, batch 2700, loss[loss=0.1522, simple_loss=0.2409, pruned_loss=0.03177, over 24614.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2367, pruned_loss=0.03712, over 4727157.74 frames. ], batch size: 68, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:37:24,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:24,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 07:37:26,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:29,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:37:29,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 07:37:31,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:37:34,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 07:37:35,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:37:35,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:35,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:38,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:37:38,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:38,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:37:38,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 07:37:38,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 07:37:40,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:37:41,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:37:43,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:37:43,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:43,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1576293.3333333333, ans=0.125 2023-10-04 07:37:45,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:37:47,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 07:37:47,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:37:50,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:37:50,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:37:50,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1576293.3333333333, ans=0.0 2023-10-04 07:37:58,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:37:58,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:37:58,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:37:58,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:38:01,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:04,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:38:04,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:38:04,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:38:09,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:09,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 07:38:15,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1576426.6666666667, ans=0.125 2023-10-04 07:38:16,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:38:16,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:38:19,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:38:19,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:22,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:23,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:24,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:38:24,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1576493.3333333333, ans=0.125 2023-10-04 07:38:25,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:28,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:38:28,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:38:30,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:38:31,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:38:31,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:38:35,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 07:38:35,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:39,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:38:39,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 07:38:40,774 INFO [train.py:1046] (1/4) Epoch 45, batch 2750, loss[loss=0.1647, simple_loss=0.2556, pruned_loss=0.03684, over 24401.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2358, pruned_loss=0.03714, over 4732420.97 frames. ], batch size: 69, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:38:40,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 07:38:40,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:38:42,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:38:43,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:45,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:45,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:38:45,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:49,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:38:49,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:38:49,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:38:49,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:49,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 07:38:49,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:38:49,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:38:55,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 07:38:56,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1576626.6666666667, ans=0.125 2023-10-04 07:38:57,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:38:58,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:59,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1576626.6666666667, ans=0.1 2023-10-04 07:39:00,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:39:01,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:39:01,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1576626.6666666667, ans=0.125 2023-10-04 07:39:02,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:04,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:39:04,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:39:04,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:08,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:39:08,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:39:09,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:39:10,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:11,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:39:13,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1576693.3333333333, ans=0.0 2023-10-04 07:39:16,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1576693.3333333333, ans=0.1 2023-10-04 07:39:17,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1576693.3333333333, ans=0.0 2023-10-04 07:39:18,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:20,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 07:39:20,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:25,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:25,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:39:26,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:39:27,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1576760.0, ans=0.125 2023-10-04 07:39:32,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:39:34,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:39:34,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 07:39:38,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:41,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 07:39:46,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:39:48,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 07:39:48,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 07:39:50,135 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.025e+02 2.213e+02 2.495e+02 4.523e+02, threshold=4.427e+02, percent-clipped=1.0 2023-10-04 07:39:50,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:39:51,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:39:51,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 07:39:51,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1576826.6666666667, ans=0.0 2023-10-04 07:39:53,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:39:54,450 INFO [train.py:1046] (1/4) Epoch 45, batch 2800, loss[loss=0.1551, simple_loss=0.2206, pruned_loss=0.04484, over 23724.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2345, pruned_loss=0.03688, over 4728237.36 frames. ], batch size: 179, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:39:54,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 07:39:55,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:39:56,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:39:56,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 07:39:56,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:57,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:59,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:59,330 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 07:39:59,330 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 07:40:03,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:40:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:40:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:40:08,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:40:09,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 07:40:12,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. 
Duration: 0.488875 2023-10-04 07:40:12,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1576960.0, ans=0.0 2023-10-04 07:40:14,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 07:40:14,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1576960.0, ans=0.2 2023-10-04 07:40:15,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:15,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:40:15,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:40:16,310 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.03 vs. limit=15.0 2023-10-04 07:40:19,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:40:19,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:19,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:40:21,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:40:24,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1577026.6666666667, ans=0.125 2023-10-04 07:40:28,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:40:30,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:40:32,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:34,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:40:34,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:40:39,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:40:39,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 07:40:39,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:40:40,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:40:41,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:40:45,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:40:46,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:50,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:40:51,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1577093.3333333333, ans=0.125 2023-10-04 07:40:52,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:40:52,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:52,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:40:52,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:40:52,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:40:53,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:55,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 07:40:55,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:40:56,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:40:56,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:40:59,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 07:41:00,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:00,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:41:00,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:41:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 07:41:08,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:41:08,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:41:08,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:41:09,407 INFO [train.py:1046] (1/4) Epoch 45, batch 2850, loss[loss=0.1425, simple_loss=0.2292, pruned_loss=0.02796, over 24316.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03661, over 4743839.85 frames. 
], batch size: 61, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:41:11,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:41:11,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. limit=6.0 2023-10-04 07:41:14,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:41:15,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:41:15,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:41:17,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:17,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1577226.6666666667, ans=0.2 2023-10-04 07:41:18,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:41:19,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:41:19,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 07:41:26,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. 
Duration: 0.43 2023-10-04 07:41:26,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:41:29,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 07:41:31,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:31,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1577293.3333333333, ans=0.1 2023-10-04 07:41:33,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 07:41:33,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 07:41:33,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1577293.3333333333, ans=0.125 2023-10-04 07:41:34,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:46,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:48,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:41:48,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:41:48,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 07:41:49,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:41:49,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:41:51,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:41:52,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 07:41:54,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:41:54,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:41:55,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:55,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:55,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1577426.6666666667, ans=0.125 2023-10-04 07:41:58,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:41:58,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1577426.6666666667, ans=0.2 2023-10-04 07:42:00,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:42:00,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:02,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:42:02,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:42:03,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:04,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:42:10,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:42:12,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 07:42:12,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 07:42:15,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:42:15,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:15,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. 
Duration: 0.56 2023-10-04 07:42:16,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:42:17,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:17,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:42:17,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:42:17,983 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 07:42:19,175 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.756e+02 2.081e+02 2.377e+02 2.931e+02 5.661e+02, threshold=4.754e+02, percent-clipped=4.0 2023-10-04 07:42:19,279 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 07:42:19,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:42:19,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:23,479 INFO [train.py:1046] (1/4) Epoch 45, batch 2900, loss[loss=0.1451, simple_loss=0.2277, pruned_loss=0.03124, over 24564.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2346, pruned_loss=0.03651, over 4751586.63 frames. ], batch size: 60, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:42:23,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-10-04 07:42:23,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:42:24,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:42:25,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 07:42:29,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:29,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 07:42:29,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 07:42:30,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:42:32,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:42:33,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:34,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:42:39,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:42:39,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:42:42,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:42:42,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 07:42:42,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:42:45,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:46,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 07:42:46,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 07:42:46,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1577626.6666666667, ans=0.05 2023-10-04 07:42:50,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:50,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 07:42:50,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:42:53,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:42:53,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. 
Duration: 0.67775 2023-10-04 07:42:56,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:57,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:57,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1577693.3333333333, ans=0.125 2023-10-04 07:43:02,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:43:05,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:08,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 07:43:08,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 07:43:08,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:43:13,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:43:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 07:43:14,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:43:21,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:43:30,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:43:30,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:43:30,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1577826.6666666667, ans=0.0 2023-10-04 07:43:32,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 07:43:33,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:33,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 07:43:34,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:43:34,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:43:38,096 INFO [train.py:1046] (1/4) Epoch 45, batch 2950, loss[loss=0.1508, simple_loss=0.2246, pruned_loss=0.03851, over 23746.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2355, pruned_loss=0.03649, over 4751942.08 frames. ], batch size: 164, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:43:42,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:43:42,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. 
Duration: 0.71 2023-10-04 07:43:44,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:43:44,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:47,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:43:47,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:43:49,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 07:43:49,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 07:43:49,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1577893.3333333333, ans=0.0 2023-10-04 07:43:50,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:43:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:43:57,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:43:58,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:43:59,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 07:44:01,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:44:03,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:44:03,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:44:06,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:44:07,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:44:07,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:44:11,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 07:44:17,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 07:44:17,108 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 07:44:17,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:44:20,350 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 07:44:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. 
Duration: 0.93 2023-10-04 07:44:21,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:44:23,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:44:23,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 07:44:23,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:44:24,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 07:44:25,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:44:25,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:44:26,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1578093.3333333333, ans=0.125 2023-10-04 07:44:30,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:44:31,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:44:31,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:32,733 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-10-04 07:44:32,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:44:32,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 07:44:38,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:39,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1578160.0, ans=0.1 2023-10-04 07:44:40,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:44:40,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 07:44:42,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:44:43,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 07:44:46,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:44:46,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1578160.0, ans=0.05 2023-10-04 07:44:48,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:44:48,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 07:44:48,807 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.66 vs. limit=15.0 2023-10-04 07:44:49,416 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.943e+02 2.173e+02 2.663e+02 4.538e+02, threshold=4.346e+02, percent-clipped=0.0 2023-10-04 07:44:49,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:49,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:44:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:44:51,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:44:51,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:44:52,366 INFO [train.py:1046] (1/4) Epoch 45, batch 3000, loss[loss=0.1515, simple_loss=0.2396, pruned_loss=0.03167, over 24437.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2358, pruned_loss=0.03675, over 4749501.62 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:44:52,366 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 07:45:04,919 INFO [train.py:1078] (1/4) Epoch 45, validation: loss=0.3664, simple_loss=0.2817, pruned_loss=0.2256, over 1125622.00 frames. 2023-10-04 07:45:04,920 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 07:45:04,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 07:45:05,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:45:07,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:45:08,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:45:09,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 07:45:11,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:45:14,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:45:14,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:45:16,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 07:45:17,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 07:45:19,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:45:19,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:45:20,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. 
Duration: 0.48 2023-10-04 07:45:20,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:45:20,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1578293.3333333333, ans=0.125 2023-10-04 07:45:25,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:45:35,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:45:41,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 07:45:41,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:45:44,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:45:44,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:45:44,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:45:47,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:45:47,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 07:45:48,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-10-04 07:45:49,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:45:50,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:45:53,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:45:53,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:45:54,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:45:54,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:45:57,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:45:57,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:45:57,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:45:58,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:46:00,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 07:46:01,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 07:46:01,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1578426.6666666667, ans=0.125 2023-10-04 07:46:03,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:03,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:46:07,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:07,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 07:46:11,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 07:46:11,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:46:11,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 07:46:12,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:46:15,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 07:46:18,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 07:46:18,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 07:46:19,805 INFO [train.py:1046] (1/4) Epoch 45, batch 3050, loss[loss=0.1512, simple_loss=0.2315, pruned_loss=0.03541, over 24418.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.237, pruned_loss=0.03699, over 4743619.58 frames. ], batch size: 58, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:46:19,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 07:46:19,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 07:46:19,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:46:21,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:46:21,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:21,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:46:21,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:22,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:46:25,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. 
Duration: 0.81 2023-10-04 07:46:28,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:46:31,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:31,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:46:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:36,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 07:46:42,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 07:46:43,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 07:46:43,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:46:47,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:46:48,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.92 vs. limit=12.0 2023-10-04 07:46:51,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1578693.3333333333, ans=0.0 2023-10-04 07:46:52,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:46:52,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:52,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:46:54,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:46:54,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:46:56,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:46:56,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:56,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:46:58,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:59,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:02,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:47:02,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 07:47:02,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:47:02,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:47:06,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:47:06,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:47:08,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:47:08,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:13,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:47:13,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:16,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1578760.0, ans=0.125 2023-10-04 07:47:20,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:20,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:47:20,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:47:21,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 07:47:21,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:47:23,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:47:23,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 07:47:24,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:47:24,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:26,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 07:47:27,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:30,496 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 2.029e+02 2.330e+02 2.767e+02 4.279e+02, threshold=4.661e+02, percent-clipped=0.0 2023-10-04 07:47:32,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:32,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1578893.3333333333, ans=0.125 2023-10-04 07:47:33,803 INFO [train.py:1046] (1/4) Epoch 45, batch 3100, loss[loss=0.1597, simple_loss=0.2472, pruned_loss=0.03611, over 24050.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2364, pruned_loss=0.03684, over 4745816.15 frames. 
], batch size: 80, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:47:33,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:47:34,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1578893.3333333333, ans=0.1 2023-10-04 07:47:35,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:47:38,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 07:47:41,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 07:47:42,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 07:47:45,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:47:48,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:47:48,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:51,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:47:55,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:48:01,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 07:48:06,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 07:48:06,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:06,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:48:06,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:48:06,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 07:48:08,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:48:08,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 07:48:08,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:48:10,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:11,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.35 vs. limit=15.0 2023-10-04 07:48:11,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-10-04 07:48:13,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:48:16,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:48:16,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 07:48:18,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 07:48:18,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:19,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:21,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:21,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:21,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:48:22,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:48:22,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:48:25,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 07:48:25,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:48:25,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:25,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 07:48:29,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:48:31,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 07:48:34,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:48:34,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 07:48:34,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:36,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:36,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 07:48:45,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 07:48:48,177 INFO [train.py:1046] (1/4) Epoch 45, batch 3150, loss[loss=0.1299, simple_loss=0.192, pruned_loss=0.03388, over 23350.00 frames. 
], tot_loss[loss=0.154, simple_loss=0.2347, pruned_loss=0.03664, over 4737134.66 frames. ], batch size: 285, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:48:48,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:48:48,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:51,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:48:51,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:48:52,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 07:48:52,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:48:52,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:48:53,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 07:48:55,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:56,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. 
Duration: 0.831 2023-10-04 07:48:57,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1579226.6666666667, ans=0.1 2023-10-04 07:49:01,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 07:49:01,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:49:02,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1579293.3333333333, ans=0.1 2023-10-04 07:49:03,345 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 07:49:03,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 07:49:06,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 07:49:06,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 07:49:06,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 07:49:06,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:49:06,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:49:07,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:49:10,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 07:49:11,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:49:11,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:49:13,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:49:14,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:49:20,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 07:49:20,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:49:22,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:49:23,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:49:23,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 07:49:26,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 07:49:27,564 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.46 vs. 
limit=5.0 2023-10-04 07:49:27,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:49:27,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 07:49:27,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 07:49:29,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:49:29,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:49:31,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:49:31,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:49:33,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 07:49:33,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:49:33,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:35,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:49:35,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:49:35,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 07:49:35,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:49:38,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 07:49:38,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:40,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 07:49:40,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 07:49:41,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:49:41,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:49:41,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 07:49:44,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 07:49:44,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:49:47,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:49:49,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:50,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:49:53,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:49:55,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:57,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 07:49:59,212 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.953e+02 2.189e+02 2.513e+02 3.532e+02, threshold=4.378e+02, percent-clipped=0.0 2023-10-04 07:50:02,612 INFO [train.py:1046] (1/4) Epoch 45, batch 3200, loss[loss=0.1393, simple_loss=0.225, pruned_loss=0.0268, over 24563.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.03626, over 4730671.36 frames. ], batch size: 60, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:50:04,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:50:04,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:50:07,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1579560.0, ans=0.125 2023-10-04 07:50:08,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:50:10,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:50:10,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 07:50:10,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:50:15,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:50:17,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:50:25,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:50:32,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1579693.3333333333, ans=0.125 2023-10-04 07:50:35,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 07:50:35,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:50:39,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 07:50:40,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:50:44,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 07:50:44,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:50:44,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:50:48,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 07:50:50,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 07:50:52,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 07:50:54,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 07:50:55,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:51:01,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:01,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:51:01,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:02,512 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 07:51:02,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-04 07:51:07,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:08,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 07:51:09,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 07:51:11,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 07:51:12,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 07:51:15,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:51:17,263 INFO [train.py:1046] (1/4) Epoch 45, batch 3250, loss[loss=0.1459, simple_loss=0.2197, pruned_loss=0.03608, over 24280.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2336, pruned_loss=0.03646, over 4730518.42 frames. ], batch size: 56, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:51:17,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:51:17,312 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 07:51:17,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:51:17,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:51:17,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1579893.3333333333, ans=0.125 2023-10-04 07:51:18,727 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 07:51:22,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:51:24,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:51:33,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:51:33,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 07:51:35,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:36,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:36,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:51:36,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:51:38,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:51:41,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:51:41,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:51:41,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:41,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:41,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:42,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:51:44,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:51:45,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:51:49,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:49,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:50,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:51,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:51:51,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 07:51:56,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 07:51:56,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:51:57,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:51:59,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:59,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:52:05,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:52:08,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1580093.3333333333, ans=0.1 2023-10-04 07:52:12,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:52:12,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:12,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 07:52:12,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:52:12,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 07:52:14,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:17,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 07:52:17,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 07:52:18,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:52:19,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:21,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:52:23,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:52:23,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:52:25,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:52:27,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:52:28,381 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.929e+02 2.141e+02 2.387e+02 3.299e+02, threshold=4.283e+02, percent-clipped=0.0 2023-10-04 07:52:28,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-10-04 07:52:28,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:30,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1580226.6666666667, ans=0.2 2023-10-04 07:52:31,158 INFO [train.py:1046] (1/4) Epoch 45, batch 3300, loss[loss=0.1431, simple_loss=0.2295, pruned_loss=0.02835, over 24310.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2344, pruned_loss=0.03633, over 4738887.20 frames. ], batch size: 61, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:52:31,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:52:31,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 07:52:34,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:52:34,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 07:52:37,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 07:52:38,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 07:52:38,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:41,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:52:41,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:52:41,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:43,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1580226.6666666667, ans=0.1 2023-10-04 07:52:43,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1580226.6666666667, ans=0.1 2023-10-04 07:52:44,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:52:45,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:52:46,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:47,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:52:48,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1580293.3333333333, ans=0.0 2023-10-04 07:52:48,589 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.45 vs. limit=10.0 2023-10-04 07:52:52,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. 
Duration: 0.79 2023-10-04 07:52:52,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:52:52,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:53,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:55,428 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 07:52:55,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:52:55,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:52:55,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:52:55,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:52:56,940 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 07:52:58,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:58,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:53:01,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:53:01,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 07:53:03,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 07:53:03,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:04,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:53:07,245 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 07:53:08,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 07:53:08,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:53:11,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 07:53:13,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:53:13,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1580360.0, ans=0.125 2023-10-04 07:53:16,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:53:17,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 07:53:19,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:20,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:53:20,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:53:20,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:53:21,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:53:21,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:23,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:53:23,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1580426.6666666667, ans=0.0 2023-10-04 07:53:25,082 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 07:53:26,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 07:53:27,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:53:29,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 07:53:29,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:32,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:53:32,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:32,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:53:32,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:33,156 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:53:34,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:53:34,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:36,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:53:39,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 07:53:39,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:53:41,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:53:42,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:53:42,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:45,547 INFO [train.py:1046] (1/4) Epoch 45, batch 3350, loss[loss=0.175, simple_loss=0.2648, pruned_loss=0.04254, over 24398.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2351, pruned_loss=0.03643, over 4740440.75 frames. ], batch size: 77, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:53:45,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:45,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:48,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:53:49,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:51,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:53:54,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:56,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 07:53:57,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:58,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:54:00,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 07:54:02,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 07:54:02,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:54:06,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 07:54:06,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 07:54:08,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:54:08,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:54:09,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:10,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 07:54:10,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:54:11,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:54:11,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-10-04 07:54:15,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:16,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:16,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1580693.3333333333, ans=0.1 2023-10-04 07:54:17,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:17,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:54:21,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:23,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:23,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:27,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:54:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:28,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1580760.0, ans=0.0 2023-10-04 07:54:30,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:30,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:32,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:34,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 07:54:35,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:54:35,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 07:54:35,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:54:37,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 07:54:38,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:40,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 07:54:45,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:46,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 07:54:46,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:54:47,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:54:49,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:54:54,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:54:55,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 07:54:57,148 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.935e+02 2.206e+02 2.412e+02 3.759e+02, threshold=4.413e+02, percent-clipped=0.0 2023-10-04 07:54:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:54:57,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:54:59,079 INFO [train.py:1046] (1/4) Epoch 45, batch 3400, loss[loss=0.156, simple_loss=0.2356, pruned_loss=0.03822, over 23294.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2359, pruned_loss=0.03659, over 4741543.87 frames. 
], batch size: 119, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:54:59,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:59,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 07:55:00,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:55:00,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 07:55:01,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:55:02,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:55:03,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:55:05,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:55:05,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 07:55:09,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 07:55:09,770 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 07:55:09,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 07:55:13,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=22.5 2023-10-04 07:55:13,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:55:13,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:55:14,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:16,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.46 vs. limit=15.0 2023-10-04 07:55:16,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:55:20,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:55:21,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 07:55:25,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:55:29,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 07:55:30,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:55:35,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:55:39,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 07:55:41,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1581026.6666666667, ans=0.125 2023-10-04 07:55:42,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1581093.3333333333, ans=0.0 2023-10-04 07:55:43,988 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:45,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:45,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 07:55:46,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:55:46,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:55:47,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:55:48,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:55:52,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:55,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:55:55,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:56:00,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:56:02,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 07:56:07,798 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:56:10,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:56:13,666 INFO [train.py:1046] (1/4) Epoch 45, batch 3450, loss[loss=0.1716, simple_loss=0.2527, pruned_loss=0.04529, over 23386.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2362, pruned_loss=0.03714, over 4734729.18 frames. ], batch size: 93, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:56:15,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 07:56:17,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. 
Duration: 0.73 2023-10-04 07:56:17,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:56:19,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:56:19,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 07:56:20,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:56:25,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:56:28,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:56:28,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:56:29,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:56:29,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:56:31,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:56:37,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. 
Duration: 0.7 2023-10-04 07:56:41,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1581293.3333333333, ans=0.125 2023-10-04 07:56:43,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 07:56:44,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:56:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:56:46,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:56:48,625 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.28 vs. limit=22.5 2023-10-04 07:56:49,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 07:56:50,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:56:55,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:56:56,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:56:57,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:56:57,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 07:56:59,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 07:56:59,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:57:01,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:57:02,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1581426.6666666667, ans=0.1 2023-10-04 07:57:05,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:57:07,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 07:57:12,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:57:14,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:57:16,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:19,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:23,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:23,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 07:57:23,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_na.min_abs, batch_count=1581493.3333333333, ans=0.02 2023-10-04 07:57:24,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:57:24,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:57:26,408 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 1.981e+02 2.112e+02 2.378e+02 3.937e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-04 07:57:27,805 INFO [train.py:1046] (1/4) Epoch 45, batch 3500, loss[loss=0.1594, simple_loss=0.23, pruned_loss=0.04444, over 23706.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03694, over 4730999.25 frames. ], batch size: 179, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:57:29,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:32,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:57:33,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 07:57:35,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:57:38,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 07:57:41,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 07:57:41,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 07:57:44,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1581626.6666666667, ans=0.09899494936611666 2023-10-04 07:57:48,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:57:48,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:57:48,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:57:50,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:57:50,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:57:50,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:51,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:57:51,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 07:57:54,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:54,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 07:57:55,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:57:58,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:58,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 07:58:00,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:58:03,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:58:05,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:58:07,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:09,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:58:11,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:58:12,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 07:58:12,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1581760.0, ans=0.125 2023-10-04 07:58:13,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. 
Duration: 0.61 2023-10-04 07:58:13,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 07:58:14,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:58:15,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1581760.0, ans=0.0 2023-10-04 07:58:16,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:16,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:58:16,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:58:20,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:58:20,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:58:25,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:58:25,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 07:58:25,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 07:58:25,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 07:58:26,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:58:27,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:58:29,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:29,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1581826.6666666667, ans=0.0 2023-10-04 07:58:31,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 07:58:31,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=1581826.6666666667, ans=0.2 2023-10-04 07:58:32,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:58:32,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:58:34,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 07:58:36,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.44 vs. limit=15.0 2023-10-04 07:58:37,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 07:58:40,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 07:58:42,164 INFO [train.py:1046] (1/4) Epoch 45, batch 3550, loss[loss=0.1532, simple_loss=0.226, pruned_loss=0.04016, over 23879.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.03668, over 4724159.22 frames. ], batch size: 195, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:58:42,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:58:42,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:58:42,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:58:45,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:58:46,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1581893.3333333333, ans=0.125 2023-10-04 07:58:52,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:58:53,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 07:58:57,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:58:57,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:58:59,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 07:59:00,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:59:00,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:59:04,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:59:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:59:05,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:59:05,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:59:05,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:59:10,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:59:10,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:59:12,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:59:12,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:59:13,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 07:59:13,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 07:59:13,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:15,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:16,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:59:20,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:59:20,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=1582026.6666666667, ans=0.5 2023-10-04 07:59:21,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:59:23,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:59:24,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 07:59:24,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:59:28,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 07:59:28,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 07:59:29,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:59:31,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:59:34,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 07:59:34,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1582093.3333333333, ans=0.0 2023-10-04 07:59:35,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:59:42,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:59:42,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 07:59:43,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:59:48,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:49,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 07:59:54,810 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.938e+02 2.121e+02 2.490e+02 3.705e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-04 07:59:54,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. 
Duration: 0.95 2023-10-04 07:59:54,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:59:56,218 INFO [train.py:1046] (1/4) Epoch 45, batch 3600, loss[loss=0.146, simple_loss=0.2309, pruned_loss=0.0305, over 24448.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03618, over 4725306.31 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:59:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:59:58,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:59:59,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:00:00,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:00:04,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:00:05,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1582226.6666666667, ans=0.125 2023-10-04 08:00:08,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:09,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:00:09,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 08:00:11,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:11,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 08:00:16,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:00:17,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:17,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1582293.3333333333, ans=0.0 2023-10-04 08:00:20,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:00:23,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:00:23,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:00:23,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:00:23,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 08:00:24,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:00:26,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:00:27,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:00:28,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1582360.0, ans=0.0 2023-10-04 08:00:30,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:00:31,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:00:32,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:00:34,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 08:00:36,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1582360.0, ans=0.05 2023-10-04 08:00:37,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1582360.0, ans=0.125 2023-10-04 08:00:42,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:00:43,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:00:45,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 08:00:48,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 08:00:51,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:00:51,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1582426.6666666667, ans=0.2 2023-10-04 08:00:55,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:01:00,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:01:00,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:01:00,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 08:01:02,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.59 vs. limit=10.0 2023-10-04 08:01:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 08:01:04,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 08:01:05,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.23 vs. limit=15.0 2023-10-04 08:01:06,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:01:06,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:01:07,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 08:01:07,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:01:07,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:01:07,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:01:10,166 INFO [train.py:1046] (1/4) Epoch 45, batch 3650, loss[loss=0.1548, simple_loss=0.2461, pruned_loss=0.03174, over 24455.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03628, over 4737008.18 frames. ], batch size: 69, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:01:10,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 08:01:10,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 08:01:15,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:01:15,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. 
Duration: 0.69 2023-10-04 08:01:19,134 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:01:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 08:01:21,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:01:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 08:01:25,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 08:01:30,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:01:30,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:01:31,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:01:34,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:01:34,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:01:35,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 08:01:35,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 08:01:35,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:01:35,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 08:01:37,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:01:37,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:01:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:01:40,490 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-10-04 08:01:41,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:01:42,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 08:01:43,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 08:01:45,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:01:47,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 08:01:49,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:01:49,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:01:53,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:01:54,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:01:54,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:01:56,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:01:57,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:01:59,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1582760.0, ans=0.125 2023-10-04 08:02:00,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:02:04,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:02:05,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:05,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:02:06,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1582760.0, ans=0.1 2023-10-04 08:02:07,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:02:07,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:02:08,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:02:15,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 08:02:20,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:02:20,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:02:21,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:02:21,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:21,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:02:24,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:02:25,456 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.054e+02 2.374e+02 2.935e+02 4.345e+02, threshold=4.749e+02, percent-clipped=1.0 2023-10-04 08:02:25,484 INFO [train.py:1046] (1/4) Epoch 45, batch 3700, loss[loss=0.1588, simple_loss=0.2312, pruned_loss=0.04318, over 23644.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.0367, over 4740786.22 frames. ], batch size: 135, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:02:26,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 08:02:26,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:29,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:02:29,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1582893.3333333333, ans=0.05 2023-10-04 08:02:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:02:31,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:02:33,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:33,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 08:02:33,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:02:35,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:02:35,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:02:36,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:02:40,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:02:40,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:02:41,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:02:41,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:41,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:02:43,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1582960.0, ans=0.125 2023-10-04 08:02:44,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:02:46,347 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 08:02:55,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 08:02:55,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:02:56,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:02:56,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 08:02:56,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:02:59,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1583026.6666666667, ans=0.1 2023-10-04 08:03:00,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:02,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 08:03:02,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:03,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:03:06,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:06,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:03:09,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-04 08:03:14,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:03:14,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 08:03:14,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:03:14,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 08:03:17,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1583093.3333333333, ans=0.125 2023-10-04 08:03:20,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:03:21,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:03:24,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:03:24,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 08:03:26,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:03:26,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:03:26,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 08:03:27,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:03:29,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:03:29,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 08:03:30,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 08:03:31,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:03:32,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:33,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:03:35,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:03:38,463 INFO [train.py:1046] (1/4) Epoch 45, batch 3750, loss[loss=0.1437, simple_loss=0.2253, pruned_loss=0.03101, over 24470.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2355, pruned_loss=0.03764, over 4728944.77 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:03:38,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:40,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 08:03:41,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:03:43,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 08:03:44,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 08:03:47,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:03:47,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 08:03:49,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:03:50,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:54,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:54,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:03:58,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:04:00,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:04:01,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 08:04:01,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1583293.3333333333, ans=0.125 2023-10-04 08:04:02,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:04:05,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:04:06,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 08:04:07,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1583360.0, ans=0.09899494936611666 2023-10-04 08:04:08,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:04:08,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.07 vs. limit=22.5 2023-10-04 08:04:09,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:04:09,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:04:15,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 08:04:17,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. 
Duration: 0.82 2023-10-04 08:04:19,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:04:19,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:04:21,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:04:25,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:04:26,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 08:04:28,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 08:04:32,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:04:34,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:04:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:04:39,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:04:43,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:04:45,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 08:04:47,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:04:48,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1583493.3333333333, ans=0.09899494936611666 2023-10-04 08:04:49,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:04:51,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:04:52,546 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.048e+02 2.267e+02 2.690e+02 4.764e+02, threshold=4.534e+02, percent-clipped=1.0 2023-10-04 08:04:52,573 INFO [train.py:1046] (1/4) Epoch 45, batch 3800, loss[loss=0.143, simple_loss=0.2213, pruned_loss=0.03229, over 23750.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2353, pruned_loss=0.03767, over 4716425.65 frames. ], batch size: 149, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:04:58,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:05:02,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:02,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 08:05:04,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 08:05:05,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:05:07,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:07,864 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.69 vs. limit=15.0 2023-10-04 08:05:09,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 08:05:10,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:05:10,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:11,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:05:12,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1583626.6666666667, ans=0.1 2023-10-04 08:05:13,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:05:13,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:05:13,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:14,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-10-04 08:05:18,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:05:18,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:05:21,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:23,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:05:25,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:05:26,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:05:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:28,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:29,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:32,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 08:05:32,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1583693.3333333333, ans=0.0 2023-10-04 08:05:33,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. 
Duration: 0.7 2023-10-04 08:05:35,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:05:39,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1583760.0, ans=0.125 2023-10-04 08:05:42,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:05:46,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:05:48,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 08:05:50,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 08:05:51,080 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.77 vs. limit=22.5 2023-10-04 08:05:51,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:53,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:05:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:56,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 08:05:59,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. 
Duration: 0.99 2023-10-04 08:05:59,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 08:05:59,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:00,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:06:02,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1583826.6666666667, ans=0.2 2023-10-04 08:06:06,071 INFO [train.py:1046] (1/4) Epoch 45, batch 3850, loss[loss=0.147, simple_loss=0.2316, pruned_loss=0.03121, over 24620.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2336, pruned_loss=0.03716, over 4718919.76 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:06:06,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:06:07,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:06:12,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:06:12,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 08:06:13,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 08:06:14,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1583893.3333333333, ans=0.5 2023-10-04 08:06:15,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:19,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:06:21,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:06:24,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:06:24,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 08:06:31,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:32,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:33,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:06:35,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:06:38,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:06:38,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:06:38,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:06:38,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:06:39,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:06:42,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:06:45,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:45,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:06:45,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 08:06:45,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 08:06:45,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1584026.6666666667, ans=0.04949747468305833 2023-10-04 08:06:46,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:06:46,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:48,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:06:50,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:50,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 08:06:53,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 08:06:53,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:06:55,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 08:06:58,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 08:07:03,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:04,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:07:06,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=15.0 2023-10-04 08:07:08,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:07:08,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 08:07:12,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 08:07:12,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:13,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:16,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:07:16,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:07:17,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:19,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:19,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:07:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 08:07:21,120 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.982e+02 2.141e+02 2.479e+02 3.654e+02, threshold=4.281e+02, percent-clipped=0.0 2023-10-04 08:07:21,147 INFO [train.py:1046] (1/4) Epoch 45, batch 3900, loss[loss=0.1575, simple_loss=0.2503, pruned_loss=0.03236, over 24481.00 frames. 
], tot_loss[loss=0.1535, simple_loss=0.2331, pruned_loss=0.03695, over 4712799.00 frames. ], batch size: 69, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:07:21,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:07:21,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 08:07:21,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:21,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:24,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:07:24,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:26,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:07:26,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:27,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:07:27,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-10-04 08:07:29,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:32,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:07:33,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:07:33,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:07:33,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:07:37,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:07:37,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:38,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:07:40,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 08:07:40,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:07:42,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 08:07:43,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:07:44,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 08:07:46,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 08:07:48,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1584293.3333333333, ans=0.1 2023-10-04 08:07:49,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:07:51,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:07:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:07:52,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:07:55,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1584360.0, ans=0.2 2023-10-04 08:07:56,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:07:58,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:08:01,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:08:01,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:08:02,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:08:06,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:08:06,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:08:15,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:08:17,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:08:17,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1584426.6666666667, ans=0.2 2023-10-04 08:08:23,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:08:26,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:08:27,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 08:08:27,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 08:08:27,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:08:29,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. 
Duration: 0.84 2023-10-04 08:08:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:08:31,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 08:08:31,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1584493.3333333333, ans=0.125 2023-10-04 08:08:35,209 INFO [train.py:1046] (1/4) Epoch 45, batch 3950, loss[loss=0.15, simple_loss=0.2294, pruned_loss=0.0353, over 23463.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2326, pruned_loss=0.03665, over 4698353.69 frames. ], batch size: 106, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:08:36,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:08:38,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 08:08:38,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:08:41,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:08:42,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:08:47,050 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 08:08:47,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 08:08:48,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 08:08:48,521 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 08:08:48,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1584626.6666666667, ans=10.0 2023-10-04 08:08:50,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:08:55,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:08:55,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:08:55,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:08:57,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 08:09:00,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:09:02,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:09:02,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:09:02,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 08:09:02,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:09:04,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1584693.3333333333, ans=0.1 2023-10-04 08:09:12,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:09:12,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:09:14,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1584693.3333333333, ans=0.0 2023-10-04 08:09:17,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 08:09:18,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1584760.0, ans=0.0 2023-10-04 08:09:20,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1584760.0, ans=0.0 2023-10-04 08:09:23,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 08:09:23,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 08:09:23,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:09:24,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 08:09:29,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1584760.0, ans=0.0 2023-10-04 08:09:32,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:09:32,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:09:32,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:09:33,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:09:33,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 08:09:33,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1584826.6666666667, ans=0.07 2023-10-04 08:09:38,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:09:39,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 08:09:39,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1584826.6666666667, ans=0.0 2023-10-04 08:09:42,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1584826.6666666667, ans=0.05 2023-10-04 08:09:43,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1584826.6666666667, ans=6.0 2023-10-04 08:09:44,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 08:09:49,586 INFO [train.py:1046] (1/4) Epoch 45, batch 4000, loss[loss=0.1385, simple_loss=0.2184, pruned_loss=0.02927, over 24602.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2334, pruned_loss=0.03682, over 4706025.92 frames. ], batch size: 60, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:09:51,385 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.026e+02 2.265e+02 2.595e+02 5.973e+02, threshold=4.529e+02, percent-clipped=1.0 2023-10-04 08:09:54,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:09:58,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1584893.3333333333, ans=0.125 2023-10-04 08:10:00,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:06,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:10:06,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:10:08,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:08,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 08:10:09,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:10:09,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 08:10:09,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:10:09,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 08:10:09,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1584960.0, ans=0.1 2023-10-04 08:10:12,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:13,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:10:13,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1584960.0, ans=0.1 2023-10-04 08:10:15,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:10:15,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:10:15,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:10:15,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:10:17,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:10:19,742 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 08:10:19,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:10:19,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:24,300 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 08:10:24,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:10:24,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:10:25,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1585026.6666666667, ans=0.025 2023-10-04 08:10:28,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1585026.6666666667, ans=0.125 2023-10-04 08:10:29,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 08:10:30,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1585026.6666666667, ans=0.2 2023-10-04 08:10:31,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:10:34,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:10:35,856 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 08:10:37,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:10:37,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 08:10:37,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:10:38,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:40,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 08:10:40,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:10:41,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:10:41,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:10:42,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 08:10:42,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 08:10:44,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-10-04 08:10:49,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:10:52,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 08:10:55,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:10:55,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:55,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:10:56,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:10:58,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1585160.0, ans=0.125 2023-10-04 08:10:59,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:11:03,979 INFO [train.py:1046] (1/4) Epoch 45, batch 4050, loss[loss=0.1509, simple_loss=0.2308, pruned_loss=0.03549, over 23642.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2341, pruned_loss=0.0368, over 4715081.81 frames. ], batch size: 120, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:11:04,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:11:04,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 08:11:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:11:06,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:07,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:11:08,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:11:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 08:11:12,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:11:15,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:11:17,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 08:11:18,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:11:18,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:11:19,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1585293.3333333333, ans=0.125 2023-10-04 08:11:22,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:11:23,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:11:27,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 08:11:29,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 08:11:29,316 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 08:11:31,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 08:11:38,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 08:11:38,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:11:38,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1585360.0, ans=0.2 2023-10-04 08:11:40,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1585360.0, ans=0.125 2023-10-04 08:11:42,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:45,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:11:45,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:11:45,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:50,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:11:51,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 08:11:53,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:11:54,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:11:56,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 08:11:58,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:11:59,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1585426.6666666667, ans=0.1 2023-10-04 08:12:08,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 08:12:08,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:12:08,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:12:09,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 08:12:09,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 08:12:09,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:10,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=15.0 2023-10-04 08:12:12,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:12:12,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:12:13,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:12:15,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1585493.3333333333, ans=0.0 2023-10-04 08:12:17,832 INFO [train.py:1046] (1/4) Epoch 45, batch 4100, loss[loss=0.1494, simple_loss=0.2235, pruned_loss=0.0377, over 23700.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.03683, over 4720868.99 frames. ], batch size: 164, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:12:20,952 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 1.978e+02 2.170e+02 2.458e+02 4.039e+02, threshold=4.339e+02, percent-clipped=0.0 2023-10-04 08:12:21,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 08:12:22,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 08:12:24,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 08:12:25,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 08:12:25,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:25,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:27,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:12:27,331 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:12:27,409 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 08:12:31,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:12:32,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:12:32,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:32,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:12:38,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:12:39,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:12:40,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:12:40,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 08:12:40,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:40,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 08:12:42,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:12:42,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:12:42,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 08:12:45,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:12:45,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 08:12:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:12:50,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:12:50,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 08:12:51,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:12:52,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:12:52,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:12:56,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. 
Duration: 0.96 2023-10-04 08:12:58,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:12:59,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:13:00,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 08:13:00,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:13:00,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:13:05,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:13:11,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:14,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:13:15,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:13:21,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:13:21,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:13:26,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 08:13:27,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:13:30,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:13:30,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:13:32,182 INFO [train.py:1046] (1/4) Epoch 45, batch 4150, loss[loss=0.1397, simple_loss=0.2212, pruned_loss=0.02914, over 24453.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2348, pruned_loss=0.03688, over 4723613.22 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:13:32,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:13:32,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:13:33,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 08:13:35,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:35,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 08:13:36,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 08:13:36,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. 
Duration: 0.89 2023-10-04 08:13:37,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:43,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:13:43,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:13:46,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1585960.0, ans=0.125 2023-10-04 08:13:47,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:13:48,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:13:48,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:13:51,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:13:51,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:13:53,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:13:57,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:14:00,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 08:14:02,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 08:14:04,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 08:14:04,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:14:07,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 08:14:07,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:14:07,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:14:07,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1586026.6666666667, ans=0.125 2023-10-04 08:14:11,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:12,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:14:15,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 08:14:18,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:14:19,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 08:14:21,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 08:14:21,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1586093.3333333333, ans=0.125 2023-10-04 08:14:22,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:14:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 08:14:24,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:14:26,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:14:26,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:26,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 08:14:26,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:14:26,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:14:29,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:14:31,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. 
Duration: 0.98 2023-10-04 08:14:31,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:31,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:14:31,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:14:33,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 08:14:34,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:14:34,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:14:34,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:14:35,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:37,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 08:14:37,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:14:41,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 08:14:42,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1586160.0, ans=0.125 2023-10-04 08:14:43,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1586160.0, ans=0.1 2023-10-04 08:14:44,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 08:14:45,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=5.02 vs. limit=5.0 2023-10-04 08:14:46,511 INFO [train.py:1046] (1/4) Epoch 45, batch 4200, loss[loss=0.155, simple_loss=0.2446, pruned_loss=0.03269, over 24658.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03673, over 4705521.94 frames. ], batch size: 73, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:14:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:14:49,037 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.036e+02 2.349e+02 2.806e+02 3.824e+02, threshold=4.697e+02, percent-clipped=0.0 2023-10-04 08:14:49,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:14:49,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:14:51,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:14:51,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:14:53,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 08:14:57,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 08:14:58,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:00,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:15:02,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:15:05,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:15:08,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:15:08,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:10,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 08:15:10,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:15:11,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:11,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:15:13,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:15:13,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:15:14,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 08:15:14,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:15:20,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:15:22,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:15:23,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:15:26,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1586360.0, ans=0.125 2023-10-04 08:15:28,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:15:28,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 08:15:28,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:15:29,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:15:34,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1586426.6666666667, ans=0.0 2023-10-04 08:15:35,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:15:35,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:15:39,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:15:43,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 08:15:45,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:15:47,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1586493.3333333333, ans=0.2 2023-10-04 08:15:48,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1586493.3333333333, ans=0.0 2023-10-04 08:15:50,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:15:51,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:15:54,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. 
Duration: 0.94 2023-10-04 08:15:59,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:16:01,106 INFO [train.py:1046] (1/4) Epoch 45, batch 4250, loss[loss=0.1508, simple_loss=0.2379, pruned_loss=0.03184, over 24444.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03654, over 4713793.33 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:16:04,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:16:04,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:16:05,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:10,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:16:10,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 08:16:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:16:12,180 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.83 vs. limit=10.0 2023-10-04 08:16:13,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1586560.0, ans=0.125 2023-10-04 08:16:14,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:16:16,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1586626.6666666667, ans=0.0 2023-10-04 08:16:18,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:16:23,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:23,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:23,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1586626.6666666667, ans=0.2 2023-10-04 08:16:25,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:16:25,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:16:27,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:28,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:29,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1586626.6666666667, ans=0.125 2023-10-04 08:16:30,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:16:33,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:16:34,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:16:36,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 08:16:39,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 08:16:39,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:40,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:16:40,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:43,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:16:43,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:43,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:46,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:16:48,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 08:16:50,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:16:53,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:16:53,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 08:16:53,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:16:53,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1586760.0, ans=0.125 2023-10-04 08:16:55,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 08:16:55,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:16:56,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:17:00,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:17:00,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:17:01,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 08:17:03,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 08:17:04,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:17:08,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:17:09,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1586826.6666666667, ans=0.125 2023-10-04 08:17:11,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:17:11,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:17:13,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:17:15,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:17:16,425 INFO [train.py:1046] (1/4) Epoch 45, batch 4300, loss[loss=0.1459, simple_loss=0.2246, pruned_loss=0.03365, over 23577.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2325, pruned_loss=0.03642, over 4708564.63 frames. ], batch size: 149, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:17:16,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:17:16,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:17:16,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-10-04 08:17:18,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:17:19,313 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.964e+02 2.175e+02 2.393e+02 4.014e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-04 08:17:22,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:17:22,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:17:28,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:17:34,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:17:34,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 08:17:34,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:17:37,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:17:37,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:17:37,649 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-10-04 08:17:40,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1586960.0, ans=0.0 2023-10-04 08:17:41,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:17:43,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:17:45,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1587026.6666666667, ans=0.0 2023-10-04 08:17:46,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 08:17:46,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:17:46,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 08:17:49,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:17:50,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:17:52,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:17:52,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:17:53,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 08:17:54,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:17:54,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:17:56,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 08:17:57,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 08:17:59,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:18:02,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:02,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:18:02,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:04,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:18:04,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 08:18:04,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 08:18:04,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-10-04 08:18:05,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:18:05,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 08:18:06,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 08:18:08,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:18:10,104 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 08:18:11,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:18:13,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:13,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:18:14,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 08:18:16,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:18:16,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:16,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:18:17,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:18:17,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:18:20,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:18:23,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:24,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:24,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:18:28,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 08:18:29,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:18:30,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1587226.6666666667, ans=0.0 2023-10-04 08:18:30,995 INFO [train.py:1046] (1/4) Epoch 45, batch 4350, loss[loss=0.1557, simple_loss=0.2364, pruned_loss=0.0375, over 23316.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03674, over 4702684.05 frames. 
], batch size: 93, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:18:31,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1587226.6666666667, ans=0.04949747468305833 2023-10-04 08:18:33,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:18:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:36,580 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-10-04 08:18:37,814 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=15.0 2023-10-04 08:18:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:18:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:18:44,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:18:48,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:51,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:18:51,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:18:54,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:18:56,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:18:58,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:19:02,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 08:19:04,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:19:05,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:08,903 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.47 vs. limit=22.5 2023-10-04 08:19:10,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:13,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 08:19:15,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:17,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:19:21,177 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. 
Duration: 0.768 2023-10-04 08:19:23,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:19:23,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:19:25,208 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 08:19:25,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 08:19:25,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:19:26,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:19:26,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:19:27,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:19:29,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:19:29,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:19:32,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 08:19:32,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:19:32,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:32,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 08:19:35,940 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 08:19:35,944 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 08:19:35,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 08:19:38,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:19:38,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:19:40,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:19:40,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:19:42,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 08:19:44,207 INFO [train.py:1046] (1/4) Epoch 45, batch 4400, loss[loss=0.137, simple_loss=0.2185, pruned_loss=0.02775, over 17012.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2346, pruned_loss=0.03694, over 4705533.44 frames. 
], batch size: 36, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:19:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 08:19:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:46,926 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.000e+02 2.178e+02 2.504e+02 3.543e+02, threshold=4.357e+02, percent-clipped=0.0 2023-10-04 08:19:49,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:19:49,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:50,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:52,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 08:19:52,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 08:19:53,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 08:19:53,509 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 08:19:54,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:19:54,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:19:56,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 08:19:58,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:00,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:00,303 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 08:20:03,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:03,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 08:20:05,095 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 08:20:08,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 08:20:08,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 08:20:08,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 08:20:09,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:09,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:20:09,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:20:11,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:20:13,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 08:20:13,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 08:20:15,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:16,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:20:16,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:18,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:18,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:18,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 08:20:20,210 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 08:20:23,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:20:24,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1587693.3333333333, ans=0.05 2023-10-04 08:20:29,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:20:31,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 08:20:35,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:20:39,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:20:42,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:20:42,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 08:20:43,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:20:43,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:20:43,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:20:44,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:20:47,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-10-04 08:20:51,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 08:20:53,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 08:20:53,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:53,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 08:20:54,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:20:56,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1587893.3333333333, ans=0.125 2023-10-04 08:20:57,387 INFO [train.py:1046] (1/4) Epoch 45, batch 4450, loss[loss=0.1608, simple_loss=0.2446, pruned_loss=0.03852, over 24017.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03743, over 4713388.86 frames. ], batch size: 80, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:20:57,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:21:00,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 08:21:02,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:21:04,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:21:04,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:21:12,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:12,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:21:15,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:18,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:21:21,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:21:21,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:21:22,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 08:21:22,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:21:23,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:23,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:21:23,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 08:21:25,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:21:27,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1588026.6666666667, ans=0.125 2023-10-04 08:21:31,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:31,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:32,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:21:34,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:21:34,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:21:38,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 08:21:40,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 08:21:40,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 08:21:40,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:21:43,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:21:43,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 08:21:45,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1588093.3333333333, ans=0.0 2023-10-04 08:21:47,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:21:50,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:50,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 08:21:50,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:50,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:21:50,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:21:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:51,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.17 vs. limit=6.0 2023-10-04 08:21:53,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 08:21:58,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 08:21:59,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:21:59,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1588160.0, ans=0.125 2023-10-04 08:22:01,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:22:02,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:22:03,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:22:03,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:22:06,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:22:10,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 08:22:11,844 INFO [train.py:1046] (1/4) Epoch 45, batch 4500, loss[loss=0.1436, simple_loss=0.226, pruned_loss=0.03063, over 24491.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2356, pruned_loss=0.03754, over 4701548.14 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:22:13,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 08:22:13,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1588226.6666666667, ans=0.125 2023-10-04 08:22:15,286 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.063e+02 2.420e+02 3.061e+02 5.300e+02, threshold=4.841e+02, percent-clipped=1.0 2023-10-04 08:22:16,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:22:17,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1588226.6666666667, ans=0.0 2023-10-04 08:22:18,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 08:22:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 08:22:18,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=1588226.6666666667, ans=0.1 2023-10-04 08:22:18,866 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.37 vs. limit=10.0 2023-10-04 08:22:20,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:22:23,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:22:25,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 08:22:27,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:22:28,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:22:28,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:22:28,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:22:40,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:22:40,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:22:42,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1588360.0, ans=0.1 2023-10-04 08:22:43,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:22:44,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:22:44,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 08:22:46,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1588360.0, ans=0.125 2023-10-04 08:22:51,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:22:56,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:22:58,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:23:00,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:23:00,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 08:23:02,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:02,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:06,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:06,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:23:08,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:23:08,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. 
Duration: 0.69 2023-10-04 08:23:08,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:23:08,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:23:12,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:23:16,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:18,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:23:20,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:23:20,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 08:23:22,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.67 vs. limit=15.0 2023-10-04 08:23:22,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 08:23:22,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 08:23:25,639 INFO [train.py:1046] (1/4) Epoch 45, batch 4550, loss[loss=0.1793, simple_loss=0.2651, pruned_loss=0.04672, over 23348.00 frames. 
], tot_loss[loss=0.1545, simple_loss=0.2339, pruned_loss=0.03759, over 4669857.08 frames. ], batch size: 93, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:23:25,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 08:23:29,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 08:23:29,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1588560.0, ans=0.125 2023-10-04 08:23:30,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:23:30,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1588560.0, ans=0.1 2023-10-04 08:23:31,520 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.47 vs. limit=15.0 2023-10-04 08:23:33,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:23:33,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:23:33,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1588560.0, ans=0.0 2023-10-04 08:23:36,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:23:41,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:23:42,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:44,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:23:44,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:23:44,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:44,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1588626.6666666667, ans=0.125 2023-10-04 08:23:45,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1588626.6666666667, ans=0.1 2023-10-04 08:23:46,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:23:46,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:23:50,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:23:52,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 08:23:54,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-10-04 08:23:54,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:23:55,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 08:23:56,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.49 vs. limit=22.5 2023-10-04 08:24:00,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 08:24:01,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:24:03,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 08:24:04,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:24:07,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:07,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:07,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:24:10,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 08:24:13,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:24:16,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:16,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:24:17,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:24:18,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 08:24:19,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 08:24:19,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:24:20,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 08:24:22,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 08:24:24,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:24:25,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:24:25,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:24:27,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:24:27,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:24:28,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:24:30,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 08:24:31,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:24:31,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 08:24:33,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 08:24:33,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:24:33,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 08:24:33,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1588826.6666666667, ans=0.0 2023-10-04 08:24:36,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:24:36,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:24:38,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:24:38,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:40,216 INFO [train.py:1046] (1/4) Epoch 45, batch 4600, loss[loss=0.1375, simple_loss=0.2121, pruned_loss=0.03146, over 22636.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2333, pruned_loss=0.03698, over 4687901.21 frames. ], batch size: 50, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:24:40,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:24:40,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:24:42,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:24:43,542 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.936e+02 2.256e+02 2.645e+02 3.814e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-04 08:24:46,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:24:46,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:24:49,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:24:49,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:24:50,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:24:52,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 08:24:52,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:24:56,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:24:56,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:24:59,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:01,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.76 vs. limit=10.0 2023-10-04 08:25:06,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 08:25:06,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:09,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:09,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1589026.6666666667, ans=0.0 2023-10-04 08:25:13,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:25:13,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:25:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 08:25:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:25:17,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:25:23,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:24,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:25:26,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:25:29,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 08:25:29,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:25:34,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:35,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:25:38,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:38,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. 
Duration: 0.4555625 2023-10-04 08:25:38,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:39,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 08:25:39,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:39,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:41,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:41,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:25:43,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:43,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 08:25:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 08:25:44,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 08:25:44,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:25:47,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:25:47,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:25:49,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:54,417 INFO [train.py:1046] (1/4) Epoch 45, batch 4650, loss[loss=0.1391, simple_loss=0.2189, pruned_loss=0.02969, over 23355.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2334, pruned_loss=0.03664, over 4705206.50 frames. ], batch size: 134, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:26:00,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:26:03,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:26:04,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:26:04,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:26:04,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:26:04,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:26:06,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:26:09,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-10-04 08:26:12,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:26:12,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1589293.3333333333, ans=0.125 2023-10-04 08:26:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 08:26:15,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:26:15,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 08:26:17,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:26:18,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 08:26:18,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 08:26:18,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:18,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:26:21,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:26:21,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:26:22,985 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 08:26:25,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:27,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 08:26:29,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1589360.0, ans=0.0 2023-10-04 08:26:30,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:30,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:26:32,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 08:26:33,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:26:36,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:26:36,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1589360.0, ans=0.2 2023-10-04 08:26:37,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:26:42,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1589426.6666666667, ans=0.125 2023-10-04 08:26:43,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:45,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:47,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:47,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:26:47,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.62 vs. limit=22.5 2023-10-04 08:26:49,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 08:26:49,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 08:26:51,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 08:26:51,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 08:26:54,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:00,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 08:27:00,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:00,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 08:27:00,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:02,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:02,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:27:03,724 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:27:04,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:27:06,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:27:06,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:07,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:27:08,963 INFO [train.py:1046] (1/4) Epoch 45, batch 4700, loss[loss=0.1352, simple_loss=0.2183, pruned_loss=0.026, over 24325.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2337, pruned_loss=0.03666, over 4706109.43 frames. 
], batch size: 61, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:27:10,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:11,761 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.046e+02 2.404e+02 2.908e+02 6.182e+02, threshold=4.807e+02, percent-clipped=8.0 2023-10-04 08:27:11,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:27:11,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:27:13,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 08:27:15,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:27:16,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 08:27:22,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:23,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:23,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:27:25,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:27:26,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:27:32,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 08:27:32,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 08:27:33,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.61 vs. limit=10.0 2023-10-04 08:27:34,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:35,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:27:35,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:27:38,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:44,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:27:45,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:27:48,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:27:55,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 08:27:56,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:27:56,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1589760.0, ans=0.0 2023-10-04 08:27:58,227 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:27:59,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:01,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 08:28:02,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:02,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.02 vs. limit=15.0 2023-10-04 08:28:07,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:28:07,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 08:28:09,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:11,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:28:15,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:28:15,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:28:15,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 08:28:15,976 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 08:28:17,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:20,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:20,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:20,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 08:28:20,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1589826.6666666667, ans=0.125 2023-10-04 08:28:21,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:22,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1589893.3333333333, ans=0.1 2023-10-04 08:28:22,904 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.36 vs. 
limit=15.0 2023-10-04 08:28:23,203 INFO [train.py:1046] (1/4) Epoch 45, batch 4750, loss[loss=0.1724, simple_loss=0.2555, pruned_loss=0.04467, over 23308.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2345, pruned_loss=0.03659, over 4723173.07 frames. ], batch size: 93, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:28:26,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 08:28:27,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:28:28,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:28:31,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:28:31,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:28:33,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 08:28:34,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:28:37,045 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:28:38,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 08:28:39,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 08:28:39,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:40,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:28:47,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 08:28:52,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:28:52,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1590026.6666666667, ans=0.1 2023-10-04 08:28:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 08:28:55,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:28:57,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:57,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:58,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:29:00,182 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 08:29:00,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-10-04 08:29:00,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1590026.6666666667, ans=0.125 2023-10-04 08:29:05,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 08:29:07,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:09,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:12,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:29:12,274 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 08:29:12,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:29:15,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:29:16,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:29:18,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 08:29:18,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 08:29:18,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:29:19,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:29:19,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:22,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:29:22,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 08:29:23,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 08:29:24,063 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:29:27,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:29:27,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1590160.0, ans=0.125 2023-10-04 08:29:30,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:29:30,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 08:29:30,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:29:31,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:29:32,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:29:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:34,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:29:37,035 INFO [train.py:1046] (1/4) Epoch 45, batch 4800, loss[loss=0.1629, simple_loss=0.2372, pruned_loss=0.04428, over 23654.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.236, pruned_loss=0.03687, over 4713765.93 frames. ], batch size: 256, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:29:38,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:29:38,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 08:29:38,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 08:29:40,328 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.033e+02 2.288e+02 2.597e+02 3.954e+02, threshold=4.576e+02, percent-clipped=0.0 2023-10-04 08:29:41,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 08:29:43,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:29:44,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:29:44,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 08:29:48,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1590226.6666666667, ans=0.125 2023-10-04 08:29:49,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:50,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:29:55,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:29:56,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:56,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:56,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 08:29:58,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:29:59,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:30:01,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 08:30:02,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1590293.3333333333, ans=0.2 2023-10-04 08:30:03,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:05,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:05,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:30:07,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1590360.0, ans=0.125 2023-10-04 08:30:07,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1590360.0, ans=0.0 2023-10-04 08:30:08,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:08,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 08:30:08,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:09,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:12,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:30:13,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1590360.0, ans=0.125 2023-10-04 08:30:14,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:16,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:17,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:30:18,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:30:20,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:21,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 08:30:21,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 08:30:21,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:21,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:30:21,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:30:21,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:30:21,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:30:24,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:30:24,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1590426.6666666667, ans=0.125 2023-10-04 08:30:25,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:30:29,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:30:29,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:30,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:30:35,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 08:30:35,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:35,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1590493.3333333333, ans=0.125 2023-10-04 08:30:36,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:36,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 08:30:37,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:42,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:30:42,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:30:42,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:44,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:30:44,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:30:44,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:30:47,024 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.40 vs. limit=6.0 2023-10-04 08:30:48,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:30:48,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:50,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:30:51,425 INFO [train.py:1046] (1/4) Epoch 45, batch 4850, loss[loss=0.1612, simple_loss=0.253, pruned_loss=0.03471, over 24315.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2367, pruned_loss=0.03731, over 4705811.18 frames. ], batch size: 74, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:30:51,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 08:30:53,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 08:30:54,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:54,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:54,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:30:54,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:56,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1590560.0, ans=0.2 2023-10-04 08:30:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:31:03,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 08:31:04,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:31:09,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:31:09,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:31:09,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:31:14,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:31:14,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1590626.6666666667, ans=0.0 2023-10-04 08:31:15,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:31:17,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:31:17,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 08:31:20,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:31:23,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:31:23,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 08:31:24,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:31:24,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 08:31:27,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:31:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:27,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1590693.3333333333, ans=0.125 2023-10-04 08:31:31,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:31,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 08:31:31,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 08:31:32,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:31:37,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1590760.0, ans=0.0 2023-10-04 08:31:39,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:31:39,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-10-04 08:31:41,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:31:41,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:31:45,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:31:46,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 08:31:46,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:48,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 08:31:48,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:31:49,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:31:49,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 08:31:54,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1590826.6666666667, ans=0.125 2023-10-04 08:31:58,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:03,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 08:32:03,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:05,194 INFO [train.py:1046] (1/4) Epoch 45, batch 4900, loss[loss=0.1619, simple_loss=0.2313, pruned_loss=0.04626, over 23784.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2361, pruned_loss=0.03746, over 4685545.77 frames. ], batch size: 212, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:32:08,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 08:32:08,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:32:09,335 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 2.000e+02 2.208e+02 2.639e+02 4.240e+02, threshold=4.416e+02, percent-clipped=0.0 2023-10-04 08:32:12,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:12,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:32:12,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:32:14,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1590893.3333333333, ans=0.125 2023-10-04 08:32:16,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 08:32:22,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-10-04 08:32:25,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 08:32:26,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 08:32:27,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:32:27,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:32:27,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:32:27,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:27,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:32:28,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 08:32:33,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 08:32:33,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:32:34,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:32:36,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 08:32:37,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:32:39,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:40,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:40,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 08:32:42,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:32:43,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:43,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 08:32:43,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 08:32:47,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 08:32:50,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:32:51,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:32:51,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 08:32:52,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:52,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:32:52,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:32:53,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 08:32:56,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:56,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1591093.3333333333, ans=0.0 2023-10-04 08:32:57,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:32:58,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1591093.3333333333, ans=0.125 2023-10-04 08:32:59,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:33:00,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 08:33:02,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:33:02,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. 
Duration: 0.488875 2023-10-04 08:33:03,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 08:33:09,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:33:10,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:33:12,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1591160.0, ans=0.2 2023-10-04 08:33:13,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 08:33:13,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:33:13,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:33:15,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:33:19,774 INFO [train.py:1046] (1/4) Epoch 45, batch 4950, loss[loss=0.15, simple_loss=0.2263, pruned_loss=0.0369, over 23462.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2347, pruned_loss=0.03711, over 4692680.02 frames. ], batch size: 134, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:33:19,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:33:19,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 08:33:19,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:33:21,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 08:33:22,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:33:24,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:33:25,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:33:27,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 08:33:27,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 08:33:27,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:33:27,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1591226.6666666667, ans=0.0 2023-10-04 08:33:28,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 08:33:28,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:28,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 08:33:28,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:33:28,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:30,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:33:31,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:33:33,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:33:34,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:33:37,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:37,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:33:40,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:33:46,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:47,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:33:49,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:33:51,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:52,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:33:54,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 08:33:54,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 08:33:58,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:59,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:33:59,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:34:01,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:34:01,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:34:02,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:34:02,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1591426.6666666667, ans=0.0 2023-10-04 08:34:05,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:34:05,508 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:34:06,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:34:09,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:34:10,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:34:10,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:12,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 08:34:12,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:34:14,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:34:18,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:34:18,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:34:18,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:34:20,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:34:21,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:34:21,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:34:25,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:34:25,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:34:25,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:34:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 08:34:29,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:34:33,779 INFO [train.py:1046] (1/4) Epoch 45, batch 5000, loss[loss=0.1655, simple_loss=0.2397, pruned_loss=0.04571, over 23751.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2343, pruned_loss=0.03696, over 4702754.61 frames. ], batch size: 232, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:34:35,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 08:34:35,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 08:34:39,135 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.112e+02 2.442e+02 2.961e+02 4.557e+02, threshold=4.884e+02, percent-clipped=1.0 2023-10-04 08:34:43,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:43,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:34:43,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 08:34:44,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 08:34:47,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:34:48,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 08:34:49,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:34:49,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:34:49,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 08:34:50,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:34:50,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 08:34:53,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 08:34:53,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:34:53,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:34:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 08:34:56,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 08:34:56,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:34:57,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 08:34:57,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:34:57,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:34:58,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:34:58,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. 
Duration: 0.99 2023-10-04 08:34:58,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1591626.6666666667, ans=0.125 2023-10-04 08:34:59,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 08:34:59,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1591626.6666666667, ans=0.1 2023-10-04 08:35:00,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 08:35:00,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:35:00,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1591626.6666666667, ans=0.125 2023-10-04 08:35:02,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:03,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 08:35:03,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:35:04,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:06,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:35:07,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 08:35:09,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 08:35:10,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:35:11,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:35:15,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 08:35:17,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:35:17,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1591760.0, ans=0.1 2023-10-04 08:35:19,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:23,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 08:35:23,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:35:23,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:35:24,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1591760.0, ans=0.125 2023-10-04 08:35:25,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:35:27,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 08:35:29,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:35:29,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1591760.0, ans=0.1 2023-10-04 08:35:30,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:35:33,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:35:37,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 08:35:41,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:47,220 INFO [train.py:1046] (1/4) Epoch 45, batch 5050, loss[loss=0.1675, simple_loss=0.2575, pruned_loss=0.03873, over 24026.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2342, pruned_loss=0.0368, over 4711471.18 frames. ], batch size: 80, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:35:50,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:35:50,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:51,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:35:51,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:35:53,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:35:53,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:35:53,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:54,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1591893.3333333333, ans=0.1 2023-10-04 08:35:58,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:58,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 08:35:59,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 08:35:59,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1591893.3333333333, ans=0.2 2023-10-04 08:36:01,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:36:02,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:36:04,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 08:36:04,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:36:05,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:36:05,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1591960.0, ans=0.125 2023-10-04 08:36:07,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:36:08,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:36:08,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:36:16,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 08:36:16,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 08:36:18,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:36:18,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 08:36:18,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:36:21,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:21,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:36:23,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:36:23,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 08:36:24,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 08:36:25,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:27,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:36:30,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:32,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. 
Duration: 0.87 2023-10-04 08:36:33,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:36:35,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 08:36:36,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:36:36,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:36:38,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:36:38,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:36:38,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1592093.3333333333, ans=0.125 2023-10-04 08:36:40,120 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=22.5 2023-10-04 08:36:40,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:36:42,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:36:43,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:36:43,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:36:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:36:44,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 08:36:44,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:36:46,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:36:46,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1592160.0, ans=0.125 2023-10-04 08:36:46,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1592160.0, ans=0.1 2023-10-04 08:36:49,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:36:49,275 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 08:36:49,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:36:50,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:36:52,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:36:52,371 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 08:36:55,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:36:55,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 08:36:55,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:55,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1592160.0, ans=0.0 2023-10-04 08:36:57,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:36:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 08:37:00,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 08:37:01,992 INFO [train.py:1046] (1/4) Epoch 45, batch 5100, loss[loss=0.143, simple_loss=0.2382, pruned_loss=0.02393, over 24463.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03691, over 4711507.59 frames. ], batch size: 69, lr: 2.24e-03, grad_scale: 8.0 2023-10-04 08:37:03,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:37:03,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:03,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:37:05,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 08:37:07,265 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.979e+02 2.119e+02 2.358e+02 3.619e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-04 08:37:08,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:37:10,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 08:37:10,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 08:37:10,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:10,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1592226.6666666667, ans=0.125 2023-10-04 08:37:12,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:37:13,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:37:14,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. 
Duration: 0.96 2023-10-04 08:37:14,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 08:37:20,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:37:20,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:37:24,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:28,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 08:37:28,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:29,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:37:29,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:37:32,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:32,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:32,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 08:37:35,738 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. 
Duration: 0.896 2023-10-04 08:37:37,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:37,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 08:37:37,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 08:37:39,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:45,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1592426.6666666667, ans=0.125 2023-10-04 08:37:48,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:37:49,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 08:37:51,326 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 08:37:51,337 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 08:37:52,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 08:37:52,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:55,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. 
Duration: 0.89 2023-10-04 08:38:01,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 08:38:03,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:38:06,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:38:10,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 08:38:11,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1592493.3333333333, ans=0.125 2023-10-04 08:38:12,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:38:12,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 08:38:15,177 INFO [train.py:1046] (1/4) Epoch 45, batch 5150, loss[loss=0.1486, simple_loss=0.2393, pruned_loss=0.02895, over 24648.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.037, over 4717042.03 frames. ], batch size: 73, lr: 2.24e-03, grad_scale: 8.0 2023-10-04 08:38:18,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:38:18,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:38:18,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 08:38:18,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:38:18,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:38:19,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:38:20,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 08:38:20,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 08:38:22,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 08:38:22,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:38:22,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 08:38:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:38:25,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 08:38:26,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:38:26,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:38:33,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.63 vs. limit=22.5 2023-10-04 08:38:34,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:38:34,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 08:38:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:38:36,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:38:37,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:38:37,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:38:37,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:38:39,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:38:39,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:38:39,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 08:38:40,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 08:38:40,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:38:43,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:38:44,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 08:38:46,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:38:50,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:38:53,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 08:38:54,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:01,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:39:02,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:39:05,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:05,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:39:08,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-10-04 08:39:11,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:39:11,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1592760.0, ans=0.125 2023-10-04 08:39:12,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:39:12,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:39:15,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:16,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:39:17,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1592826.6666666667, ans=0.0 2023-10-04 08:39:18,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 08:39:18,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1592826.6666666667, ans=0.0 2023-10-04 08:39:22,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:39:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:39:27,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:39:27,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:39:28,936 INFO [train.py:1046] (1/4) Epoch 45, batch 5200, loss[loss=0.1399, simple_loss=0.2054, pruned_loss=0.03723, over 22684.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2359, pruned_loss=0.03757, over 4706911.93 frames. ], batch size: 322, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:39:28,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:39:29,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:39:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:39:29,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:39:32,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:39:33,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:39:35,578 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.121e+02 2.351e+02 2.872e+02 5.392e+02, threshold=4.702e+02, percent-clipped=2.0 2023-10-04 08:39:38,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:41,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. 
Duration: 0.71 2023-10-04 08:39:41,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:39:41,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:39:43,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.94 vs. limit=15.0 2023-10-04 08:39:44,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:45,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:39:45,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:39:47,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 08:39:47,811 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.10 vs. limit=22.5 2023-10-04 08:39:48,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.73 vs. limit=15.0 2023-10-04 08:39:49,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:39:49,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:39:52,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 08:39:53,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1592960.0, ans=0.125 2023-10-04 08:39:55,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:39:57,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:39:57,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 08:39:57,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 08:40:01,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 08:40:01,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:40:01,994 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 08:40:02,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:40:03,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:03,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 08:40:05,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 08:40:05,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:40:06,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:40:09,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 08:40:09,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 08:40:11,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 08:40:15,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 08:40:15,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1593093.3333333333, ans=0.125 2023-10-04 08:40:16,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:40:20,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:40:22,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:22,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. 
Duration: 0.57 2023-10-04 08:40:24,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:40:24,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 08:40:24,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:25,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:40:28,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:40:30,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:40:33,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:40:33,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:40:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:39,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:39,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 08:40:41,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 08:40:41,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:40:42,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:42,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:40:43,939 INFO [train.py:1046] (1/4) Epoch 45, batch 5250, loss[loss=0.1468, simple_loss=0.2308, pruned_loss=0.03142, over 23141.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2343, pruned_loss=0.03737, over 4703717.55 frames. ], batch size: 105, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:40:44,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:40:46,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:40:48,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:40:48,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:40:49,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.28 vs. limit=10.0 2023-10-04 08:40:50,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:40:57,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:40:57,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:41:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:41:01,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:41:03,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 08:41:03,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:41:06,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:41:09,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.whiten.whitening_limit, batch_count=1593293.3333333333, ans=15.0 2023-10-04 08:41:24,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1593360.0, ans=0.125 2023-10-04 08:41:27,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1593426.6666666667, ans=0.1 2023-10-04 08:41:47,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1593493.3333333333, ans=0.125 2023-10-04 08:41:52,484 INFO [train.py:1046] (1/4) Epoch 45, batch 5300, loss[loss=0.1511, simple_loss=0.2326, pruned_loss=0.03481, over 23487.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2344, pruned_loss=0.03728, over 4700556.53 frames. 
], batch size: 134, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:41:58,015 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 2.084e+02 2.270e+02 2.444e+02 3.408e+02, threshold=4.540e+02, percent-clipped=0.0 2023-10-04 08:42:01,272 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.93 vs. limit=6.0 2023-10-04 08:42:05,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1593626.6666666667, ans=0.125 2023-10-04 08:42:06,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:42:06,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 08:42:06,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 08:42:06,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:06,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:06,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:06,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:07,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:42:07,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:07,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:07,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:42:07,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:42:07,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 08:42:07,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 08:42:07,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 08:42:07,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:42:07,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 08:42:07,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 08:42:07,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:08,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 08:42:08,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:42:08,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:42:08,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:42:09,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:42:09,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:09,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:09,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:42:09,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:09,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:42:09,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:09,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:42:09,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-10-04 08:42:09,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:42:10,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:10,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 08:42:10,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 08:42:10,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:42:10,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:10,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 08:42:10,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 08:42:10,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:42:10,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:42:11,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:42:11,594 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-10-04 08:42:11,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 08:42:11,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:42:11,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:11,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 08:42:11,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 08:42:11,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 08:42:12,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:42:13,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1593640.0, ans=0.0 2023-10-04 08:42:13,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.30 vs. limit=15.0 2023-10-04 08:42:16,538 INFO [train.py:1046] (1/4) Epoch 46, batch 0, loss[loss=0.1476, simple_loss=0.2343, pruned_loss=0.03047, over 23350.00 frames. ], tot_loss[loss=0.1476, simple_loss=0.2343, pruned_loss=0.03047, over 23350.00 frames. 
], batch size: 93, lr: 2.22e-03, grad_scale: 32.0 2023-10-04 08:42:16,538 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 08:42:27,598 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([4.5247, 4.2774, 3.6266, 4.3952, 3.6740, 3.9047, 4.2283, 4.2503], device='cuda:1') 2023-10-04 08:42:28,891 INFO [train.py:1078] (1/4) Epoch 46, validation: loss=0.3372, simple_loss=0.2742, pruned_loss=0.2001, over 1125622.00 frames. 2023-10-04 08:42:28,891 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 08:42:28,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 08:42:29,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:42:32,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:42:36,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:36,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:42:36,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:36,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 08:42:39,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-10-04 08:42:40,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:42,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:45,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1593706.6666666667, ans=0.125 2023-10-04 08:42:46,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:46,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1593706.6666666667, ans=0.125 2023-10-04 08:42:47,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:47,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:42:47,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:42:49,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 08:42:49,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1593706.6666666667, ans=0.0 2023-10-04 08:42:49,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.53 vs. 
limit=15.0 2023-10-04 08:42:51,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:42:59,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:42:59,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:43:01,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 08:43:03,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:43:03,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:43:03,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1593773.3333333333, ans=0.2 2023-10-04 08:43:06,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:43:10,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:43:11,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1593840.0, ans=0.125 2023-10-04 08:43:13,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:43:17,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1593840.0, ans=0.125 2023-10-04 08:43:18,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 08:43:22,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 08:43:24,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:43:24,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:24,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:43:25,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:43:27,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 08:43:30,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:32,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:43:32,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1593906.6666666667, ans=0.125 2023-10-04 08:43:34,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:43:40,200 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 08:43:41,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:43:42,896 INFO [train.py:1046] (1/4) Epoch 46, batch 50, loss[loss=0.1406, simple_loss=0.2159, pruned_loss=0.03267, over 24454.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2349, pruned_loss=0.03592, over 1074035.11 frames. ], batch size: 58, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:43:44,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:43:46,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:43:46,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 08:43:47,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:43:47,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:43:48,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:43:51,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:43:52,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:43:55,253 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.65 vs. limit=15.0 2023-10-04 08:43:55,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 08:43:57,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:01,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:44:03,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 08:44:06,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 08:44:06,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:44:08,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:44:08,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:44:09,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.11 vs. limit=15.0 2023-10-04 08:44:10,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:44:11,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:44:11,706 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:44:12,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:44:12,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:21,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:44:21,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:44:21,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:44:22,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 08:44:23,064 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:44:24,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 08:44:25,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:44:25,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 08:44:26,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:44:28,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 08:44:37,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:44:37,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:44:38,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:44:41,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:44:41,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:44:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 08:44:44,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 08:44:45,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:44:47,275 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.970e+02 2.347e+02 2.916e+02 8.307e+02, threshold=4.693e+02, percent-clipped=7.0 2023-10-04 08:44:47,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:44:48,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:44:48,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:44:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 08:44:50,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 08:44:51,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 08:44:51,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:44:51,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:44:52,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 08:44:53,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 08:44:53,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:44:54,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:44:55,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:44:55,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:44:57,054 INFO [train.py:1046] (1/4) Epoch 46, batch 100, loss[loss=0.1436, simple_loss=0.2217, pruned_loss=0.03277, over 23494.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2371, pruned_loss=0.0372, over 1880702.83 frames. ], batch size: 106, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:44:57,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:45:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:45:04,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:45:07,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 08:45:07,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:45:10,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:45:10,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 08:45:10,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:45:10,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:45:10,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:45:13,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 08:45:15,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:45:16,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:16,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:45:16,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:45:20,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 08:45:21,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:22,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:45:23,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 08:45:24,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:45:28,776 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 08:45:28,801 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 08:45:30,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:45:30,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:45:34,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:45:35,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1594440.0, ans=0.2 2023-10-04 08:45:37,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:39,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:45,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:47,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 08:45:49,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-10-04 08:45:51,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.55 vs. limit=15.0 2023-10-04 08:45:52,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:45:52,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.63 vs. limit=15.0 2023-10-04 08:45:53,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:45:56,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:58,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:00,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1594573.3333333333, ans=0.0 2023-10-04 08:46:01,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:46:02,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:46:05,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:05,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:46:07,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:07,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:46:07,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:07,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 08:46:08,983 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 08:46:08,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:09,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:46:10,309 INFO [train.py:1046] (1/4) Epoch 46, batch 150, loss[loss=0.1521, simple_loss=0.2311, pruned_loss=0.03652, over 23666.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2382, pruned_loss=0.03741, over 2512179.82 frames. ], batch size: 256, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:46:10,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:10,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:10,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-04 08:46:10,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:46:11,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:46:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:11,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:13,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:14,365 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.08 vs. limit=15.0 2023-10-04 08:46:14,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:46:14,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:46:18,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:21,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:46:21,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 08:46:23,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:25,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:25,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:28,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:46:28,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:30,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1594706.6666666667, ans=0.95 2023-10-04 08:46:31,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1594706.6666666667, ans=0.125 2023-10-04 08:46:32,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 08:46:32,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 08:46:32,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 08:46:35,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:46:35,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:46:36,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:46:36,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:36,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:38,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:38,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:40,048 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 08:46:41,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:47,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:52,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:46:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. 
Duration: 0.99 2023-10-04 08:46:56,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1594840.0, ans=0.0 2023-10-04 08:46:57,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:46:57,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:57,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:46:59,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:47:00,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:47:00,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:47:00,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:02,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 08:47:04,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:06,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:06,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 08:47:06,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:47:07,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:09,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 08:47:12,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:47:13,935 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.011e+02 2.305e+02 2.782e+02 3.592e+02, threshold=4.611e+02, percent-clipped=0.0 2023-10-04 08:47:14,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:47:15,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:47:17,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:47:17,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 08:47:17,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:47:17,350 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 08:47:20,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:47:21,518 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.67 vs. limit=22.5 2023-10-04 08:47:22,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1594906.6666666667, ans=0.125 2023-10-04 08:47:24,666 INFO [train.py:1046] (1/4) Epoch 46, batch 200, loss[loss=0.1601, simple_loss=0.2365, pruned_loss=0.04183, over 23693.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2387, pruned_loss=0.03813, over 3007828.59 frames. ], batch size: 232, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:47:24,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:47:26,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:47:28,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 08:47:28,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:47:30,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:31,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 08:47:33,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:47:35,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:47:35,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:39,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:47:39,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:47:39,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:43,001 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=15.0 2023-10-04 08:47:46,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1595040.0, ans=0.1 2023-10-04 08:47:58,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:47:58,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:47:59,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:48:01,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:48:01,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-04 08:48:01,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:48:02,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:04,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:48:04,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:48:04,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:48:05,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 08:48:05,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:48:05,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:10,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:48:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:48:23,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.72 vs. 
limit=15.0 2023-10-04 08:48:25,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:25,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:48:26,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-10-04 08:48:32,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:35,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 08:48:35,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:35,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:48:35,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:48:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:48:36,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1595306.6666666667, ans=0.5 2023-10-04 08:48:37,885 INFO [train.py:1046] (1/4) Epoch 46, batch 250, loss[loss=0.1635, simple_loss=0.2505, pruned_loss=0.03826, over 24378.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2386, pruned_loss=0.0376, over 3387629.99 frames. 
], batch size: 77, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:48:37,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 08:48:38,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:48:39,400 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 08:48:40,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:41,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1595306.6666666667, ans=0.125 2023-10-04 08:48:44,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:48:44,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:46,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:47,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:48:47,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:49,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:48:49,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1595306.6666666667, ans=0.0 2023-10-04 08:48:51,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:49:02,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:49:02,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1595373.3333333333, ans=0.2 2023-10-04 08:49:05,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:49:05,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1595373.3333333333, ans=0.1 2023-10-04 08:49:06,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:49:11,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:49:12,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:49:13,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:49:15,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:49:15,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:49:15,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:49:17,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:49:18,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:49:21,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 08:49:21,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:49:21,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:49:22,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:49:22,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:49:23,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:49:26,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:49:26,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 08:49:27,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:49:30,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:49:30,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:49:33,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:49:39,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:49:42,116 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.009e+02 2.186e+02 2.461e+02 3.268e+02, threshold=4.371e+02, percent-clipped=0.0 2023-10-04 08:49:42,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:49:44,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.19 vs. limit=10.0 2023-10-04 08:49:46,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:49:48,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:49:52,410 INFO [train.py:1046] (1/4) Epoch 46, batch 300, loss[loss=0.1593, simple_loss=0.2479, pruned_loss=0.03529, over 24118.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.237, pruned_loss=0.03691, over 3688474.63 frames. 
], batch size: 80, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:49:52,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 08:49:53,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:49:53,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:49:55,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 08:49:55,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:49:56,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:49:56,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 08:49:57,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1595640.0, ans=0.1 2023-10-04 08:49:58,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1595640.0, ans=0.0 2023-10-04 08:50:01,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:50:01,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:50:04,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 08:50:04,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 08:50:06,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:50:07,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:50:09,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 08:50:09,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:50:13,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:50:14,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1595706.6666666667, ans=0.09899494936611666 2023-10-04 08:50:17,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:50:17,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 08:50:21,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 08:50:21,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:24,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 08:50:25,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:25,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 08:50:25,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:50:27,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1595773.3333333333, ans=0.125 2023-10-04 08:50:28,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:50:30,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:50:31,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:50:35,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:50:35,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 08:50:35,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:50:37,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:50:38,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1595840.0, ans=0.125 2023-10-04 08:50:39,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 08:50:41,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:50:45,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:50:46,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:50:46,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 08:50:49,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:49,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:50:52,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:55,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:50:56,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 08:50:56,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 08:50:57,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:50:59,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 08:50:59,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:59,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:01,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:51:01,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1595906.6666666667, ans=0.0 2023-10-04 08:51:02,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:02,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:07,711 INFO [train.py:1046] (1/4) Epoch 46, batch 350, loss[loss=0.1515, simple_loss=0.2242, pruned_loss=0.03938, over 23583.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2351, pruned_loss=0.03655, over 3915982.90 frames. ], batch size: 232, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:51:07,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:51:07,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-04 08:51:11,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:16,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:51:16,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1595973.3333333333, ans=0.125 2023-10-04 08:51:19,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1595973.3333333333, ans=0.125 2023-10-04 08:51:20,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:20,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:23,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 08:51:23,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:51:24,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 08:51:24,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1596040.0, ans=0.125 2023-10-04 08:51:27,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:51:27,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 08:51:27,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:51:30,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 08:51:33,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:51:33,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:51:35,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:51:37,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:51:39,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:51:39,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:51:39,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:39,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:51:41,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 08:51:41,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:46,171 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.29 vs. limit=15.0 2023-10-04 08:51:49,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:51:49,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:51:51,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:51:52,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:56,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 08:51:56,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:57,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1596173.3333333333, ans=0.125 2023-10-04 08:51:58,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1596173.3333333333, ans=0.1 2023-10-04 08:51:59,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:51:59,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:00,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:52:01,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 08:52:02,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:04,215 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 08:52:05,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1596240.0, ans=0.125 2023-10-04 08:52:06,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 08:52:06,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:09,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:52:09,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 08:52:11,457 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.680e+02 2.124e+02 2.443e+02 2.962e+02 4.613e+02, threshold=4.885e+02, percent-clipped=1.0 2023-10-04 08:52:13,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:52:13,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1596240.0, ans=0.0 2023-10-04 08:52:15,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:52:16,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:18,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:18,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:19,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:22,599 INFO [train.py:1046] (1/4) Epoch 46, batch 400, loss[loss=0.1631, simple_loss=0.251, pruned_loss=0.03762, over 24388.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2347, pruned_loss=0.03641, over 4096744.72 frames. ], batch size: 77, lr: 2.22e-03, grad_scale: 32.0 2023-10-04 08:52:22,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:52:25,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:52:26,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 08:52:27,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 08:52:27,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:28,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:52:28,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:31,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:31,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:33,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 08:52:35,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 08:52:35,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:37,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 08:52:38,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:41,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:52:41,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:52:41,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 08:52:43,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:52:43,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:43,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:52:43,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:47,199 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 08:52:47,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 08:52:48,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1596373.3333333333, ans=0.0 2023-10-04 08:52:51,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:52,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:52,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 08:52:55,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-10-04 08:52:58,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:53:00,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:06,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 08:53:08,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1596506.6666666667, ans=0.0 2023-10-04 08:53:09,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:53:10,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 08:53:12,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1596506.6666666667, ans=0.015 2023-10-04 08:53:14,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:53:14,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:53:14,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1596506.6666666667, ans=0.125 2023-10-04 08:53:16,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 08:53:19,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 08:53:22,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:53:23,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.50 vs. limit=15.0 2023-10-04 08:53:23,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:53:24,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1596573.3333333333, ans=0.0 2023-10-04 08:53:26,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:26,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 08:53:29,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:53:29,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 08:53:30,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:53:30,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:53:34,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 08:53:36,736 INFO [train.py:1046] (1/4) Epoch 46, batch 450, loss[loss=0.1502, simple_loss=0.2272, pruned_loss=0.03659, over 23789.00 frames. 
], tot_loss[loss=0.1543, simple_loss=0.2349, pruned_loss=0.03681, over 4235928.65 frames. ], batch size: 150, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:53:36,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:53:36,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:53:36,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:53:38,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 08:53:38,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:53:38,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1596640.0, ans=0.125 2023-10-04 08:53:39,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:53:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:53:39,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 08:53:41,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:53:42,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 08:53:43,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:53:51,853 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.84 vs. limit=12.0 2023-10-04 08:53:53,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:53,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:53:55,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 08:53:55,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1596706.6666666667, ans=0.1 2023-10-04 08:53:56,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 08:54:00,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:54:03,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:54:04,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:54:07,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1596773.3333333333, ans=0.125 2023-10-04 08:54:08,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:54:08,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:54:10,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 08:54:12,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 08:54:12,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 08:54:12,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:13,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:15,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:54:17,242 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 08:54:17,250 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 08:54:17,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 08:54:18,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:54:20,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:54:24,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:54:24,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:54:25,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 08:54:26,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 08:54:28,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:54:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:54:30,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:54:32,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 08:54:35,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:54:37,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-10-04 08:54:37,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 08:54:38,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:54:41,394 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.934e+02 2.148e+02 2.455e+02 3.795e+02, threshold=4.297e+02, percent-clipped=0.0 2023-10-04 08:54:44,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:54:44,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1596906.6666666667, ans=0.0 2023-10-04 08:54:45,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:54:46,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1596906.6666666667, ans=0.125 2023-10-04 08:54:47,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:54:47,699 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 08:54:48,317 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.00 vs. limit=15.0 2023-10-04 08:54:50,906 INFO [train.py:1046] (1/4) Epoch 46, batch 500, loss[loss=0.1477, simple_loss=0.2434, pruned_loss=0.02599, over 24585.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2349, pruned_loss=0.03661, over 4344694.56 frames. 
], batch size: 71, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:54:52,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:53,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:54:53,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:53,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 08:54:55,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 08:54:55,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:58,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:55:02,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:55:02,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:55:05,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:55:05,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:55:05,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 08:55:15,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:16,365 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.46 vs. limit=15.0 2023-10-04 08:55:17,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 08:55:17,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:55:17,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:19,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 08:55:19,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:55:22,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:55:23,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:55:23,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:55:23,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:24,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-04 08:55:26,792 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 08:55:28,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:30,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:32,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:32,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:33,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:55:33,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 08:55:37,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:55:39,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:55:43,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:55:46,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:51,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 08:55:55,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 08:55:55,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:55:56,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:59,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 08:56:00,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:56:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:56:04,262 INFO [train.py:1046] (1/4) Epoch 46, batch 550, loss[loss=0.1617, simple_loss=0.2454, pruned_loss=0.03905, over 24014.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2363, pruned_loss=0.03711, over 4422934.79 frames. ], batch size: 86, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:56:05,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 08:56:06,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 08:56:08,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:08,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-10-04 08:56:08,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:56:08,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:09,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:11,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:11,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:56:13,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:56:14,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:56:14,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 08:56:15,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:56:16,392 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.98 vs. limit=15.0 2023-10-04 08:56:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:56:20,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:23,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:56:25,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:26,948 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:56:28,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 08:56:29,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 08:56:30,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:56:36,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:56:36,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:56:37,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:56:40,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:40,557 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. 
Duration: 0.958 2023-10-04 08:56:40,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1597440.0, ans=0.5 2023-10-04 08:56:42,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:45,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 08:56:46,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:56:46,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1597506.6666666667, ans=0.1 2023-10-04 08:56:47,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:56:47,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:56:49,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:49,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 08:56:49,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 08:56:51,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:56:51,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 08:56:52,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:56:52,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:56,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:56:57,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:57:00,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:57:00,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:00,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:57:01,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:57:03,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:57:03,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:57:04,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:04,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 08:57:05,868 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:57:10,265 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.017e+02 2.263e+02 2.749e+02 3.801e+02, threshold=4.526e+02, percent-clipped=0.0 2023-10-04 08:57:10,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1597573.3333333333, ans=0.0 2023-10-04 08:57:11,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 08:57:14,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 08:57:16,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:57:16,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:57:16,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:57:17,313 INFO [train.py:1046] (1/4) Epoch 46, batch 600, loss[loss=0.1645, simple_loss=0.235, pruned_loss=0.04704, over 23784.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2364, pruned_loss=0.03752, over 4472166.97 frames. ], batch size: 164, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:57:23,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:57:27,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 08:57:28,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 08:57:31,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:57:32,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1597706.6666666667, ans=0.1 2023-10-04 08:57:33,341 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.72 vs. limit=10.0 2023-10-04 08:57:33,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:57:35,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:36,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1597706.6666666667, ans=0.125 2023-10-04 08:57:38,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 08:57:38,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:57:39,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1597706.6666666667, ans=0.0 2023-10-04 08:57:44,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-10-04 08:57:48,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:57:48,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:48,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:57:54,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:57:54,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:57:54,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:02,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:58:02,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1597840.0, ans=0.125 2023-10-04 08:58:06,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:06,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:58:06,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 08:58:08,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1597840.0, ans=0.125 2023-10-04 08:58:11,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1597840.0, ans=0.125 2023-10-04 08:58:12,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 08:58:19,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:58:19,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:58:21,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 08:58:22,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:58:24,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 08:58:24,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:58:25,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 08:58:26,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1597906.6666666667, ans=0.0 2023-10-04 08:58:29,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1597906.6666666667, ans=0.0 2023-10-04 08:58:30,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:58:31,715 INFO [train.py:1046] (1/4) Epoch 46, batch 650, loss[loss=0.1551, simple_loss=0.2415, pruned_loss=0.03435, over 23833.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2352, pruned_loss=0.03715, over 4526535.26 frames. ], batch size: 85, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:58:31,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:58:34,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:58:35,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:58:37,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:58:38,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1597973.3333333333, ans=0.2 2023-10-04 08:58:39,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.15 vs. limit=22.5 2023-10-04 08:58:40,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. 
Duration: 0.72 2023-10-04 08:58:41,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:44,152 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.62 vs. limit=15.0 2023-10-04 08:58:44,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:58:44,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:58:48,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:58:52,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 08:58:55,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:58:55,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:58:58,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:58:58,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-04 08:58:59,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1598040.0, ans=0.125 2023-10-04 08:59:00,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2.whitening_limit, batch_count=1598106.6666666667, ans=15.0 2023-10-04 08:59:01,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:02,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:02,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:59:04,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:05,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:59:08,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:59:08,448 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 08:59:08,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:08,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:59:12,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:59:13,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:59:15,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:15,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:59:17,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 08:59:18,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:59:18,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:59:19,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 08:59:19,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:59:20,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:59:22,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 08:59:24,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 08:59:24,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 08:59:24,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:59:26,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:59:26,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:59:27,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:59:30,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1598240.0, ans=0.025 2023-10-04 08:59:30,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.77 vs. limit=10.0 2023-10-04 08:59:32,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:32,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:59:34,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:35,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:36,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 08:59:37,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:38,311 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.015e+02 2.292e+02 2.659e+02 4.120e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 08:59:41,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1598240.0, ans=0.0 2023-10-04 08:59:43,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:59:43,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:59:43,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:59:43,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:59:45,589 INFO [train.py:1046] (1/4) Epoch 46, batch 700, loss[loss=0.1239, simple_loss=0.1821, pruned_loss=0.03285, over 19371.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2348, pruned_loss=0.0364, over 4578538.56 frames. ], batch size: 388, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:59:49,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 08:59:49,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 08:59:53,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. 
Duration: 0.54 2023-10-04 08:59:54,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:55,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:59:57,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 09:00:01,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:00:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:00:05,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:00:07,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:00:07,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:00:09,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1598373.3333333333, ans=0.125 2023-10-04 09:00:10,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:00:12,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 09:00:12,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 09:00:14,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 09:00:15,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1598440.0, ans=0.125 2023-10-04 09:00:17,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 09:00:20,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:00:21,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:00:23,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:00:26,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:00:26,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 09:00:31,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:00:31,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:00:31,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 09:00:35,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 09:00:35,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1598506.6666666667, ans=0.125 2023-10-04 09:00:36,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:00:39,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:00:44,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:00:44,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 09:00:47,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 09:00:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 09:00:47,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1598573.3333333333, ans=0.2 2023-10-04 09:00:49,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:00:51,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:00:53,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:00:53,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1598573.3333333333, ans=0.125 2023-10-04 09:00:53,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1598573.3333333333, ans=0.1 2023-10-04 09:00:55,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:00:55,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 09:00:59,920 INFO [train.py:1046] (1/4) Epoch 46, batch 750, loss[loss=0.1487, simple_loss=0.2299, pruned_loss=0.03371, over 24318.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03619, over 4607142.86 frames. ], batch size: 61, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 09:01:01,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 09:01:01,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 09:01:01,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 09:01:03,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 09:01:03,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 09:01:03,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:01:06,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 09:01:06,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:01:07,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:01:08,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1598640.0, ans=0.125 2023-10-04 09:01:10,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:11,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:01:11,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:01:11,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:01:14,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:01:16,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:01:17,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:01:20,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:01:20,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:01:22,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 09:01:23,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:01:24,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1598706.6666666667, ans=0.2 2023-10-04 09:01:25,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:01:27,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:01:28,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:01:30,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 09:01:30,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:01:32,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 09:01:32,856 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 09:01:34,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-10-04 09:01:34,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:01:34,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:01:36,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:01:43,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:01:43,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:01:43,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:01:45,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:45,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:01:47,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 09:01:47,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:01:48,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:01:50,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 09:01:53,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:01:53,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 09:01:53,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1598840.0, ans=0.125 2023-10-04 09:01:54,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:00,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:01,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:02:01,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:04,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:02:06,748 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.037e+02 2.258e+02 2.551e+02 3.884e+02, threshold=4.516e+02, percent-clipped=0.0 2023-10-04 09:02:08,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 09:02:08,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:02:10,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 09:02:12,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:12,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:14,217 INFO [train.py:1046] (1/4) Epoch 46, batch 800, loss[loss=0.1451, simple_loss=0.226, pruned_loss=0.03214, over 24569.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.234, pruned_loss=0.03586, over 4655135.04 frames. ], batch size: 60, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 09:02:15,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:02:23,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:23,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:24,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:02:25,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:27,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:27,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:02:31,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:35,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:35,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:02:35,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1599040.0, ans=0.1 2023-10-04 09:02:38,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 09:02:39,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:39,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:39,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1599040.0, ans=0.125 2023-10-04 09:02:41,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:02:41,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:02:41,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 09:02:42,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:02:42,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 09:02:45,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:46,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:49,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:49,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:02:51,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:51,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:54,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:02:55,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:02:56,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 09:02:56,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1599106.6666666667, ans=0.09899494936611666 2023-10-04 09:02:57,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-04 09:02:59,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 09:02:59,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:02:59,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:02,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:02,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:03:06,436 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 09:03:06,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 09:03:07,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:03:09,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:03:12,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:03:17,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:03:17,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. 
Duration: 0.91 2023-10-04 09:03:19,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:03:21,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1599240.0, ans=0.07 2023-10-04 09:03:22,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 09:03:28,260 INFO [train.py:1046] (1/4) Epoch 46, batch 850, loss[loss=0.1374, simple_loss=0.2217, pruned_loss=0.0266, over 24303.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2346, pruned_loss=0.03592, over 4671999.59 frames. ], batch size: 61, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 09:03:28,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:03:31,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:03:31,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 09:03:33,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:03:33,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:35,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 09:03:35,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:03:35,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:03:37,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:03:39,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.92 vs. limit=15.0 2023-10-04 09:03:39,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:03:41,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:03:42,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 09:03:42,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 09:03:42,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 09:03:47,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:03:47,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:03:47,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:03:48,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 09:03:48,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:03:52,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1599373.3333333333, ans=0.0 2023-10-04 09:03:53,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:53,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:03:53,850 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.92 vs. limit=15.0 2023-10-04 09:03:54,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 09:03:55,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 09:03:56,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1599440.0, ans=0.0 2023-10-04 09:03:57,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:59,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 09:04:02,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1599440.0, ans=0.0 2023-10-04 09:04:04,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. 
Duration: 0.62 2023-10-04 09:04:05,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 09:04:08,596 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 09:04:08,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:04:08,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:04:08,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:04:08,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1599440.0, ans=0.5 2023-10-04 09:04:10,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1599440.0, ans=0.125 2023-10-04 09:04:11,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:11,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:11,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 09:04:14,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:04:14,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:04:15,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:04:16,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:04:17,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:04:17,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1599506.6666666667, ans=0.125 2023-10-04 09:04:20,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:04:21,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 09:04:24,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:04:24,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:04:24,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:04:24,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:04:25,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:04:26,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1599573.3333333333, ans=0.0 2023-10-04 09:04:29,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:30,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:04:32,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:04:32,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:04:34,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:04:35,621 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 2.012e+02 2.248e+02 2.524e+02 3.712e+02, threshold=4.497e+02, percent-clipped=0.0 2023-10-04 09:04:42,570 INFO [train.py:1046] (1/4) Epoch 46, batch 900, loss[loss=0.1503, simple_loss=0.2266, pruned_loss=0.03701, over 24441.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2356, pruned_loss=0.0363, over 4682229.19 frames. ], batch size: 58, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:04:42,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:04:42,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:04:42,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. 
Duration: 0.87 2023-10-04 09:04:44,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:04:44,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:04:47,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 09:04:51,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:04:55,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:04:55,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 09:04:58,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:04:58,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 09:04:59,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:05:01,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:05:01,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:01,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 09:05:02,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:05:10,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:10,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:05:11,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:05:13,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:18,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 09:05:19,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:05:21,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1599773.3333333333, ans=0.125 2023-10-04 09:05:23,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:05:24,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:05:24,987 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. 
Duration: 0.958 2023-10-04 09:05:25,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1599840.0, ans=0.0 2023-10-04 09:05:26,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 09:05:32,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:05:32,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:05:32,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:05:37,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:37,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:05:40,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 09:05:41,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:43,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 09:05:44,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:05:46,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:05:47,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:05:47,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:05:51,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 09:05:51,879 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 09:05:53,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 09:05:53,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 09:05:55,831 INFO [train.py:1046] (1/4) Epoch 46, batch 950, loss[loss=0.1651, simple_loss=0.2476, pruned_loss=0.04132, over 24091.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2365, pruned_loss=0.03697, over 4680186.20 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:05:57,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:06:00,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1599973.3333333333, ans=0.1 2023-10-04 09:06:04,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 09:06:07,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:06:10,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:10,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:10,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:06:13,481 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 09:06:17,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1600040.0, ans=0.0 2023-10-04 09:06:18,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:18,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:06:18,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:19,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:06:19,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 09:06:19,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:06:21,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:06:22,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 09:06:23,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:06:27,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:27,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:06:28,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:06:29,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 09:06:31,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1600106.6666666667, ans=0.0 2023-10-04 09:06:32,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 09:06:33,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1600106.6666666667, ans=0.2 2023-10-04 09:06:34,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:06:35,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.79 vs. 
limit=6.0 2023-10-04 09:06:36,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:06:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:06:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:42,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 09:06:44,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 09:06:44,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:06:45,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:06:46,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:46,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:06:51,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 09:06:51,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 09:06:52,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1600173.3333333333, ans=0.0 2023-10-04 09:06:54,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:06:54,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:54,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 09:06:54,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:54,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:06:54,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1600173.3333333333, ans=0.1 2023-10-04 09:06:55,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 09:06:59,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:07:00,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1600240.0, ans=0.0 2023-10-04 09:07:02,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:07:05,366 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.058e+02 2.335e+02 2.951e+02 5.020e+02, threshold=4.671e+02, percent-clipped=3.0 2023-10-04 09:07:06,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:07:08,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 09:07:08,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 09:07:12,946 INFO [train.py:1046] (1/4) Epoch 46, batch 1000, loss[loss=0.1341, simple_loss=0.2002, pruned_loss=0.03406, over 23550.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2348, pruned_loss=0.0368, over 4665683.82 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:07:13,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:07:17,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 09:07:18,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:21,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:07:21,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 09:07:21,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. 
Duration: 0.93 2023-10-04 09:07:28,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:28,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:07:30,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:33,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 09:07:36,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 09:07:38,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 09:07:38,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:07:38,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 09:07:39,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 09:07:39,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 09:07:41,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:42,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 09:07:51,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:51,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:07:54,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:54,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:54,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 09:07:54,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:07:55,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:07:57,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:57,252 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 09:08:00,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 09:08:02,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 09:08:02,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. 
Duration: 0.82 2023-10-04 09:08:04,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:08:06,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1600506.6666666667, ans=0.1 2023-10-04 09:08:11,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:11,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:08:11,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:14,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:08:15,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 09:08:16,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:08:17,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 09:08:18,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 09:08:19,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:08:19,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:08:21,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1600573.3333333333, ans=0.1 2023-10-04 09:08:22,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:08:24,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:08:25,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:08:27,228 INFO [train.py:1046] (1/4) Epoch 46, batch 1050, loss[loss=0.1601, simple_loss=0.2354, pruned_loss=0.04237, over 23775.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2336, pruned_loss=0.0366, over 4680690.81 frames. ], batch size: 164, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:08:28,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:08:28,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:08:31,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 09:08:32,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:33,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:08:34,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 09:08:34,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1600640.0, ans=0.0 2023-10-04 09:08:34,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1600640.0, ans=0.125 2023-10-04 09:08:36,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:08:39,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:08:39,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:08:40,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:08:40,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:08:42,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 09:08:42,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1600706.6666666667, ans=0.0 2023-10-04 09:08:43,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:08:43,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 09:08:47,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:08:47,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 09:08:47,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:08:51,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:53,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:08:53,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:08:56,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 09:08:56,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 09:08:56,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:09:01,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 09:09:02,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 09:09:04,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:07,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-04 09:09:09,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1600773.3333333333, ans=0.125 2023-10-04 09:09:10,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:09:10,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:09:12,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:09:16,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:09:18,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 09:09:21,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 09:09:21,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 09:09:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:09:23,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:09:24,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 09:09:27,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 09:09:29,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:09:29,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:09:30,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:09:30,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:32,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1600906.6666666667, ans=0.125 2023-10-04 09:09:34,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1600906.6666666667, ans=0.0 2023-10-04 09:09:35,600 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.988e+02 2.157e+02 2.390e+02 3.023e+02, threshold=4.315e+02, percent-clipped=0.0 2023-10-04 09:09:35,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:37,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 09:09:38,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:09:38,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 09:09:39,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. 
Duration: 0.66 2023-10-04 09:09:39,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:09:41,067 INFO [train.py:1046] (1/4) Epoch 46, batch 1100, loss[loss=0.1605, simple_loss=0.2522, pruned_loss=0.03443, over 24458.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2332, pruned_loss=0.03612, over 4690121.55 frames. ], batch size: 66, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:09:43,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:09:48,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:09:52,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1600973.3333333333, ans=0.5 2023-10-04 09:09:54,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:09:55,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:09:55,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:09:55,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 09:09:57,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:00,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 09:10:02,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:10:06,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:10:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 09:10:08,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:10:08,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:10:08,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:10:10,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:10:12,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:10:16,696 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.78 vs. limit=15.0 2023-10-04 09:10:18,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:10:19,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 09:10:21,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. 
Duration: 0.896 2023-10-04 09:10:21,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:21,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1601106.6666666667, ans=0.1 2023-10-04 09:10:22,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:24,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:10:24,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:10:24,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1601173.3333333333, ans=0.0 2023-10-04 09:10:26,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 09:10:26,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:10:26,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:10:26,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:10:28,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:28,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. 
Duration: 0.77 2023-10-04 09:10:33,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1601173.3333333333, ans=0.125 2023-10-04 09:10:34,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:10:34,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 09:10:36,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:10:40,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:10:42,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 09:10:42,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:10:43,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:45,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1601240.0, ans=0.1 2023-10-04 09:10:46,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1601240.0, ans=0.125 2023-10-04 09:10:48,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:10:48,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:10:49,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 09:10:50,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:10:50,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:52,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 09:10:52,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:10:53,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 09:10:54,822 INFO [train.py:1046] (1/4) Epoch 46, batch 1150, loss[loss=0.1458, simple_loss=0.2322, pruned_loss=0.02964, over 24566.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2337, pruned_loss=0.0364, over 4697681.56 frames. ], batch size: 60, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:10:54,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:10:54,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:10:55,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:10:59,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:11:01,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:11:04,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:11:04,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:11:04,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 09:11:06,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:11:08,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 09:11:10,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:10,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1601373.3333333333, ans=0.1 2023-10-04 09:11:11,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:11:13,490 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-10-04 09:11:15,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-10-04 09:11:17,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:11:21,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:21,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:22,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 09:11:22,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:11:22,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:11:24,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1601440.0, ans=0.125 2023-10-04 09:11:25,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1601440.0, ans=0.125 2023-10-04 09:11:27,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 09:11:27,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:11:29,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 09:11:29,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1601440.0, ans=0.125 2023-10-04 09:11:40,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:45,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:45,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 09:11:46,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:11:46,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1601506.6666666667, ans=0.0 2023-10-04 09:11:47,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:11:52,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1601573.3333333333, ans=0.1 2023-10-04 09:11:53,497 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 09:11:54,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:00,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-10-04 09:12:01,746 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.018e+02 2.194e+02 2.403e+02 3.349e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-04 09:12:05,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:06,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:12:06,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:12:06,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:12:08,532 INFO [train.py:1046] (1/4) Epoch 46, batch 1200, loss[loss=0.1393, simple_loss=0.2096, pruned_loss=0.03454, over 24290.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03661, over 4701473.50 frames. ], batch size: 56, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:12:11,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:12:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:12:15,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:12:16,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:16,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:12:16,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:12:19,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:12:19,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1601640.0, ans=0.125 2023-10-04 09:12:21,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:12:21,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:12:22,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:24,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 09:12:24,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1601706.6666666667, ans=0.0 2023-10-04 09:12:27,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 09:12:30,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:12:32,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:12:34,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:12:37,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:12:37,850 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 09:12:37,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:41,638 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:12:45,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:12:45,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:12:45,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 09:12:45,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:12:48,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 09:12:52,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 09:12:52,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:53,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:12:54,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:12:55,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:12:55,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:55,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:12:57,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:12:57,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1601840.0, ans=0.0 2023-10-04 09:12:58,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 09:12:58,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:12:58,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:12:58,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:12:59,595 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.42 vs. limit=12.0 2023-10-04 09:13:01,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:13:01,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:13:04,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:13:07,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:13:09,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 09:13:09,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1601906.6666666667, ans=0.04949747468305833 2023-10-04 09:13:09,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1601906.6666666667, ans=0.05 2023-10-04 09:13:13,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 09:13:14,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1601906.6666666667, ans=0.0 2023-10-04 09:13:15,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:13:15,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1601906.6666666667, ans=0.2 2023-10-04 09:13:16,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:13:17,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 09:13:19,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:13:21,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 09:13:22,536 INFO [train.py:1046] (1/4) Epoch 46, batch 1250, loss[loss=0.1661, simple_loss=0.241, pruned_loss=0.04562, over 23537.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2346, pruned_loss=0.03705, over 4699409.69 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:13:25,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:13:27,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:13:27,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 09:13:29,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:13:30,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1601973.3333333333, ans=0.125 2023-10-04 09:13:31,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:13:35,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:13:35,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 09:13:36,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1602040.0, ans=0.1 2023-10-04 09:13:37,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:13:37,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:13:37,965 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.90 vs. limit=10.0 2023-10-04 09:13:40,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:13:40,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1602040.0, ans=0.125 2023-10-04 09:13:43,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1602040.0, ans=0.1 2023-10-04 09:13:44,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:13:44,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:13:44,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:13:46,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:13:47,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:13:51,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:13:52,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:13:55,006 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=15.0 2023-10-04 09:13:56,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 09:13:57,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:13:59,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:14:00,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 09:14:00,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:14:00,964 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 09:14:02,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:02,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:14:06,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:14:08,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1602173.3333333333, ans=0.125 2023-10-04 09:14:09,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:14:11,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:14:11,626 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:14:13,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 09:14:13,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 09:14:13,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 09:14:15,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:14:18,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 09:14:18,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:21,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-10-04 09:14:21,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:14:22,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 09:14:22,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:14:23,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:14:25,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:14:25,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:14:25,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1602240.0, ans=0.0 2023-10-04 09:14:26,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 09:14:28,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:14:29,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:14:31,292 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.091e+02 2.329e+02 2.674e+02 3.922e+02, threshold=4.659e+02, percent-clipped=0.0 2023-10-04 09:14:31,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 09:14:34,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:14:35,890 INFO [train.py:1046] (1/4) Epoch 46, batch 1300, loss[loss=0.141, simple_loss=0.2192, pruned_loss=0.03138, over 23383.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.235, pruned_loss=0.0368, over 4712297.37 frames. ], batch size: 134, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:14:36,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:14:36,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 09:14:40,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1602306.6666666667, ans=0.125 2023-10-04 09:14:41,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:14:43,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:14:45,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:14:46,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:46,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:14:48,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. 
Duration: 0.81 2023-10-04 09:14:54,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:14:55,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:14:58,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 09:14:59,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:15:03,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:04,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:15:04,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1602440.0, ans=0.04949747468305833 2023-10-04 09:15:05,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:15:07,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:08,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:15:09,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:15:10,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-10-04 09:15:16,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:15:16,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:15:18,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 09:15:18,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:15:19,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:15:21,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:15:22,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 09:15:23,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:15:23,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 09:15:25,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:15:29,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:15:29,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 09:15:32,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 09:15:32,796 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:15:33,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 09:15:34,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1602573.3333333333, ans=0.1 2023-10-04 09:15:34,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.29 vs. limit=15.0 2023-10-04 09:15:35,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 09:15:40,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:15:43,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 09:15:43,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1602573.3333333333, ans=0.0 2023-10-04 09:15:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:48,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1602640.0, ans=0.0 2023-10-04 09:15:50,505 INFO [train.py:1046] (1/4) Epoch 46, batch 1350, loss[loss=0.1463, simple_loss=0.2372, pruned_loss=0.02775, over 24437.00 frames. 
], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03658, over 4716901.58 frames. ], batch size: 69, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:15:52,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 09:15:54,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:15:55,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1602640.0, ans=0.0 2023-10-04 09:15:58,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:00,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:16:00,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:16:02,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:16:02,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:16:05,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:16:05,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 09:16:08,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 09:16:10,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:16:13,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 09:16:14,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:16:14,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:16:14,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 09:16:15,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 09:16:17,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 09:16:19,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:19,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 09:16:19,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1602773.3333333333, ans=0.125 2023-10-04 09:16:29,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1602773.3333333333, ans=0.2 2023-10-04 09:16:32,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:16:33,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1602840.0, ans=0.0 2023-10-04 09:16:40,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:40,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:16:41,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1602840.0, ans=0.2 2023-10-04 09:16:42,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 09:16:43,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:16:46,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 09:16:46,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:16:46,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:16:47,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:16:49,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1602906.6666666667, ans=0.125 2023-10-04 09:16:51,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-10-04 09:16:53,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:16:58,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1602906.6666666667, ans=0.1 2023-10-04 09:16:59,675 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.063e+02 2.245e+02 2.755e+02 4.029e+02, threshold=4.491e+02, percent-clipped=0.0 2023-10-04 09:16:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 09:17:01,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 09:17:02,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1602973.3333333333, ans=0.125 2023-10-04 09:17:03,915 INFO [train.py:1046] (1/4) Epoch 46, batch 1400, loss[loss=0.1616, simple_loss=0.2405, pruned_loss=0.04137, over 23594.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2331, pruned_loss=0.03646, over 4720640.22 frames. ], batch size: 106, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:17:05,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 09:17:08,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:17:11,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:17:11,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:17:17,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 09:17:17,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 09:17:26,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:17:29,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:17:29,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:17:30,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1603040.0, ans=0.1 2023-10-04 09:17:31,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:17:34,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:17:35,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 09:17:44,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:17:44,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:17:49,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. 
Duration: 0.8 2023-10-04 09:17:50,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:17:52,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:17:52,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:17:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:17:53,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:17:53,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:17:53,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:17:55,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 09:17:55,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:17:58,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:18:00,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1603173.3333333333, ans=0.125 2023-10-04 09:18:00,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=1603173.3333333333, ans=10.0 2023-10-04 09:18:00,631 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.99 vs. limit=15.0 2023-10-04 09:18:01,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:18:04,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1603240.0, ans=0.0 2023-10-04 09:18:10,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 09:18:11,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 09:18:12,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:18:14,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 09:18:15,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:18:16,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1603240.0, ans=0.2 2023-10-04 09:18:17,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:18:19,020 INFO [train.py:1046] (1/4) Epoch 46, batch 1450, loss[loss=0.1586, simple_loss=0.2441, pruned_loss=0.03659, over 23997.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2325, pruned_loss=0.03648, over 4719158.54 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:18:19,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:18:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:18:20,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:22,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 09:18:26,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:26,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:18:28,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:18:29,639 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.36 vs. 
limit=10.0 2023-10-04 09:18:29,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 09:18:31,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:18:32,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 09:18:32,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:32,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:32,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 09:18:34,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:18:34,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:18:35,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 09:18:35,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:37,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:18:39,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:18:39,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1603373.3333333333, ans=0.1 2023-10-04 09:18:40,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:43,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:18:43,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:18:43,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1603373.3333333333, ans=0.1 2023-10-04 09:18:43,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1603373.3333333333, ans=0.2 2023-10-04 09:18:44,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:45,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:48,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:48,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:18:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:18:49,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:18:52,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.45 vs. limit=15.0 2023-10-04 09:18:54,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 09:18:57,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:19:00,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 09:19:00,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:19:02,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:19:02,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:03,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 09:19:07,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.86 vs. limit=10.0 2023-10-04 09:19:08,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:09,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. 
Duration: 0.69 2023-10-04 09:19:10,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 09:19:14,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:15,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1603506.6666666667, ans=0.125 2023-10-04 09:19:19,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:19:19,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:19:22,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 09:19:25,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 09:19:25,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 09:19:27,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:28,537 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.040e+02 2.275e+02 2.758e+02 4.535e+02, threshold=4.550e+02, percent-clipped=1.0 2023-10-04 09:19:28,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:19:33,288 INFO [train.py:1046] (1/4) Epoch 46, batch 1500, loss[loss=0.1481, simple_loss=0.2277, pruned_loss=0.03427, over 23649.00 frames. 
], tot_loss[loss=0.1532, simple_loss=0.2332, pruned_loss=0.03663, over 4710538.77 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:19:37,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 09:19:37,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1603640.0, ans=0.125 2023-10-04 09:19:39,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:19:39,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:19:40,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:40,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:19:41,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:19:42,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 09:19:43,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:19:43,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:19:43,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 09:19:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:19:46,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:19:48,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:19:51,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:19:51,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 09:19:52,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:19:52,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:19:52,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:56,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1603706.6666666667, ans=0.125 2023-10-04 09:19:57,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 09:20:01,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 09:20:03,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:20:03,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 09:20:03,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1603773.3333333333, ans=0.125 2023-10-04 09:20:06,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:20:09,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:20:09,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:20:09,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:20:12,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 09:20:12,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:20:12,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:20:13,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 09:20:13,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:20:13,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1603773.3333333333, ans=0.125 2023-10-04 09:20:18,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:20:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 09:20:23,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:20:25,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:20:29,859 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 09:20:29,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:29,906 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 09:20:30,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:20:31,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:20:32,833 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 09:20:34,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 09:20:36,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 09:20:38,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:40,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:20:40,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:40,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:20:42,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:42,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:20:44,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 09:20:44,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 09:20:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:20:46,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 09:20:46,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-10-04 09:20:48,075 INFO [train.py:1046] (1/4) Epoch 46, batch 1550, loss[loss=0.1477, simple_loss=0.2214, pruned_loss=0.03697, over 23846.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2341, pruned_loss=0.03692, over 4707195.75 frames. ], batch size: 179, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:20:49,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:20:50,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:50,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:20:52,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:20:52,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1603973.3333333333, ans=0.125 2023-10-04 09:20:53,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:53,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:57,114 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 09:20:57,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:20:57,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 09:20:58,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:21:00,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:21:00,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 09:21:02,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:21:02,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 09:21:04,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 09:21:04,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 09:21:04,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:05,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:09,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:21:11,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 09:21:11,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-10-04 09:21:20,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:24,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:21:24,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:21:25,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:21:26,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 09:21:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:21:32,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:34,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1604173.3333333333, ans=0.0 2023-10-04 09:21:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:21:38,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:21:38,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:38,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. 
Duration: 0.83 2023-10-04 09:21:38,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:21:42,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:21:42,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:43,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 09:21:43,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 09:21:44,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:21:47,761 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:21:50,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 09:21:56,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:21:57,459 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.703e+02 2.083e+02 2.296e+02 2.597e+02 3.892e+02, threshold=4.592e+02, percent-clipped=0.0 2023-10-04 09:21:57,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:57,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. 
Duration: 0.42 2023-10-04 09:21:58,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:22:00,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:22:00,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:22:00,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:22:00,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:22:02,098 INFO [train.py:1046] (1/4) Epoch 46, batch 1600, loss[loss=0.1733, simple_loss=0.2542, pruned_loss=0.04623, over 24388.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.03715, over 4718405.50 frames. ], batch size: 77, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:22:04,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:04,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 09:22:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 09:22:06,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 09:22:09,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:22:11,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 09:22:12,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:22:15,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:22:18,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:22:21,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 09:22:23,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1604373.3333333333, ans=0.125 2023-10-04 09:22:25,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:22:25,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 09:22:25,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:27,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 09:22:31,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-04 09:22:35,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1604440.0, ans=0.2 2023-10-04 09:22:37,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:22:39,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 09:22:40,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:22:41,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:22:41,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:22:44,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 09:22:45,622 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:22:48,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 09:22:51,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:22:51,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:52,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 09:22:52,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:22:55,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:22:55,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:22:58,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:23:03,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:23:03,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:23:07,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 09:23:07,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:23:07,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 09:23:13,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:23:14,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 09:23:14,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1604640.0, ans=0.1 2023-10-04 09:23:15,856 INFO [train.py:1046] (1/4) Epoch 46, batch 1650, loss[loss=0.1558, simple_loss=0.2332, pruned_loss=0.03922, over 18405.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2355, pruned_loss=0.0374, over 4714103.53 frames. ], batch size: 40, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:23:15,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:23:15,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 09:23:15,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 09:23:15,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 09:23:17,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 09:23:21,497 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.45 vs. limit=6.0 2023-10-04 09:23:22,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:23:22,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:23:22,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 09:23:22,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:23:23,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:23:26,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 09:23:29,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:23:29,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:23:29,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:23:29,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:23:29,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 09:23:30,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 09:23:34,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:23:38,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 09:23:43,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1604706.6666666667, ans=0.2 2023-10-04 09:23:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 09:23:46,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:23:48,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 09:23:51,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:23:54,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:23:54,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:23:54,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:23:55,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:23:55,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:00,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:00,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:24:00,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:24:01,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:24:01,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:01,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:24:04,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:24:07,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 09:24:08,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:24:09,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 09:24:11,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 09:24:11,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 09:24:11,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:13,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 09:24:13,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:24:13,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:13,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 09:24:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:24:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:24:19,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:24:22,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 09:24:25,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.077e+02 2.252e+02 2.648e+02 5.011e+02, threshold=4.504e+02, percent-clipped=3.0 2023-10-04 09:24:26,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:24:26,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:24:26,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 09:24:26,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 09:24:26,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:24:26,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:29,298 INFO [train.py:1046] (1/4) Epoch 46, batch 1700, loss[loss=0.1452, simple_loss=0.2278, pruned_loss=0.03128, over 24481.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2338, pruned_loss=0.03719, over 4695636.08 frames. ], batch size: 63, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:24:29,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:24:29,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:24:29,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 09:24:30,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:24:33,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1604973.3333333333, ans=0.2 2023-10-04 09:24:38,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:41,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:24:48,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 09:24:48,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:24:49,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:24:49,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:24:52,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 09:24:53,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:24:55,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:55,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1605040.0, ans=0.2 2023-10-04 09:24:56,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:24:57,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:24:58,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1605106.6666666667, ans=0.05 2023-10-04 09:24:59,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 09:25:00,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-10-04 09:25:03,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:04,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 09:25:06,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:25:09,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1605106.6666666667, ans=0.0 2023-10-04 09:25:14,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:16,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:16,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:25:19,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:25:19,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 09:25:19,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:25:22,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:22,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. 
Duration: 0.91 2023-10-04 09:25:22,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:25:22,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:25:22,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:22,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:25:24,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1605173.3333333333, ans=0.125 2023-10-04 09:25:25,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:25:25,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:25:27,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:28,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:25:28,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:32,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:25:33,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 09:25:35,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:36,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:25:36,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1605240.0, ans=0.0 2023-10-04 09:25:39,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 09:25:41,873 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.51 vs. limit=15.0 2023-10-04 09:25:42,512 INFO [train.py:1046] (1/4) Epoch 46, batch 1750, loss[loss=0.1436, simple_loss=0.2217, pruned_loss=0.03275, over 24327.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.232, pruned_loss=0.03682, over 4688028.37 frames. ], batch size: 56, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:25:47,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:50,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:25:50,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:25:52,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. 
Duration: 0.44 2023-10-04 09:25:52,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:54,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:25:54,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:57,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 09:25:59,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:01,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 09:26:01,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:26:02,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:26:05,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:26:08,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 09:26:09,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:26:09,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-10-04 09:26:17,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:26:22,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:26:22,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:26:24,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:24,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:26:26,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:26:27,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:28,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1605506.6666666667, ans=0.0 2023-10-04 09:26:30,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:26:31,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:26:32,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 09:26:34,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 09:26:36,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 09:26:37,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:26:39,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:40,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:26:43,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:26:44,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 09:26:45,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:46,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1605573.3333333333, ans=0.125 2023-10-04 09:26:47,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:26:53,139 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.053e+02 2.343e+02 2.960e+02 5.357e+02, threshold=4.686e+02, percent-clipped=4.0 2023-10-04 09:26:53,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:26:55,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:26:56,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:26:57,941 INFO [train.py:1046] (1/4) Epoch 46, batch 1800, loss[loss=0.1517, simple_loss=0.2268, pruned_loss=0.03832, over 23860.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2318, pruned_loss=0.03666, over 4687708.93 frames. ], batch size: 195, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:26:58,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 09:26:58,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:26:58,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:26:58,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:26:58,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:26:59,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:26:59,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:27:02,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 09:27:02,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:27:02,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1605640.0, ans=10.0 2023-10-04 09:27:04,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:27:06,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:27:09,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:27:09,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:27:12,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.67 vs. limit=15.0 2023-10-04 09:27:12,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:27:15,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:15,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:17,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 09:27:20,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:27:20,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 09:27:20,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:23,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:28,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 09:27:30,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 09:27:30,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 09:27:31,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:27:31,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:31,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:27:32,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:27:38,095 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. 
Duration: 0.896 2023-10-04 09:27:39,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:27:40,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1605773.3333333333, ans=0.125 2023-10-04 09:27:41,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:42,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 09:27:43,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 09:27:44,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:27:45,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:27:45,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:27:51,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 09:27:58,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:27:58,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 09:27:59,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:27:59,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:00,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:28:00,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 09:28:03,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1605906.6666666667, ans=0.0 2023-10-04 09:28:05,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:28:05,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:28:06,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 09:28:06,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:09,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:28:10,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:28:10,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 09:28:10,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1605973.3333333333, ans=0.125 2023-10-04 09:28:11,941 INFO [train.py:1046] (1/4) Epoch 46, batch 1850, loss[loss=0.1444, simple_loss=0.2275, pruned_loss=0.03058, over 24342.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2325, pruned_loss=0.03667, over 4692637.45 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:28:12,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:28:12,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:28:14,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:28:14,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:28:15,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:28:16,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:28:23,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:28:23,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 09:28:27,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. 
Duration: 0.82 2023-10-04 09:28:30,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 09:28:33,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:28:33,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 09:28:33,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 09:28:43,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:28:46,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 09:28:50,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:28:50,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:28:55,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 09:28:55,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:55,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:28:58,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 09:29:00,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:29:01,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:29:04,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:29:06,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:06,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:29:06,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:07,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:29:08,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:29:11,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 09:29:13,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:29:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:29:16,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 09:29:16,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 09:29:16,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 09:29:18,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1606240.0, ans=10.0 2023-10-04 09:29:19,086 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 09:29:19,157 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 09:29:20,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:29:20,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:29:20,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:29:20,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:20,733 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 09:29:21,808 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 1.971e+02 2.198e+02 2.495e+02 3.601e+02, threshold=4.397e+02, percent-clipped=0.0 2023-10-04 09:29:21,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 09:29:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:23,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:29:23,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:29:26,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:29:26,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 09:29:26,231 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:29:27,249 INFO [train.py:1046] (1/4) Epoch 46, batch 1900, loss[loss=0.191, simple_loss=0.2646, pruned_loss=0.05877, over 19488.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2337, pruned_loss=0.03688, over 4705220.56 frames. ], batch size: 388, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:29:28,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:28,703 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 09:29:28,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:29:30,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:29:32,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1606306.6666666667, ans=0.2 2023-10-04 09:29:34,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:36,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:29:36,253 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 09:29:37,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 09:29:38,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:29:39,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:29:39,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 09:29:39,038 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 09:29:43,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 09:29:43,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1606373.3333333333, ans=0.09899494936611666 2023-10-04 09:29:44,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 09:29:49,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 09:29:52,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 09:30:00,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1606440.0, ans=0.125 2023-10-04 09:30:03,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 09:30:04,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 09:30:04,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:06,065 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 09:30:06,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 09:30:06,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 09:30:07,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 09:30:07,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:30:11,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. 
Duration: 0.97 2023-10-04 09:30:13,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:30:16,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:30:16,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 09:30:17,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:30:23,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 09:30:23,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:30:24,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1606506.6666666667, ans=0.0 2023-10-04 09:30:29,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:30:29,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:30:29,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:30:29,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:30:29,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1606573.3333333333, ans=0.0 2023-10-04 09:30:31,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:30:32,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:30:32,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:30:35,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:30:35,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:30:38,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:30:38,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:30:39,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:30:41,085 INFO [train.py:1046] (1/4) Epoch 46, batch 1950, loss[loss=0.1633, simple_loss=0.2408, pruned_loss=0.0429, over 23743.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.0374, over 4690878.37 frames. ], batch size: 212, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:30:41,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:30:43,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:30:45,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:30:45,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:47,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:30:48,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 09:30:50,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:30:51,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:51,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:54,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:30:56,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:30:56,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:30:57,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:30:59,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1606706.6666666667, ans=0.125 2023-10-04 09:31:00,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:31:00,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:31:00,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:31:01,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:05,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:08,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:31:08,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:08,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:31:08,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 09:31:08,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:31:09,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 09:31:09,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:13,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:16,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:31:18,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:31:21,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:31:21,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:31:22,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 09:31:22,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:31:27,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:31:28,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:31:30,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:31:38,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:31:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:41,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1606906.6666666667, ans=0.125 2023-10-04 09:31:42,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:43,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:46,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:31:46,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:46,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 09:31:46,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:31:47,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:49,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. 
Duration: 0.75 2023-10-04 09:31:49,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1606906.6666666667, ans=0.2 2023-10-04 09:31:50,519 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.065e+02 2.286e+02 2.732e+02 4.457e+02, threshold=4.573e+02, percent-clipped=1.0 2023-10-04 09:31:50,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:31:55,473 INFO [train.py:1046] (1/4) Epoch 46, batch 2000, loss[loss=0.1539, simple_loss=0.2417, pruned_loss=0.03308, over 24471.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.235, pruned_loss=0.03686, over 4709873.96 frames. ], batch size: 69, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:31:55,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:31:55,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:31:56,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:31:58,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:31:59,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:02,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 09:32:02,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 09:32:06,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:32:07,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 09:32:08,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:32:10,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:32:11,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:32:12,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 09:32:14,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 09:32:18,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:32:19,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. 
Duration: 0.93 2023-10-04 09:32:19,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:32:23,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:32:25,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:32:25,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:25,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:32:26,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:32:27,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 09:32:30,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 09:32:30,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:32:30,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:34,565 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.27 vs. limit=15.0 2023-10-04 09:32:38,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:32:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:32:39,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:32:40,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1607173.3333333333, ans=22.5 2023-10-04 09:32:41,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:32:42,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:32:42,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:42,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:32:42,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:44,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:48,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:32:48,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-10-04 09:32:49,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1607173.3333333333, ans=0.0 2023-10-04 09:32:52,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:32:53,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:59,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:59,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:33:00,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=15.0 2023-10-04 09:33:02,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:04,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:33:04,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:04,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:33:05,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 09:33:08,411 INFO [train.py:1046] (1/4) Epoch 46, batch 2050, loss[loss=0.1524, simple_loss=0.2335, pruned_loss=0.03569, over 23411.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2344, pruned_loss=0.03679, over 4711805.10 frames. ], batch size: 119, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:33:08,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:10,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:12,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:33:14,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:18,070 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.96 vs. limit=15.0 2023-10-04 09:33:18,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:33:18,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1607306.6666666667, ans=0.125 2023-10-04 09:33:20,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:33:20,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:33:21,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:33:22,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 09:33:22,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:33:24,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:33:25,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:33:35,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1607373.3333333333, ans=0.2 2023-10-04 09:33:37,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:33:37,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:38,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1607440.0, ans=0.1 2023-10-04 09:33:39,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 09:33:41,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:43,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. 
Duration: 0.82 2023-10-04 09:33:43,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:33:44,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:33:46,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:33:47,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:33:47,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:33:50,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:33:52,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:33:52,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:33:54,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:33:55,960 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:33:59,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:33:59,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:34:03,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:34:08,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:34:09,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 09:34:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:34:16,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:34:18,172 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.010e+02 2.178e+02 2.574e+02 3.822e+02, threshold=4.355e+02, percent-clipped=0.0 2023-10-04 09:34:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:34:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 09:34:22,381 INFO [train.py:1046] (1/4) Epoch 46, batch 2100, loss[loss=0.1597, simple_loss=0.2289, pruned_loss=0.04531, over 23735.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2333, pruned_loss=0.03672, over 4697978.96 frames. ], batch size: 164, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:34:23,828 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 09:34:23,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:34:25,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:34:25,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:34:26,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:34:26,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 09:34:27,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 09:34:27,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1607640.0, ans=0.125 2023-10-04 09:34:28,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:34:30,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1607640.0, ans=0.125 2023-10-04 09:34:31,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:34:33,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:34:35,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:36,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:34:36,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 09:34:37,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:34:37,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 09:34:37,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 09:34:40,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:34:40,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:34:40,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 09:34:42,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 09:34:46,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 09:34:46,627 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:34:49,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:34:50,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 09:34:53,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:34:54,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 09:34:54,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:34:54,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 09:34:57,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 09:34:57,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:57,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 09:34:57,669 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 09:34:57,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 09:35:00,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:35:02,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:35:05,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 09:35:06,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:35:07,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:10,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:10,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 09:35:10,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:10,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:12,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:12,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 09:35:13,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 09:35:13,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 09:35:14,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1607840.0, ans=0.125 2023-10-04 09:35:18,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 09:35:20,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:35:20,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 09:35:26,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:29,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:35:29,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:35:29,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:35:29,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 09:35:31,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:35:32,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:32,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:35:32,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1607906.6666666667, ans=0.0 2023-10-04 09:35:33,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 09:35:33,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:35,165 INFO [train.py:1046] (1/4) Epoch 46, batch 2150, loss[loss=0.152, simple_loss=0.2397, pruned_loss=0.03215, over 23779.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2331, pruned_loss=0.03622, over 4714172.91 frames. ], batch size: 85, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:35:35,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 09:35:36,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 09:35:36,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:35:39,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:39,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:35:40,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:35:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:35:45,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:35:48,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:35:49,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:51,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:35:51,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:35:51,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:35:54,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:54,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:35:54,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:35:56,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.13 vs. limit=6.0 2023-10-04 09:35:58,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:35:58,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 09:36:03,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:36:05,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-04 09:36:06,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:36:07,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:07,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:08,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:08,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:36:08,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:36:09,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:36:10,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:36:10,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 09:36:13,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 09:36:14,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:14,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:15,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:36:17,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:36:19,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:19,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:36:21,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:21,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 09:36:21,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:36:23,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:25,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:26,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:36:26,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:36:27,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:29,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:29,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 09:36:31,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 09:36:31,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:36:33,283 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 09:36:33,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:34,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:36:34,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 09:36:34,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:36:34,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-10-04 09:36:36,201 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 09:36:36,201 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 09:36:36,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 09:36:37,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:38,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:36:39,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:36:39,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:40,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:36:40,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:40,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:48,294 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.972e+02 2.248e+02 2.510e+02 3.914e+02, threshold=4.495e+02, percent-clipped=0.0 2023-10-04 09:36:48,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 09:36:48,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 09:36:49,744 INFO [train.py:1046] (1/4) Epoch 46, batch 2200, loss[loss=0.1435, simple_loss=0.2331, pruned_loss=0.02694, over 24660.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2327, pruned_loss=0.03629, over 4693960.77 frames. ], batch size: 73, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:36:52,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:36:55,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:56,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:36:56,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:58,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:37:00,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:37:02,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:37:02,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 09:37:06,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-10-04 09:37:06,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1608373.3333333333, ans=0.07 2023-10-04 09:37:08,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:37:09,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1608373.3333333333, ans=0.125 2023-10-04 09:37:09,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1608373.3333333333, ans=0.0 2023-10-04 09:37:15,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 09:37:18,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:37:19,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:37:19,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:37:25,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:37:25,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 09:37:25,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1608440.0, ans=0.125 2023-10-04 09:37:29,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 09:37:30,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:37:31,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 09:37:33,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:37:35,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:37:36,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:37:36,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1608506.6666666667, ans=0.125 2023-10-04 09:37:39,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:40,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 09:37:40,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:42,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 09:37:45,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:45,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 09:37:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:47,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:37:47,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:37:47,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:47,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:48,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:37:49,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1608573.3333333333, ans=0.125 2023-10-04 09:37:50,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:37:51,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:37:54,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:37:54,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:37:55,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:37:57,363 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 09:38:00,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:38:00,107 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 09:38:01,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:38:01,979 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 09:38:02,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1608640.0, ans=0.1 2023-10-04 09:38:03,652 INFO [train.py:1046] (1/4) Epoch 46, batch 2250, loss[loss=0.1684, simple_loss=0.2585, pruned_loss=0.03919, over 24576.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2336, pruned_loss=0.03669, over 4694401.47 frames. ], batch size: 71, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:38:03,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:04,285 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=22.5 2023-10-04 09:38:05,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. 
Duration: 0.42725 2023-10-04 09:38:06,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:07,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 09:38:10,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:38:13,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:38:13,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1608640.0, ans=0.0 2023-10-04 09:38:18,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:38:19,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:38:23,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:23,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:38:25,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:38:25,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1608706.6666666667, ans=0.1 2023-10-04 09:38:26,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. 
Duration: 0.93 2023-10-04 09:38:26,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:38:26,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:38:28,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 09:38:29,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:38:29,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:32,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:38:37,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:38:38,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:38:38,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:38:40,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 09:38:41,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:43,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:38:48,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.19 vs. limit=22.5 2023-10-04 09:38:49,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:38:50,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:38:52,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:52,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:38:53,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:38:55,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:38:56,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:38:59,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:39:05,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:39:05,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:39:05,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 09:39:06,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1608906.6666666667, ans=0.0 2023-10-04 09:39:12,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:39:15,080 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.142e+02 2.478e+02 2.835e+02 4.262e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-04 09:39:15,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:39:15,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 09:39:15,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:15,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:39:17,284 INFO [train.py:1046] (1/4) Epoch 46, batch 2300, loss[loss=0.1495, simple_loss=0.235, pruned_loss=0.03199, over 24316.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.0371, over 4702787.51 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:39:18,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 09:39:21,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:39:21,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 09:39:27,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:27,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:39:28,704 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 09:39:30,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:39:36,784 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.28 vs. limit=6.0 2023-10-04 09:39:37,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:39:37,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:39:38,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:39:38,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:39:38,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 09:39:40,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:39:43,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 09:39:44,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:39:48,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:39:52,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:39:54,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:39:58,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:39:59,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:40:01,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:40:04,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:40:07,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:40:08,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:40:08,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:40:09,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. 
Duration: 0.47 2023-10-04 09:40:10,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1609173.3333333333, ans=0.125 2023-10-04 09:40:12,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:40:12,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:12,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:12,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:40:13,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:40:13,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 09:40:13,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:40:15,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 09:40:15,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:40:15,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:15,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-10-04 09:40:23,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:40:27,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:40:31,426 INFO [train.py:1046] (1/4) Epoch 46, batch 2350, loss[loss=0.1474, simple_loss=0.2121, pruned_loss=0.0414, over 22530.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03756, over 4704930.66 frames. ], batch size: 322, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:40:31,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:40:31,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:40:31,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:40:33,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:40:33,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:40:34,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:40:35,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-10-04 09:40:40,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1609306.6666666667, ans=0.125 2023-10-04 09:40:42,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:40:42,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 09:40:45,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1609373.3333333333, ans=0.0 2023-10-04 09:40:46,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 09:40:48,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:51,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:51,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:51,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:40:51,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:40:52,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 09:40:56,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:41:01,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1609440.0, ans=0.2 2023-10-04 09:41:02,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 09:41:02,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:41:07,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:41:07,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:41:09,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:41:10,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 09:41:11,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:41:12,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:41:12,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:41:12,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:41:17,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 09:41:20,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 09:41:20,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:41:23,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:41:23,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:41:24,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 09:41:26,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:41:27,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 09:41:27,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:41:32,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 09:41:35,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 09:41:35,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:41:35,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 09:41:35,376 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 09:41:35,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 09:41:39,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 09:41:42,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:41:43,487 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.219e+02 2.483e+02 2.970e+02 4.725e+02, threshold=4.966e+02, percent-clipped=0.0 2023-10-04 09:41:44,913 INFO [train.py:1046] (1/4) Epoch 46, batch 2400, loss[loss=0.1394, simple_loss=0.2035, pruned_loss=0.0376, over 22776.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2344, pruned_loss=0.03727, over 4701512.64 frames. ], batch size: 322, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:41:46,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:41:49,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:41:50,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:41:50,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 09:41:50,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-10-04 09:41:56,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1609640.0, ans=0.125 2023-10-04 09:41:57,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:41:57,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:41:59,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 09:41:59,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:42:00,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:00,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 09:42:02,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1609706.6666666667, ans=0.0 2023-10-04 09:42:04,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1609706.6666666667, ans=0.0 2023-10-04 09:42:06,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:08,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-04 09:42:08,608 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=15.0 2023-10-04 09:42:14,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:42:20,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 09:42:22,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:42:22,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:26,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:42:27,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 09:42:27,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:42:33,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:34,230 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:42:36,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:42:37,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1609840.0, ans=0.125 2023-10-04 09:42:37,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0 2023-10-04 09:42:38,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:42:39,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:42:39,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:42:39,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:42:39,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:39,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:42:40,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:42:42,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:42:44,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 09:42:44,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 09:42:44,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1609906.6666666667, ans=0.2 2023-10-04 09:42:46,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 09:42:47,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:42:49,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:49,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 09:42:51,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 09:42:51,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 09:42:51,745 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 09:42:53,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 09:42:54,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:42:55,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:42:55,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:42:56,053 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 09:42:57,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:57,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:43:00,353 INFO [train.py:1046] (1/4) Epoch 46, batch 2450, loss[loss=0.1395, simple_loss=0.2154, pruned_loss=0.03175, over 24606.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2331, pruned_loss=0.03693, over 4695733.79 frames. ], batch size: 60, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:43:00,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:43:00,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:43:00,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1609973.3333333333, ans=0.0 2023-10-04 09:43:04,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-10-04 09:43:05,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:05,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:43:05,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 09:43:06,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1609973.3333333333, ans=0.0 2023-10-04 09:43:08,660 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.55 vs. limit=15.0 2023-10-04 09:43:09,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:43:09,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:12,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:43:12,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:43:12,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:43:14,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 09:43:18,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:21,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:43:21,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:43:24,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:43:25,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:25,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:27,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:43:28,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 09:43:29,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:43:36,313 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:43:37,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:37,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:37,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:43:37,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:43:38,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:43:39,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:43:40,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 09:43:43,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1610173.3333333333, ans=0.0 2023-10-04 09:43:44,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:44,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:43:48,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:43:48,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:43:54,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:43:54,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 09:43:55,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:43:55,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:43:55,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. 
Duration: 0.87 2023-10-04 09:43:57,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:43:57,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:44:01,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:44:03,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:44:04,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:44:07,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 09:44:08,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:44:13,038 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.042e+02 2.322e+02 2.681e+02 4.445e+02, threshold=4.643e+02, percent-clipped=0.0 2023-10-04 09:44:14,565 INFO [train.py:1046] (1/4) Epoch 46, batch 2500, loss[loss=0.1407, simple_loss=0.2206, pruned_loss=0.03042, over 20968.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2322, pruned_loss=0.03673, over 4692113.46 frames. ], batch size: 45, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:44:14,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:44:22,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1610306.6666666667, ans=0.125 2023-10-04 09:44:25,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:44:26,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:44:27,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:44:27,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 09:44:31,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1610373.3333333333, ans=0.025 2023-10-04 09:44:34,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:44:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:44:37,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:44:37,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 09:44:38,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. 
Duration: 0.84 2023-10-04 09:44:39,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:39,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:44:39,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 09:44:40,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 09:44:40,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:44:46,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:44:48,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:44:48,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1610440.0, ans=0.125 2023-10-04 09:44:50,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:44:50,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 09:44:52,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:44:54,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:58,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:01,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:02,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-10-04 09:45:04,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:45:10,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:45:11,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 09:45:13,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:45:13,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:45:14,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:45:14,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:45:16,020 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. 
Duration: 0.896 2023-10-04 09:45:16,021 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 09:45:16,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 09:45:19,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:45:20,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 09:45:20,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 09:45:22,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:45:23,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 09:45:26,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 09:45:28,210 INFO [train.py:1046] (1/4) Epoch 46, batch 2550, loss[loss=0.1384, simple_loss=0.2282, pruned_loss=0.02432, over 24469.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2328, pruned_loss=0.03642, over 4708744.22 frames. ], batch size: 63, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:45:28,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:45:29,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:45:29,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:45:31,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:45:32,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 09:45:32,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:45:36,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 09:45:38,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:45:40,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:42,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:45:42,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 09:45:44,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:45:44,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:45:45,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:45:47,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:45:47,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 09:45:48,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:45:48,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:48,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 09:45:51,019 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.47 vs. limit=22.5 2023-10-04 09:45:55,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-10-04 09:45:56,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1610773.3333333333, ans=0.0 2023-10-04 09:45:59,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:46:02,686 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.97 vs. limit=15.0 2023-10-04 09:46:04,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:46:06,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:06,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:46:06,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:46:12,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:46:15,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:46:15,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:46:15,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:46:15,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:46:15,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:46:18,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:18,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 09:46:24,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:46:25,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 09:46:25,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:46:27,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:29,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:46:30,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:46:31,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:46:33,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1610906.6666666667, ans=0.125 2023-10-04 09:46:39,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:46:40,242 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 1.961e+02 2.101e+02 2.394e+02 3.747e+02, threshold=4.203e+02, percent-clipped=0.0 2023-10-04 09:46:40,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:46:41,749 INFO [train.py:1046] (1/4) Epoch 46, batch 2600, loss[loss=0.1467, simple_loss=0.2273, pruned_loss=0.03303, over 24368.00 frames. 
], tot_loss[loss=0.1538, simple_loss=0.2342, pruned_loss=0.03673, over 4711745.57 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:46:43,255 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 09:46:47,837 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 09:46:47,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:46:47,905 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 09:46:47,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 09:46:47,996 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 09:46:51,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:51,387 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 09:46:51,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 09:46:52,781 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 09:46:55,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:46:56,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 09:46:59,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. 
Duration: 0.77 2023-10-04 09:47:01,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:47:01,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 09:47:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 09:47:04,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 09:47:12,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:12,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:47:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 09:47:14,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:47:18,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-10-04 09:47:19,704 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 09:47:23,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:47:24,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:24,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 09:47:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:47:25,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:47:26,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1611173.3333333333, ans=0.125 2023-10-04 09:47:27,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 09:47:29,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1611173.3333333333, ans=0.125 2023-10-04 09:47:30,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:47:30,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:47:32,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:47:35,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 09:47:36,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:47:36,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:47:42,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:47:42,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:47:42,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 09:47:43,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:45,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:47:47,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:47:51,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 09:47:51,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:54,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:47:56,007 INFO [train.py:1046] (1/4) Epoch 46, batch 2650, loss[loss=0.1493, simple_loss=0.2367, pruned_loss=0.03093, over 23994.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03658, over 4712575.33 frames. 
], batch size: 80, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:47:57,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 09:47:57,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:59,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:48:00,674 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 09:48:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:04,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:48:06,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:48:07,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:48:09,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:48:10,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 09:48:10,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:48:12,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 09:48:13,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1611373.3333333333, ans=0.0 2023-10-04 09:48:14,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 09:48:15,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1611373.3333333333, ans=0.125 2023-10-04 09:48:16,789 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 09:48:19,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:48:23,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 09:48:23,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:23,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 09:48:28,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:28,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:48:28,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:29,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:48:34,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 09:48:34,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 09:48:34,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1611440.0, ans=0.0 2023-10-04 09:48:35,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:48:40,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 09:48:40,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1611506.6666666667, ans=0.015 2023-10-04 09:48:42,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:42,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:43,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:48:43,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:43,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:48:44,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 09:48:46,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:48:48,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:48:48,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:48:49,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:48:49,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1611506.6666666667, ans=0.125 2023-10-04 09:48:50,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:52,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:48:52,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:53,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:48:54,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:48:58,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:58,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 09:48:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:59,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 09:49:01,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:49:03,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:03,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:03,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:04,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:49:04,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:04,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1611573.3333333333, ans=0.0 2023-10-04 09:49:07,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:49:07,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. 
Duration: 0.84 2023-10-04 09:49:09,084 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.031e+02 2.301e+02 2.644e+02 3.771e+02, threshold=4.602e+02, percent-clipped=0.0 2023-10-04 09:49:10,572 INFO [train.py:1046] (1/4) Epoch 46, batch 2700, loss[loss=0.2081, simple_loss=0.2803, pruned_loss=0.06796, over 19460.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2351, pruned_loss=0.0371, over 4696858.50 frames. ], batch size: 389, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:49:10,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:49:10,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1611640.0, ans=0.2 2023-10-04 09:49:12,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 09:49:14,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:49:14,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:14,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:15,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:49:15,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:49:16,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 09:49:16,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:49:16,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 09:49:18,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:49:19,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1611640.0, ans=0.1 2023-10-04 09:49:20,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:49:22,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:49:22,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:25,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:49:27,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 09:49:27,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1611706.6666666667, ans=0.2 2023-10-04 09:49:28,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:49:33,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 09:49:33,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:49:35,058 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.30 vs. limit=15.0 2023-10-04 09:49:35,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1611706.6666666667, ans=0.125 2023-10-04 09:49:35,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1611706.6666666667, ans=0.125 2023-10-04 09:49:40,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:49:40,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:49:40,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:49:40,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:49:44,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:49:47,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:49:47,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 09:49:47,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:49:52,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:52,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:49:58,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1611840.0, ans=0.1 2023-10-04 09:50:00,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:50:00,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:50:03,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:50:03,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:07,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:50:08,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:09,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 09:50:11,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:13,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:50:13,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:50:16,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:50:16,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:50:16,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:50:20,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 09:50:20,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:23,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:50:24,548 INFO [train.py:1046] (1/4) Epoch 46, batch 2750, loss[loss=0.1448, simple_loss=0.2314, pruned_loss=0.0291, over 24456.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03728, over 4702270.11 frames. ], batch size: 63, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:50:24,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-10-04 09:50:26,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 09:50:26,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:26,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1611973.3333333333, ans=0.125 2023-10-04 09:50:31,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:31,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:32,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:33,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:50:33,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:36,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:50:36,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:50:36,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:50:36,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:50:36,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 09:50:36,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:50:38,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:43,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 09:50:45,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:50:46,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:48,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:50:48,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:50:49,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:51,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:50:52,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:52,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:50:55,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:50:56,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:50:57,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:50:57,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:59,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:51:04,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:51:07,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:51:07,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:11,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:51:11,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:51:12,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 09:51:18,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:51:18,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:51:18,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 09:51:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:25,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 09:51:29,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:51:31,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:51:31,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 09:51:32,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:51:33,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:51:33,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 09:51:35,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 09:51:37,874 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 1.982e+02 2.271e+02 2.856e+02 5.103e+02, threshold=4.543e+02, percent-clipped=1.0 2023-10-04 09:51:37,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 09:51:38,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:51:39,235 INFO [train.py:1046] (1/4) Epoch 46, batch 2800, loss[loss=0.1525, simple_loss=0.2284, pruned_loss=0.03836, over 23779.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03676, over 4699889.95 frames. ], batch size: 164, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:51:39,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:51:39,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 09:51:39,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:51:39,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:43,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:51:44,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 09:51:44,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. 
Duration: 0.759 2023-10-04 09:51:45,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1612306.6666666667, ans=0.0 2023-10-04 09:51:47,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:49,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:51:49,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:51:50,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1612306.6666666667, ans=0.1 2023-10-04 09:51:50,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1612306.6666666667, ans=0.07 2023-10-04 09:51:53,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:51:53,632 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:51:54,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 09:51:56,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:51:58,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 09:51:58,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 09:51:59,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:51:59,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:01,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:02,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:52:02,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:52:02,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:52:09,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:52:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:52:12,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:14,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:52:14,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 09:52:19,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1612440.0, ans=0.125 2023-10-04 09:52:21,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:52:21,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 09:52:21,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:52:22,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:22,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:52:26,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:52:26,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:30,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:52:31,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1612506.6666666667, ans=0.125 2023-10-04 09:52:32,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 09:52:32,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:32,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:52:33,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:52:33,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:52:35,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:52:36,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 09:52:36,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:52:37,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:52:39,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:52:41,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 09:52:42,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:42,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 09:52:42,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:52:45,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 09:52:51,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:53,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:52:53,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:52:53,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1612640.0, ans=0.0 2023-10-04 09:52:54,414 INFO [train.py:1046] (1/4) Epoch 46, batch 2850, loss[loss=0.1462, simple_loss=0.2246, pruned_loss=0.03393, over 23728.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2323, pruned_loss=0.03634, over 4693503.66 frames. ], batch size: 135, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:52:54,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:52:56,223 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:52:59,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:52:59,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:53:00,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:53:01,189 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-10-04 09:53:03,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:03,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:53:03,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1612640.0, ans=0.09899494936611666 2023-10-04 09:53:06,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:53:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 09:53:13,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 09:53:13,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:15,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 09:53:16,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:17,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. 
Duration: 0.86 2023-10-04 09:53:19,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 09:53:21,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:21,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1612773.3333333333, ans=0.0 2023-10-04 09:53:28,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=22.5 2023-10-04 09:53:33,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:33,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:53:33,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:53:34,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:53:34,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:53:35,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:53:37,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:53:37,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. 
Duration: 0.69 2023-10-04 09:53:38,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:53:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:53:38,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:40,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:43,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:43,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:44,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:47,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:53:48,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:53:48,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:50,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:53,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 09:53:59,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:53:59,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 09:54:00,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 09:54:02,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:54:02,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:04,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 09:54:04,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:54:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:05,600 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:05,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:54:05,624 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 09:54:05,657 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. 
Duration: 0.896 2023-10-04 09:54:05,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:54:05,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1612906.6666666667, ans=0.0 2023-10-04 09:54:06,879 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.728e+02 2.024e+02 2.306e+02 2.714e+02 5.189e+02, threshold=4.613e+02, percent-clipped=2.0 2023-10-04 09:54:07,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:08,219 INFO [train.py:1046] (1/4) Epoch 46, batch 2900, loss[loss=0.1614, simple_loss=0.2534, pruned_loss=0.03471, over 24060.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2327, pruned_loss=0.03622, over 4708040.84 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:54:11,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:54:11,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:11,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:54:12,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 09:54:16,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1612973.3333333333, ans=0.2 2023-10-04 09:54:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 09:54:17,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 09:54:17,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1612973.3333333333, ans=0.125 2023-10-04 09:54:18,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1612973.3333333333, ans=0.0 2023-10-04 09:54:19,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 09:54:19,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:54:19,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:54:22,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:54:22,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:54:27,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:54:27,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:54:31,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:54:31,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-04 09:54:33,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:54:33,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:36,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 09:54:36,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 09:54:39,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:39,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 09:54:39,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:54:41,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:54:41,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:54:43,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:54:45,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:49,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 09:54:52,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:54:52,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten.whitening_limit, batch_count=1613173.3333333333, ans=22.5 2023-10-04 09:54:53,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 09:54:53,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 09:54:53,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:54:58,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:55:00,235 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.06 vs. limit=12.0 2023-10-04 09:55:00,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 09:55:02,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:55:08,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:55:16,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:55:16,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 09:55:16,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 09:55:19,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:19,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 09:55:20,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:55:21,300 INFO [train.py:1046] (1/4) Epoch 46, batch 2950, loss[loss=0.1589, simple_loss=0.244, pruned_loss=0.03696, over 23990.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2336, pruned_loss=0.03678, over 4712687.58 frames. ], batch size: 86, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:55:21,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:55:26,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:55:28,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 09:55:30,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:55:30,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:33,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 09:55:33,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:55:33,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1613306.6666666667, ans=0.035 2023-10-04 09:55:34,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 09:55:35,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 09:55:36,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:55:38,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:55:43,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:55:44,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:55:46,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:55:46,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:55:49,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:55:49,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 09:55:51,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:52,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:52,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:55:54,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1613440.0, ans=0.07 2023-10-04 09:55:55,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 09:55:58,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.05 vs. limit=10.0 2023-10-04 09:55:59,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 09:55:59,505 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 09:56:01,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:56:04,091 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 09:56:04,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 09:56:05,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 09:56:05,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:56:05,674 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 09:56:05,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:56:08,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 09:56:09,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:56:09,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:56:11,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:56:13,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:56:13,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 09:56:14,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:56:15,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-10-04 09:56:20,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:21,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:56:22,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 09:56:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:56:22,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1613573.3333333333, ans=0.1 2023-10-04 09:56:24,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 09:56:27,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:56:28,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:56:28,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:56:32,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:32,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 09:56:32,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 09:56:33,947 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.976e+02 2.238e+02 2.495e+02 4.043e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-04 09:56:34,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:34,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:56:35,348 INFO [train.py:1046] (1/4) Epoch 46, batch 3000, loss[loss=0.1554, simple_loss=0.2352, pruned_loss=0.03779, over 24627.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.03678, over 4722371.34 frames. ], batch size: 65, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:56:35,349 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 09:56:47,861 INFO [train.py:1078] (1/4) Epoch 46, validation: loss=0.3542, simple_loss=0.2819, pruned_loss=0.2132, over 1125622.00 frames. 2023-10-04 09:56:47,862 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 09:56:47,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:56:49,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:56:50,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:56:52,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:52,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. 
Duration: 0.9 2023-10-04 09:56:52,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:56,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:56:56,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:56:59,170 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 09:57:00,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 09:57:02,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:57:03,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:57:03,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 09:57:03,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:57:10,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:57:20,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 09:57:22,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1613773.3333333333, ans=0.2 2023-10-04 09:57:26,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 09:57:27,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:57:28,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.68 vs. limit=15.0 2023-10-04 09:57:28,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:57:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:57:29,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:57:31,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:57:31,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 09:57:32,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1613840.0, ans=0.05 2023-10-04 09:57:33,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-10-04 09:57:35,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:57:37,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:57:39,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:57:40,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:57:40,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:40,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:57:44,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:57:44,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:57:44,732 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:57:46,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1613906.6666666667, ans=0.0 2023-10-04 09:57:47,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:57:48,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. 
Duration: 0.71 2023-10-04 09:57:49,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:57:50,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:57:50,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:57:54,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:54,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:56,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:57:56,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 09:57:56,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:57:56,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 09:57:57,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:57:57,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 09:58:00,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 09:58:02,149 INFO [train.py:1046] (1/4) Epoch 46, batch 3050, loss[loss=0.2241, simple_loss=0.2875, pruned_loss=0.0803, over 19403.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2354, pruned_loss=0.0369, over 4717817.29 frames. ], batch size: 389, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:58:02,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 09:58:02,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 09:58:04,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 09:58:04,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:58:04,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:58:05,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:58:05,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:58:07,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:07,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:58:10,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. 
Duration: 0.81 2023-10-04 09:58:10,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:58:13,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:13,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:58:17,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:20,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 09:58:24,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 09:58:24,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1614040.0, ans=0.0 2023-10-04 09:58:26,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 09:58:26,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:58:26,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1614040.0, ans=0.2 2023-10-04 09:58:28,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:58:32,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:58:32,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:32,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:35,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:58:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:58:36,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:58:37,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:37,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:40,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:58:43,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:58:44,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 09:58:44,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 09:58:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:58:47,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:58:47,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:58:48,032 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.81 vs. limit=15.0 2023-10-04 09:58:48,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:58:48,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:58:53,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:53,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:58:53,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.28 vs. limit=15.0 2023-10-04 09:58:59,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:01,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:59:01,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:59:02,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:59:02,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:59:03,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:59:04,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 09:59:05,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:59:05,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:06,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 09:59:11,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:59:13,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.21 vs. limit=15.0 2023-10-04 09:59:15,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 09:59:16,571 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.988e+02 2.298e+02 2.757e+02 4.239e+02, threshold=4.595e+02, percent-clipped=0.0 2023-10-04 09:59:16,597 INFO [train.py:1046] (1/4) Epoch 46, batch 3100, loss[loss=0.1371, simple_loss=0.2086, pruned_loss=0.03281, over 19420.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2351, pruned_loss=0.037, over 4711667.02 frames. ], batch size: 42, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 09:59:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:59:19,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:59:19,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1614306.6666666667, ans=0.125 2023-10-04 09:59:20,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 09:59:24,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 09:59:25,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.90 vs. limit=15.0 2023-10-04 09:59:25,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 09:59:28,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 09:59:28,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1614306.6666666667, ans=0.1 2023-10-04 09:59:31,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:59:31,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:33,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:59:37,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:39,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1614373.3333333333, ans=0.125 2023-10-04 09:59:40,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1614373.3333333333, ans=0.125 2023-10-04 09:59:41,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 09:59:41,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1614373.3333333333, ans=0.125 2023-10-04 09:59:46,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 09:59:47,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 09:59:47,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:59:47,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:59:49,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 09:59:51,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:59:51,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 09:59:51,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:59:53,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:55,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 09:59:55,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1614440.0, ans=0.0 2023-10-04 09:59:56,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:59:58,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1614440.0, ans=0.125 2023-10-04 09:59:59,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 09:59:59,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 10:00:01,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 10:00:02,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:02,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:00:05,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:05,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:06,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:00:07,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:00:07,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:00:08,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:00:08,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:00:08,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:00:08,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:00:11,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:00:11,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 10:00:14,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:00:15,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 10:00:16,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:16,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:16,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 10:00:23,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1614573.3333333333, ans=0.125 2023-10-04 10:00:27,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 10:00:28,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.75 vs. limit=22.5 2023-10-04 10:00:29,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:00:31,001 INFO [train.py:1046] (1/4) Epoch 46, batch 3150, loss[loss=0.1482, simple_loss=0.2172, pruned_loss=0.03958, over 22786.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2342, pruned_loss=0.03689, over 4709854.65 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:00:31,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:34,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:00:34,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:00:34,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 10:00:34,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1614640.0, ans=0.0 2023-10-04 10:00:34,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1614640.0, ans=0.0 2023-10-04 10:00:35,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:35,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:00:38,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 10:00:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:00:41,604 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 10:00:46,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 10:00:46,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:00:47,568 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 10:00:47,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:00:50,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 10:00:51,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 10:00:51,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 10:00:51,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:51,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:00:53,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:53,718 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.03 vs. 
limit=15.0 2023-10-04 10:00:54,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 10:00:56,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:56,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:57,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:00:58,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1614706.6666666667, ans=0.0 2023-10-04 10:01:00,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:01:02,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 10:01:02,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1614773.3333333333, ans=0.1 2023-10-04 10:01:04,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:01:07,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:01:08,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:01:08,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-10-04 10:01:11,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 10:01:11,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:01:12,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:01:12,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:01:13,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:01:13,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:01:15,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:01:15,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:01:15,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 10:01:16,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:01:16,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:18,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 10:01:18,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:01:20,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 10:01:20,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:21,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 10:01:21,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:21,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1614840.0, ans=0.0 2023-10-04 10:01:22,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 10:01:24,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 10:01:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:01:25,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:27,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 10:01:28,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-04 10:01:28,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:01:32,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:01:33,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:33,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:01:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:01:39,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:41,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 10:01:43,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1614906.6666666667, ans=15.0 2023-10-04 10:01:44,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1614973.3333333333, ans=0.0 2023-10-04 10:01:45,122 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.157e+02 2.549e+02 3.186e+02 5.490e+02, threshold=5.099e+02, percent-clipped=3.0 2023-10-04 10:01:45,149 INFO [train.py:1046] (1/4) Epoch 46, batch 3200, loss[loss=0.1606, simple_loss=0.2482, pruned_loss=0.03648, over 24334.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2335, pruned_loss=0.03636, over 4719955.32 frames. 
], batch size: 77, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:01:46,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:01:46,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 10:01:51,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:52,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:01:52,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 10:01:53,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.74 vs. limit=6.0 2023-10-04 10:01:53,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:58,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:02:01,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:02:10,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 10:02:10,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1615040.0, ans=0.07 2023-10-04 10:02:17,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1615106.6666666667, ans=0.2 2023-10-04 10:02:20,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 10:02:22,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:02:24,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 10:02:25,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:02:28,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:02:28,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:02:29,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:02:34,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 10:02:37,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 10:02:38,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. 
Duration: 0.82 2023-10-04 10:02:40,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 10:02:43,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:02:47,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:02:47,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:02:49,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:02:50,522 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 10:02:50,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:02:53,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:02:54,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1615240.0, ans=0.0 2023-10-04 10:02:54,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1615240.0, ans=0.125 2023-10-04 10:02:55,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 10:02:55,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-10-04 10:02:56,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 10:02:58,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 10:02:59,765 INFO [train.py:1046] (1/4) Epoch 46, batch 3250, loss[loss=0.1466, simple_loss=0.2224, pruned_loss=0.03536, over 23610.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03624, over 4720596.68 frames. ], batch size: 256, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:03:01,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:03:04,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:03:04,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 10:03:04,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:04,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:05,858 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 10:03:06,713 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.29 vs. limit=15.0 2023-10-04 10:03:09,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 10:03:13,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:03:19,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:03:19,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 10:03:20,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:03:21,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:03:21,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:03:22,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:03:22,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:03:25,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:26,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:03:27,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:27,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:03:27,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:27,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:03:28,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:30,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:03:32,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:32,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:34,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:35,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:03:35,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:03:41,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 10:03:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:03:41,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 10:03:42,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:03:43,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1615506.6666666667, ans=0.2 2023-10-04 10:03:44,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:03:47,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1615506.6666666667, ans=0.07 2023-10-04 10:03:49,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:03:55,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1615506.6666666667, ans=0.125 2023-10-04 10:03:56,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:03:56,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:56,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 10:03:56,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1615573.3333333333, ans=0.1 2023-10-04 10:03:58,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 10:03:58,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 10:03:58,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:01,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 10:04:01,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 10:04:03,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:04:04,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:05,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:04:07,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 10:04:07,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:04:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:04:09,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:04:09,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1615573.3333333333, ans=0.125 2023-10-04 10:04:10,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 10:04:10,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:13,238 INFO [train.py:1046] (1/4) Epoch 46, batch 3300, loss[loss=0.1667, simple_loss=0.245, pruned_loss=0.04418, over 23308.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2346, pruned_loss=0.03644, over 4719236.44 frames. ], batch size: 105, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:04:13,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:04:13,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 10:04:14,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1615640.0, ans=0.125 2023-10-04 10:04:15,086 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.072e+02 2.407e+02 3.224e+02 5.952e+02, threshold=4.814e+02, percent-clipped=1.0 2023-10-04 10:04:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:04:17,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 10:04:19,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. 
Duration: 0.84 2023-10-04 10:04:19,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 10:04:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:23,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:04:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:04:24,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:26,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:04:26,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:04:28,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:29,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:04:34,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 10:04:34,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:04:34,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:04:36,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:37,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 10:04:38,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:04:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:04:40,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:04:40,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:04:40,098 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 10:04:42,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:42,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:04:46,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:46,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 10:04:46,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. 
Duration: 0.336375 2023-10-04 10:04:46,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:48,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:04:51,515 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 10:04:52,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 10:04:52,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:04:55,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 10:04:59,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:04:59,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1615840.0, ans=0.0 2023-10-04 10:05:02,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:05:02,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:05:04,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:05,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:05:05,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:05:06,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:05:06,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:05:06,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:05:07,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.46 vs. limit=15.0 2023-10-04 10:05:08,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:05:11,101 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 10:05:12,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 10:05:15,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:05:16,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:05:16,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:17,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:05:17,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:19,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:05:21,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:21,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:05:22,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:05:23,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:05:24,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1615906.6666666667, ans=0.0 2023-10-04 10:05:25,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 10:05:25,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:26,616 INFO [train.py:1046] (1/4) Epoch 46, batch 3350, loss[loss=0.1394, simple_loss=0.2171, pruned_loss=0.03087, over 24445.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2352, pruned_loss=0.03665, over 4738415.00 frames. ], batch size: 58, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:05:26,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:05:28,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:05:29,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:05:29,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:30,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:30,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:34,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:05:37,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:37,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:05:40,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:05:43,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:44,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 10:05:46,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 10:05:46,144 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 10:05:46,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:49,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 10:05:50,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 10:05:52,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:05:52,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:05:52,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1616040.0, ans=0.125 2023-10-04 10:05:53,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:05:54,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 10:05:54,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:54,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 10:05:57,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:59,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:59,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:00,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:06:03,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:06,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:09,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:06:10,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:13,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:13,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:06:15,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:18,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 10:06:18,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:06:18,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 10:06:18,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:06:20,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 10:06:20,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1616173.3333333333, ans=0.05 2023-10-04 10:06:21,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:22,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:29,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 10:06:31,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 10:06:31,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=1616240.0, ans=0.1 2023-10-04 10:06:33,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:06:33,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:06:37,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:06:39,987 INFO [train.py:1046] (1/4) Epoch 46, batch 3400, loss[loss=0.1475, simple_loss=0.2327, pruned_loss=0.03116, over 23352.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2366, pruned_loss=0.03743, over 4722550.46 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:06:40,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 10:06:40,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:06:40,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:06:41,416 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.680e+02 2.075e+02 2.237e+02 2.507e+02 4.047e+02, threshold=4.473e+02, percent-clipped=0.0 2023-10-04 10:06:41,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:41,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. 
Duration: 0.94 2023-10-04 10:06:43,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:43,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 10:06:44,073 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.25 vs. limit=15.0 2023-10-04 10:06:46,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:06:46,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:06:46,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:06:47,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:06:47,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 10:06:48,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1616306.6666666667, ans=0.0 2023-10-04 10:06:49,778 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:06:51,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1616306.6666666667, ans=0.125 2023-10-04 10:06:52,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. 
Duration: 0.6 2023-10-04 10:06:52,387 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 10:06:52,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:55,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:06:55,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:06:56,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:06:58,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:07:03,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:07:05,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 10:07:10,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:07:11,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:07:11,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:07:13,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 10:07:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:07:23,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 10:07:28,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:07:29,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:07:30,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 10:07:30,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:07:30,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:07:30,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:07:30,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1616506.6666666667, ans=0.125 2023-10-04 10:07:31,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:07:36,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:07:38,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 10:07:39,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:07:45,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:07:46,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 10:07:47,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1616573.3333333333, ans=0.1 2023-10-04 10:07:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:07:54,247 INFO [train.py:1046] (1/4) Epoch 46, batch 3450, loss[loss=0.1366, simple_loss=0.2004, pruned_loss=0.03641, over 22796.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2364, pruned_loss=0.03714, over 4722379.31 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:07:54,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 10:07:57,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 10:07:57,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:08:00,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:08:00,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-10-04 10:08:01,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:08:04,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:08:09,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:08:11,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:08:12,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:08:12,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:15,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:19,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 10:08:25,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 10:08:27,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:08:27,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:08:27,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:08:31,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 10:08:32,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:08:34,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1616773.3333333333, ans=0.04949747468305833 2023-10-04 10:08:37,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:08:39,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:08:39,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:08:40,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:08:42,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 10:08:43,019 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.50 vs. limit=15.0 2023-10-04 10:08:43,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:08:44,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:08:46,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:08:50,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 10:08:53,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:08:57,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.34 vs. limit=12.0 2023-10-04 10:08:59,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:09:00,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:01,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:06,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:06,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:09:06,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:09:06,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:09:07,947 INFO [train.py:1046] (1/4) Epoch 46, batch 3500, loss[loss=0.1595, simple_loss=0.2497, pruned_loss=0.03472, over 24341.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2355, pruned_loss=0.03661, over 4729709.34 frames. ], batch size: 77, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:09:10,152 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.11 vs. limit=22.5 2023-10-04 10:09:11,221 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.024e+02 2.165e+02 2.480e+02 4.406e+02, threshold=4.331e+02, percent-clipped=0.0 2023-10-04 10:09:11,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:14,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:09:15,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 10:09:16,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:09:20,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:09:23,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:23,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 10:09:27,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 10:09:28,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:09:29,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:09:29,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:09:31,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:09:31,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:31,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1617040.0, ans=0.0 2023-10-04 10:09:32,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:09:32,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 10:09:35,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:35,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:09:37,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:09:40,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:09:42,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 10:09:42,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:09:43,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1617106.6666666667, ans=0.1 2023-10-04 10:09:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:09:45,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1617106.6666666667, ans=0.125 2023-10-04 10:09:46,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:09:47,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:49,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:09:49,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:09:52,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 10:09:53,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 10:09:53,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-10-04 10:09:53,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:09:55,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:56,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:09:56,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:09:59,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:09:59,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:10:00,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1617173.3333333333, ans=0.0 2023-10-04 10:10:01,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1617173.3333333333, ans=0.2 2023-10-04 10:10:05,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:10:07,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 10:10:07,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 10:10:07,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 10:10:08,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:10:08,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:10:10,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:10:13,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 10:10:13,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:10:15,212 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:10:16,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 10:10:19,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 10:10:20,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:10:21,785 INFO [train.py:1046] (1/4) Epoch 46, batch 3550, loss[loss=0.1542, simple_loss=0.241, pruned_loss=0.03369, over 24651.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2335, pruned_loss=0.03638, over 4710048.29 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:10:21,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:10:21,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:22,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1617306.6666666667, ans=0.0 2023-10-04 10:10:23,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:25,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:10:35,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:37,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 10:10:39,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:10:39,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:10:41,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:41,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:10:43,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 10:10:45,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1617373.3333333333, ans=0.125 2023-10-04 10:10:46,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:47,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:10:48,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-04 10:10:48,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:48,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:10:49,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:10:54,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:10:55,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:57,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:10:57,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:10:57,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:10:57,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 10:10:57,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:59,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:11:00,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 10:11:05,210 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:11:06,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:06,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:11:08,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 10:11:10,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:11:11,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. 
Duration: 0.86 2023-10-04 10:11:12,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:11:15,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:11:15,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:11:18,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 10:11:18,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:11:24,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:11:24,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 10:11:25,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:30,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:11:30,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 10:11:36,581 INFO [train.py:1046] (1/4) Epoch 46, batch 3600, loss[loss=0.1491, simple_loss=0.2396, pruned_loss=0.02928, over 24475.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2334, pruned_loss=0.03623, over 4716842.19 frames. 
], batch size: 66, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:11:36,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 10:11:36,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:11:38,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:11:39,477 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.969e+02 2.172e+02 2.569e+02 3.736e+02, threshold=4.344e+02, percent-clipped=0.0 2023-10-04 10:11:40,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:40,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:44,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:11:46,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:11:49,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:49,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:11:50,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 10:11:50,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:50,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 10:11:53,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:11:54,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:58,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:12:00,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:12:01,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:12:03,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:12:03,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 10:12:05,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:12:06,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:12:07,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 10:12:09,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:10,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:12:12,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:12:14,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 10:12:20,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:12:21,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:12:21,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 10:12:25,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:12:31,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:34,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:39,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1617906.6666666667, ans=0.2 2023-10-04 10:12:40,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 10:12:40,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:12:40,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 10:12:42,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 10:12:42,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 10:12:44,700 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.59 vs. limit=10.0 2023-10-04 10:12:45,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:12:45,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:12:48,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 10:12:48,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:12:49,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:12:49,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:12:49,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. 
Duration: 0.87 2023-10-04 10:12:50,888 INFO [train.py:1046] (1/4) Epoch 46, batch 3650, loss[loss=0.1488, simple_loss=0.2234, pruned_loss=0.03713, over 23765.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.234, pruned_loss=0.03644, over 4726649.10 frames. ], batch size: 164, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:12:50,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 10:12:54,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:55,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 10:12:59,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 10:13:01,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:13:03,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1617973.3333333333, ans=0.0 2023-10-04 10:13:04,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 10:13:04,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1618040.0, ans=0.125 2023-10-04 10:13:06,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1618040.0, ans=0.125 2023-10-04 10:13:07,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. 
Duration: 0.97 2023-10-04 10:13:11,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:13:11,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:13:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:13:15,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:13:15,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:13:16,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 10:13:17,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:13:17,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:13:18,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 10:13:19,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:13:20,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:13:20,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:13:23,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:13:24,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 10:13:26,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 10:13:28,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:13:29,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 10:13:30,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:13:32,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:13:36,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:13:38,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:38,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:13:40,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:13:41,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 10:13:42,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:13:46,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:13:47,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:13:47,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:13:50,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:13:50,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:51,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:13:51,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1618240.0, ans=0.2 2023-10-04 10:13:56,933 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 10:14:01,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:14:01,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:03,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 10:14:03,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:04,399 INFO [train.py:1046] (1/4) Epoch 46, batch 3700, loss[loss=0.153, simple_loss=0.2305, pruned_loss=0.03777, over 23848.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2342, pruned_loss=0.03654, over 4718163.08 frames. ], batch size: 179, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:14:04,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:14:06,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:06,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 10:14:06,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:07,881 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.030e+02 2.313e+02 2.831e+02 4.310e+02, threshold=4.627e+02, percent-clipped=0.0 2023-10-04 10:14:08,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:14:10,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:14:11,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:14:14,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:14:14,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 10:14:14,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:15,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:14:17,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:14:20,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:14:23,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:14:23,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:24,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:14:24,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:25,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:14:28,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:30,320 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-10-04 10:14:33,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1618440.0, ans=0.0 2023-10-04 10:14:34,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:14:35,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:14:37,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:14:37,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 10:14:37,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:14:41,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:42,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 10:14:43,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:45,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:14:46,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:46,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 10:14:50,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:14:50,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1618506.6666666667, ans=0.125 2023-10-04 10:14:54,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:14:54,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 10:14:54,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:55,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 10:14:57,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1618506.6666666667, ans=0.125 2023-10-04 10:15:00,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:15:01,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:15:04,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:04,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 10:15:08,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 10:15:08,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:15:08,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:15:08,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:12,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:15:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 10:15:13,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1618573.3333333333, ans=0.0 2023-10-04 10:15:14,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 10:15:14,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:15:14,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:15,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:15:17,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:15:18,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:15:19,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.15 vs. limit=12.0 2023-10-04 10:15:20,345 INFO [train.py:1046] (1/4) Epoch 46, batch 3750, loss[loss=0.152, simple_loss=0.2422, pruned_loss=0.03092, over 24668.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2349, pruned_loss=0.03649, over 4724835.65 frames. ], batch size: 73, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:15:20,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:15:20,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:15:23,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 10:15:23,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 10:15:27,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:15:27,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 10:15:29,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:15:29,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:31,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:15:32,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:15:35,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:15:36,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1618706.6666666667, ans=0.1 2023-10-04 10:15:36,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1618706.6666666667, ans=0.04949747468305833 2023-10-04 10:15:40,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:15:41,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:15:42,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1618706.6666666667, ans=15.0 2023-10-04 10:15:43,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:47,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:15:47,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 10:15:48,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 10:15:48,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:15:48,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:15:53,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 10:15:56,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 10:15:59,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:15:59,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:16:00,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:05,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:05,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 10:16:09,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 10:16:13,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:16:13,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1618840.0, ans=0.0 2023-10-04 10:16:18,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:16:18,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:16:22,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:16:24,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:16:26,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:16:28,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:16:29,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:16:31,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:16:35,123 INFO [train.py:1046] (1/4) Epoch 46, batch 3800, loss[loss=0.146, simple_loss=0.2331, pruned_loss=0.02946, over 24318.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2349, pruned_loss=0.03652, over 4723239.19 frames. 
], batch size: 61, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:16:38,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1618973.3333333333, ans=0.125 2023-10-04 10:16:39,645 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.060e+02 2.402e+02 2.784e+02 4.041e+02, threshold=4.803e+02, percent-clipped=0.0 2023-10-04 10:16:39,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:16:43,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:44,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:16:44,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 10:16:44,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1618973.3333333333, ans=0.1 2023-10-04 10:16:45,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:47,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:16:48,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:16:50,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-04 10:16:50,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:51,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:16:52,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:53,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:16:53,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:16:54,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 10:16:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 10:16:58,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=15.0 2023-10-04 10:16:58,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:17:01,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 10:17:01,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1619040.0, ans=10.0 2023-10-04 10:17:04,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:17:06,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:17:06,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:17:08,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:17:09,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:11,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:17:15,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:17:15,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 10:17:17,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:17:24,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:17:25,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.68 vs. limit=22.5 2023-10-04 10:17:29,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:17:29,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1619173.3333333333, ans=0.125 2023-10-04 10:17:33,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 10:17:34,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 10:17:36,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:17:37,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:17:38,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:39,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 10:17:41,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1619240.0, ans=0.1 2023-10-04 10:17:43,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 10:17:43,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. 
Duration: 0.65 2023-10-04 10:17:43,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:45,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:17:49,793 INFO [train.py:1046] (1/4) Epoch 46, batch 3850, loss[loss=0.1571, simple_loss=0.247, pruned_loss=0.03363, over 24644.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2337, pruned_loss=0.03617, over 4702475.48 frames. ], batch size: 73, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:17:51,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:17:51,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:17:57,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:17:58,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 10:17:58,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:17:59,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:18:01,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1619306.6666666667, ans=0.125 2023-10-04 10:18:02,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 10:18:03,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1619373.3333333333, ans=0.125 2023-10-04 10:18:05,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:06,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:18:08,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 10:18:13,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:16,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:18:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:18:18,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:18:18,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1619440.0, ans=0.125 2023-10-04 10:18:19,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1619440.0, ans=0.125 2023-10-04 10:18:23,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:18:23,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:18:24,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:24,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:18:25,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:18:27,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:18:28,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:28,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:18:28,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 10:18:28,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 10:18:28,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1619440.0, ans=0.125 2023-10-04 10:18:30,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:18:30,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:18:31,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:32,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:32,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 10:18:35,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 10:18:37,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:38,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 10:18:41,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:18:46,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:48,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:48,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.77 vs. limit=15.0 2023-10-04 10:18:50,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:52,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. 
Duration: 0.92 2023-10-04 10:18:55,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 10:18:57,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:58,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:59,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:18:59,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:18:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:01,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:01,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:19:01,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 10:19:01,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1619573.3333333333, ans=0.0 2023-10-04 10:19:02,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:19:03,860 INFO [train.py:1046] (1/4) Epoch 46, batch 3900, loss[loss=0.1347, simple_loss=0.2106, pruned_loss=0.02942, over 24334.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2328, pruned_loss=0.0361, over 4708456.89 frames. ], batch size: 56, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:19:03,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 10:19:03,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:03,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:19:05,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:19:05,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:07,974 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.999e+02 2.284e+02 2.578e+02 4.359e+02, threshold=4.569e+02, percent-clipped=0.0 2023-10-04 10:19:08,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:19:10,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:19:10,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:19:10,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:19:10,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 10:19:11,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:14,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:19:16,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:19:17,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:19:17,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:19:20,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:19:20,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:21,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:19:23,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 10:19:23,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:19:25,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-10-04 10:19:25,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:26,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 10:19:29,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 10:19:32,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:19:33,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:19:33,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:19:33,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1619773.3333333333, ans=0.0 2023-10-04 10:19:34,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:19:38,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:19:41,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:19:43,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:19:43,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:19:43,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:19:45,932 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.65 vs. limit=22.5 2023-10-04 10:19:48,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:19:49,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:19:51,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1619840.0, ans=0.125 2023-10-04 10:19:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:19:56,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:20:03,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:20:06,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:20:06,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 10:20:06,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 10:20:06,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 10:20:08,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 10:20:10,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:20:10,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 10:20:16,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.73 vs. limit=15.0 2023-10-04 10:20:17,369 INFO [train.py:1046] (1/4) Epoch 46, batch 3950, loss[loss=0.1571, simple_loss=0.2352, pruned_loss=0.03947, over 23840.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2322, pruned_loss=0.03607, over 4708125.75 frames. ], batch size: 179, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:20:18,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:20:19,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 10:20:20,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:20:23,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:20:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:20:30,417 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. 
Duration: 0.896 2023-10-04 10:20:30,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:20:31,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 10:20:32,063 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 10:20:32,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:20:35,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:20:35,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:20:35,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:20:36,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 10:20:39,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:20:39,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:20:39,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:20:41,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 10:20:41,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:20:50,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1620106.6666666667, ans=0.125 2023-10-04 10:20:52,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:20:52,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:20:59,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 10:21:02,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.16 vs. limit=22.5 2023-10-04 10:21:04,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 10:21:04,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 10:21:05,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:21:07,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:21:13,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:21:13,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 10:21:13,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:21:13,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1620173.3333333333, ans=0.0 2023-10-04 10:21:15,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:21:15,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 10:21:19,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:21:21,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:21:26,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 10:21:30,866 INFO [train.py:1046] (1/4) Epoch 46, batch 4000, loss[loss=0.1487, simple_loss=0.2291, pruned_loss=0.03414, over 24660.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2328, pruned_loss=0.03603, over 4715859.10 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:21:31,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1620306.6666666667, ans=0.125 2023-10-04 10:21:35,611 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.007e+02 2.186e+02 2.659e+02 3.847e+02, threshold=4.373e+02, percent-clipped=0.0 2023-10-04 10:21:35,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:21:39,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1620306.6666666667, ans=0.125 2023-10-04 10:21:40,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:42,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1620306.6666666667, ans=0.0 2023-10-04 10:21:47,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:21:47,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:21:47,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:48,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 10:21:49,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:21:49,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 10:21:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:21:49,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 10:21:52,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:21:56,062 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=15.0 2023-10-04 10:21:56,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:21:56,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:21:56,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:21:57,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:21:57,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:21:59,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:22:00,621 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 10:22:00,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:22:02,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:04,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=1620440.0, ans=0.2 2023-10-04 10:22:05,360 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-10-04 10:22:05,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:22:05,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:22:12,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 10:22:14,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:22:16,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:22:16,959 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 10:22:17,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:22:18,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 10:22:18,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:22:20,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:22,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:22:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 10:22:23,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:22:23,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:22:24,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 10:22:26,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:27,455 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 10:22:31,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:22:36,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 10:22:36,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1620573.3333333333, ans=0.125 2023-10-04 10:22:37,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:22:38,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1620573.3333333333, ans=0.125 2023-10-04 10:22:38,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1620573.3333333333, ans=0.125 2023-10-04 10:22:39,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:22:39,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:22:40,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:22:41,660 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.38 vs. limit=22.5 2023-10-04 10:22:45,497 INFO [train.py:1046] (1/4) Epoch 46, batch 4050, loss[loss=0.1286, simple_loss=0.2099, pruned_loss=0.02366, over 24375.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2341, pruned_loss=0.03609, over 4723354.97 frames. ], batch size: 61, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:22:45,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:22:48,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:22:48,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 10:22:48,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1620640.0, ans=0.125 2023-10-04 10:22:49,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:22:51,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:22:53,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 10:22:54,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:22:54,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:22:58,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:23:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:23:01,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:23:03,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:23:04,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:23:06,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1620706.6666666667, ans=0.1 2023-10-04 10:23:07,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:23:10,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:23:13,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. 
Duration: 0.4333125 2023-10-04 10:23:13,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 10:23:14,764 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 10:23:16,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:23:22,224 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.44 vs. limit=8.0 2023-10-04 10:23:22,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 10:23:24,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:23:27,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:23:31,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:23:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:23:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:23:34,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:23:37,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. 
Duration: 0.56 2023-10-04 10:23:37,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:23:39,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:23:39,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 10:23:43,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:23:51,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 10:23:52,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:23:52,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:23:55,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 10:23:55,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 10:23:55,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:23:57,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:23:58,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:23:58,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:23:59,811 INFO [train.py:1046] (1/4) Epoch 46, batch 4100, loss[loss=0.1672, simple_loss=0.2412, pruned_loss=0.0466, over 23455.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2347, pruned_loss=0.03653, over 4715958.80 frames. ], batch size: 134, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:24:04,013 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.073e+02 2.274e+02 2.591e+02 3.994e+02, threshold=4.547e+02, percent-clipped=0.0 2023-10-04 10:24:07,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 10:24:09,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.72 vs. limit=12.0 2023-10-04 10:24:10,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 10:24:11,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 10:24:12,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1620973.3333333333, ans=0.125 2023-10-04 10:24:13,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 10:24:13,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:24:13,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:24:14,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:14,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:24:16,009 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 10:24:18,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:24:18,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:24:18,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:24:20,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:24:23,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:24:23,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:24:24,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:24:24,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 10:24:25,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:24:26,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:24:26,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:24:26,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:24:26,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 10:24:27,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.72 vs. limit=15.0 2023-10-04 10:24:31,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:24:32,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 10:24:33,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:24:35,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1621106.6666666667, ans=0.2 2023-10-04 10:24:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:24:36,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 10:24:37,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 10:24:37,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:24:39,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:24:41,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 10:24:42,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:24:42,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:24:45,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 10:24:45,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:45,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:24:45,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1621173.3333333333, ans=0.0 2023-10-04 10:24:48,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:24:54,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:24:59,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 10:24:59,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:25:06,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:06,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:25:09,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:25:09,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:25:13,862 INFO [train.py:1046] (1/4) Epoch 46, batch 4150, loss[loss=0.1447, simple_loss=0.2212, pruned_loss=0.03406, over 23444.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03649, over 4709345.38 frames. ], batch size: 134, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:25:14,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1621306.6666666667, ans=0.1 2023-10-04 10:25:15,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:25:16,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:25:17,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:25:17,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 10:25:18,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1621306.6666666667, ans=0.125 2023-10-04 10:25:19,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.77 vs. limit=15.0 2023-10-04 10:25:20,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1621306.6666666667, ans=0.125 2023-10-04 10:25:21,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 10:25:21,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:25:21,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 10:25:22,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 10:25:22,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 10:25:24,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:25:28,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:25:28,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:25:33,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:25:34,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:25:34,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:25:36,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:25:37,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:25:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:25:42,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:44,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:25:46,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 10:25:47,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 10:25:47,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 10:25:47,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1621440.0, ans=0.0 2023-10-04 10:25:49,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 10:25:49,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:25:49,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:25:52,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:25:54,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:25:57,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 10:26:00,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:26:01,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:03,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 10:26:03,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:26:04,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-10-04 10:26:07,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:26:08,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:26:09,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:11,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1621573.3333333333, ans=0.5 2023-10-04 10:26:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 10:26:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:13,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:26:14,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:26:16,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 10:26:16,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:16,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:26:17,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-04 10:26:17,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 10:26:17,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:26:17,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:26:18,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:26:20,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:20,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 10:26:20,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:26:25,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:26:27,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 10:26:28,282 INFO [train.py:1046] (1/4) Epoch 46, batch 4200, loss[loss=0.139, simple_loss=0.2056, pruned_loss=0.0362, over 23607.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03608, over 4700082.89 frames. ], batch size: 256, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:26:30,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 10:26:32,765 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 2.054e+02 2.218e+02 2.491e+02 3.350e+02, threshold=4.435e+02, percent-clipped=0.0 2023-10-04 10:26:32,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:26:34,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:26:35,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:26:35,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:26:37,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 10:26:39,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 10:26:39,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:41,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1621706.6666666667, ans=0.1 2023-10-04 10:26:43,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:44,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:26:47,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:26:48,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:26:48,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:49,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 10:26:49,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:51,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:51,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:26:53,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:26:54,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:26:54,872 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:26:56,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 10:26:56,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:27:01,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:27:03,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:27:05,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:27:06,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:27:08,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:27:08,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 10:27:09,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:27:11,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:27:14,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:27:15,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:27:23,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:27:25,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-04 10:27:28,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:27:32,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:27:32,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1621906.6666666667, ans=0.125 2023-10-04 10:27:34,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:35,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 10:27:39,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1621906.6666666667, ans=0.0 2023-10-04 10:27:40,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:27:42,961 INFO [train.py:1046] (1/4) Epoch 46, batch 4250, loss[loss=0.1543, simple_loss=0.2335, pruned_loss=0.03758, over 23417.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2318, pruned_loss=0.03561, over 4710634.89 frames. ], batch size: 134, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:27:44,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:27:44,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 10:27:44,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1621973.3333333333, ans=0.125 2023-10-04 10:27:45,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:50,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:27:51,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 10:27:51,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:27:54,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:58,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:01,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1622040.0, ans=0.1 2023-10-04 10:28:03,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:03,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:28:04,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 10:28:07,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:07,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:08,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:09,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:28:10,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:12,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 10:28:16,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 10:28:16,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:18,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:18,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:18,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:28:18,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:28:19,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:22,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:28:22,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:28:27,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:28:30,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:31,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 10:28:31,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:28:31,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 10:28:32,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.73 vs. limit=22.5 2023-10-04 10:28:33,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:28:34,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:28:34,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:28:34,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:28:39,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 10:28:40,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:28:40,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.57 vs. limit=12.0 2023-10-04 10:28:41,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:28:46,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:47,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:49,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:28:49,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.27 vs. limit=10.0 2023-10-04 10:28:50,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:28:52,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:28:52,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:28:53,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:28:53,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 10:28:55,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:57,097 INFO [train.py:1046] (1/4) Epoch 46, batch 4300, loss[loss=0.1384, simple_loss=0.2196, pruned_loss=0.0286, over 24423.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2316, pruned_loss=0.0357, over 4701388.76 frames. ], batch size: 58, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:29:01,508 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.043e+02 2.222e+02 2.546e+02 3.786e+02, threshold=4.445e+02, percent-clipped=0.0 2023-10-04 10:29:01,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:29:01,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:29:05,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:29:11,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:29:11,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. 
Duration: 0.56 2023-10-04 10:29:14,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:29:16,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:29:16,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:29:16,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 10:29:20,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:29:21,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:29:24,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 10:29:24,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:29:24,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 10:29:27,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:29:29,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:29:33,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 10:29:33,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:29:35,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:29:36,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:29:36,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:29:36,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 10:29:38,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 10:29:41,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:29:42,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:42,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:29:42,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:43,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:29:43,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. 
Duration: 0.75 2023-10-04 10:29:43,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 10:29:43,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 10:29:45,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:29:47,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 10:29:47,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 10:29:51,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:29:52,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1622506.6666666667, ans=0.1 2023-10-04 10:29:53,937 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 10:29:54,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:29:55,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:29:56,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:29:58,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-10-04 10:29:58,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:29:58,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:58,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:29:59,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:29:59,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:30:00,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1622573.3333333333, ans=0.125 2023-10-04 10:30:01,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:30:03,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:04,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:04,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:30:10,221 INFO [train.py:1046] (1/4) Epoch 46, batch 4350, loss[loss=0.159, simple_loss=0.2347, pruned_loss=0.04165, over 23640.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2322, pruned_loss=0.03573, over 4716948.73 frames. 
], batch size: 120, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:30:10,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 10:30:10,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:30:16,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:30:17,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:20,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:30:20,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:30:25,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:30:27,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-10-04 10:30:28,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1622706.6666666667, ans=6.0 2023-10-04 10:30:29,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:30,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 10:30:30,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:30:30,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1622706.6666666667, ans=0.125 2023-10-04 10:30:34,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:30:37,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:30:38,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:30:41,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 10:30:42,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:30:44,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:49,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1622773.3333333333, ans=0.0 2023-10-04 10:30:50,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:51,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 10:30:54,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:30:55,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:30:59,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1622840.0, ans=0.125 2023-10-04 10:31:01,954 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 10:31:02,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:03,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:31:04,045 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 10:31:05,364 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 10:31:05,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:31:05,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:07,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:31:08,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:10,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:31:10,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:31:10,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1622906.6666666667, ans=0.125 2023-10-04 10:31:12,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 10:31:12,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:12,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:31:12,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:14,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 10:31:15,584 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 10:31:15,588 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 10:31:15,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 10:31:18,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:31:18,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 10:31:18,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:20,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:31:20,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1622906.6666666667, ans=0.1 2023-10-04 10:31:21,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 10:31:23,083 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 10:31:23,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:24,238 INFO [train.py:1046] (1/4) Epoch 46, batch 4400, loss[loss=0.165, simple_loss=0.2421, pruned_loss=0.04392, over 23745.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2331, pruned_loss=0.03592, over 4719628.31 frames. ], batch size: 164, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:31:27,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:31:27,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:28,568 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.967e+02 2.244e+02 2.506e+02 3.290e+02, threshold=4.488e+02, percent-clipped=0.0 2023-10-04 10:31:28,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:31:31,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 10:31:31,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 10:31:33,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 10:31:33,446 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 10:31:34,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:31:34,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:31:37,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 10:31:39,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.08 vs. limit=15.0 2023-10-04 10:31:41,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:42,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:42,471 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 10:31:45,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:31:45,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 10:31:45,251 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 10:31:48,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 10:31:48,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 10:31:48,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 10:31:49,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:49,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1623040.0, ans=0.125 2023-10-04 10:31:51,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:51,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:52,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:31:53,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 10:31:53,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. 
Duration: 0.37275 2023-10-04 10:31:53,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:55,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:31:55,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:56,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:58,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:58,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 10:31:58,403 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 10:31:58,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1623106.6666666667, ans=0.0 2023-10-04 10:32:02,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:09,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:32:10,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 10:32:14,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 10:32:19,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:32:21,365 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.09 vs. limit=8.0 2023-10-04 10:32:21,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:32:21,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 10:32:21,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:32:21,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:32:21,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:32:23,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:32:28,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 10:32:30,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 10:32:30,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 10:32:30,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:32:30,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 10:32:32,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:32:35,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:32:36,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 10:32:37,725 INFO [train.py:1046] (1/4) Epoch 46, batch 4450, loss[loss=0.179, simple_loss=0.2511, pruned_loss=0.05344, over 22631.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03671, over 4728129.19 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:32:41,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:32:41,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1623306.6666666667, ans=0.0 2023-10-04 10:32:44,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:44,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:32:44,256 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:32:50,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 10:32:50,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:32:53,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1623373.3333333333, ans=0.0 2023-10-04 10:32:54,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:57,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:32:58,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:32:58,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:32:58,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1623373.3333333333, ans=0.0 2023-10-04 10:32:59,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 10:32:59,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:33:00,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:01,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:33:01,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 10:33:03,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1623373.3333333333, ans=0.125 2023-10-04 10:33:04,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:33:08,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:09,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:10,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:33:11,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:33:12,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:33:15,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1623440.0, ans=0.0 2023-10-04 10:33:16,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:33:17,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 10:33:18,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 10:33:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 10:33:22,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:33:23,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 10:33:26,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:33:30,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:30,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 10:33:31,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:31,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:33:31,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:33:31,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:33:34,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:37,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:33:37,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. 
Duration: 0.94 2023-10-04 10:33:39,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:33:40,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:33:42,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:33:42,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:44,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:33:45,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:33:45,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1623573.3333333333, ans=0.1 2023-10-04 10:33:47,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 10:33:48,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:33:51,991 INFO [train.py:1046] (1/4) Epoch 46, batch 4500, loss[loss=0.1429, simple_loss=0.2139, pruned_loss=0.03596, over 22670.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03694, over 4729725.17 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:33:53,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 10:33:54,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 10:33:54,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 10:33:56,101 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.108e+02 2.336e+02 2.689e+02 4.416e+02, threshold=4.672e+02, percent-clipped=0.0 2023-10-04 10:33:56,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:34:00,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:34:01,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:34:03,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:34:03,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:34:03,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:04,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:16,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:34:18,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:34:20,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:34:20,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:34:21,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:34:24,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.91 vs. limit=15.0 2023-10-04 10:34:28,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:34:30,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:34:34,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:34:36,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:34:37,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1623840.0, ans=0.125 2023-10-04 10:34:38,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 10:34:38,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:34:39,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:34:41,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:34:43,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:34:46,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:46,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 10:34:46,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:34:46,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:51,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:34:51,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 10:34:51,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1623906.6666666667, ans=0.2 2023-10-04 10:34:54,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1623906.6666666667, ans=0.125 2023-10-04 10:34:55,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:55,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:34:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:34:57,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 10:34:59,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 10:34:59,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 10:35:02,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 10:35:05,105 INFO [train.py:1046] (1/4) Epoch 46, batch 4550, loss[loss=0.1371, simple_loss=0.2149, pruned_loss=0.02965, over 20248.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2345, pruned_loss=0.03679, over 4724955.38 frames. ], batch size: 44, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:35:06,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. 
Duration: 0.87 2023-10-04 10:35:07,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:35:11,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:35:12,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:35:14,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:35:15,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1623973.3333333333, ans=0.125 2023-10-04 10:35:16,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1623973.3333333333, ans=0.1 2023-10-04 10:35:19,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:35:21,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:35:23,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:35:23,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:35:23,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:35:25,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1624040.0, ans=0.0 2023-10-04 10:35:27,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:35:27,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:35:29,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:35:32,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 10:35:33,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 10:35:34,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:35:36,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 10:35:38,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.96 vs. limit=15.0 2023-10-04 10:35:39,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 10:35:39,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:35:42,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. 
Duration: 0.74 2023-10-04 10:35:45,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:35:48,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:48,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:49,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:35:51,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 10:35:53,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:35:54,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:54,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:35:56,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:35:57,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 10:35:58,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 10:35:59,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 10:35:59,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 10:36:00,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 10:36:01,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:36:02,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1624173.3333333333, ans=0.0 2023-10-04 10:36:03,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:03,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:36:04,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:36:04,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:36:06,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1624240.0, ans=0.1 2023-10-04 10:36:07,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:36:07,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-10-04 10:36:10,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:36:10,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 10:36:10,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 10:36:10,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:36:10,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 10:36:13,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:36:13,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:36:15,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:36:16,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:36:16,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 10:36:18,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1624306.6666666667, ans=0.125 2023-10-04 10:36:19,473 INFO [train.py:1046] (1/4) Epoch 46, batch 4600, loss[loss=0.1445, simple_loss=0.2132, pruned_loss=0.03791, over 22866.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03653, over 4726572.91 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:36:19,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:36:21,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:36:24,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:25,550 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.983e+02 2.157e+02 2.520e+02 3.849e+02, threshold=4.313e+02, percent-clipped=0.0 2023-10-04 10:36:25,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:36:25,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1624306.6666666667, ans=0.04949747468305833 2023-10-04 10:36:28,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:36:28,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 10:36:29,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 10:36:32,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:36:34,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:36:34,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:34,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1624373.3333333333, ans=0.0 2023-10-04 10:36:35,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:40,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.71 vs. limit=15.0 2023-10-04 10:36:42,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 10:36:44,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:36:44,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1624373.3333333333, ans=0.0 2023-10-04 10:36:46,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:48,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:36:48,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:52,540 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:36:54,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 10:36:54,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:36:55,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:00,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:00,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:37:01,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:37:06,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-10-04 10:37:07,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:37:10,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:11,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:37:13,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.75 vs. limit=15.0 2023-10-04 10:37:13,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:13,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 10:37:13,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:14,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 10:37:14,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:14,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:16,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:37:18,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:37:20,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:20,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 10:37:20,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1624573.3333333333, ans=0.125 2023-10-04 10:37:20,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1624573.3333333333, ans=0.0 2023-10-04 10:37:21,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 10:37:21,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 10:37:21,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:22,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:37:24,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:24,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:37:24,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1624573.3333333333, ans=0.0 2023-10-04 10:37:32,678 INFO [train.py:1046] (1/4) Epoch 46, batch 4650, loss[loss=0.155, simple_loss=0.2386, pruned_loss=0.03575, over 23515.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2333, pruned_loss=0.03659, over 4718302.95 frames. ], batch size: 106, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:37:34,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:37:35,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:36,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:36,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:37:36,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:36,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:37:38,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:40,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 10:37:46,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 10:37:48,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 10:37:48,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:50,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 10:37:51,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:37:51,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 10:37:51,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 10:37:51,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:52,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:37:54,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1624706.6666666667, ans=0.125 2023-10-04 10:37:55,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:37:56,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:37:57,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1624706.6666666667, ans=0.1 2023-10-04 10:37:58,133 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 10:38:00,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:01,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 10:38:03,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:03,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:38:05,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 10:38:06,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:38:09,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:38:12,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 10:38:13,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1624773.3333333333, ans=0.1 2023-10-04 10:38:18,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1624840.0, ans=0.0 2023-10-04 10:38:20,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:22,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:22,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:22,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:38:24,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1624840.0, ans=0.125 2023-10-04 10:38:25,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 10:38:25,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 10:38:26,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 10:38:26,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 10:38:26,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:38:30,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1624906.6666666667, ans=0.125 2023-10-04 10:38:34,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:38:34,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:38:34,901 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 10:38:34,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:37,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:38:37,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:38:37,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:38:40,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1624906.6666666667, ans=0.0 2023-10-04 10:38:41,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:38:41,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:38:41,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:38:43,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:44,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:38:44,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:38:45,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 10:38:46,697 INFO [train.py:1046] (1/4) Epoch 46, batch 4700, loss[loss=0.1494, simple_loss=0.231, pruned_loss=0.03395, over 24478.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.03634, over 4718349.56 frames. ], batch size: 66, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:38:48,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:38:50,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 10:38:53,570 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.144e+02 2.474e+02 2.959e+02 4.299e+02, threshold=4.947e+02, percent-clipped=0.0 2023-10-04 10:38:57,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:59,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:59,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:38:59,895 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-10-04 10:39:02,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:39:03,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:39:06,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 10:39:06,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 10:39:08,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1625040.0, ans=0.0 2023-10-04 10:39:08,446 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.34 vs. limit=15.0 2023-10-04 10:39:09,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:09,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:39:10,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:39:13,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 10:39:18,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:39:20,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:39:23,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:39:26,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1625106.6666666667, ans=0.125 2023-10-04 10:39:27,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 10:39:29,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:39:30,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:34,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 10:39:34,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:39:38,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:39:38,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 10:39:40,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 10:39:40,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:44,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:44,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:39:46,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 10:39:48,144 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 10:39:49,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:51,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:51,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:51,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 10:39:51,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 10:39:51,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1625240.0, ans=0.125 2023-10-04 10:39:53,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1625240.0, ans=0.0 2023-10-04 10:39:56,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 10:39:56,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten.whitening_limit, batch_count=1625240.0, ans=22.5 2023-10-04 10:39:57,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1625240.0, ans=0.125 2023-10-04 10:40:00,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:40:00,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:01,524 INFO [train.py:1046] (1/4) Epoch 46, batch 4750, loss[loss=0.1408, simple_loss=0.221, pruned_loss=0.03029, over 24304.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.234, pruned_loss=0.03634, over 4721309.75 frames. ], batch size: 61, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:40:05,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:05,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:40:07,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-10-04 10:40:07,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:08,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1625306.6666666667, ans=0.2 2023-10-04 10:40:10,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 10:40:11,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:40:12,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:40:12,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:40:13,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1625306.6666666667, ans=0.125 2023-10-04 10:40:17,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 10:40:22,459 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:40:24,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 10:40:24,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:40:28,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:40:28,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:40:28,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:29,799 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 10:40:29,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 10:40:35,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1625440.0, ans=0.125 2023-10-04 10:40:36,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 10:40:36,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1625440.0, ans=0.125 2023-10-04 10:40:39,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:40,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:40:43,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:40:43,632 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 10:40:43,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:40:46,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:40:49,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:40:50,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1625506.6666666667, ans=0.0 2023-10-04 10:40:52,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 10:40:52,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 10:40:52,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1625506.6666666667, ans=0.1 2023-10-04 10:40:53,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:53,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:40:54,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:57,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:40:57,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 10:40:59,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. 
Duration: 0.65 2023-10-04 10:41:00,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1625573.3333333333, ans=0.125 2023-10-04 10:41:03,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:04,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:41:04,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 10:41:05,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:41:07,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:08,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:41:10,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:10,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:41:12,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:41:12,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 10:41:14,306 INFO [train.py:1046] (1/4) Epoch 46, batch 4800, loss[loss=0.1324, simple_loss=0.2101, pruned_loss=0.02735, over 24308.00 frames. 
], tot_loss[loss=0.1531, simple_loss=0.234, pruned_loss=0.03614, over 4723122.11 frames. ], batch size: 56, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:41:14,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 10:41:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 10:41:15,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:41:17,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:41:18,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 10:41:19,496 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.60 vs. limit=15.0 2023-10-04 10:41:21,363 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.989e+02 2.249e+02 2.577e+02 3.690e+02, threshold=4.497e+02, percent-clipped=0.0 2023-10-04 10:41:22,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:22,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:27,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1625640.0, ans=0.1 2023-10-04 10:41:28,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 10:41:29,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:41:29,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:31,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 10:41:31,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:41:32,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:41:34,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:41:38,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:41:40,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:40,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:41:42,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:42,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 10:41:42,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:41:43,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:41:45,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:48,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:49,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:49,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:41:51,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:41:52,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:54,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1625773.3333333333, ans=0.125 2023-10-04 10:41:55,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 10:41:55,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 10:41:55,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:57,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 10:41:57,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:41:57,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:41:57,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:41:59,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:41:59,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1625840.0, ans=0.0 2023-10-04 10:42:00,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:42:05,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:42:07,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:09,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:12,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 10:42:12,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:42:13,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:42:13,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:42:14,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.63 vs. limit=15.0 2023-10-04 10:42:14,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:42:19,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:42:20,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:42:20,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:20,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:42:20,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.28 vs. limit=15.0 2023-10-04 10:42:22,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:42:22,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 10:42:22,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1625906.6666666667, ans=0.1 2023-10-04 10:42:25,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:25,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:25,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:42:26,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 10:42:28,551 INFO [train.py:1046] (1/4) Epoch 46, batch 4850, loss[loss=0.142, simple_loss=0.2364, pruned_loss=0.02385, over 24396.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2335, pruned_loss=0.03613, over 4724944.60 frames. ], batch size: 69, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:42:29,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 10:42:29,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:42:29,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:42:30,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:42:30,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:42:35,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:42:38,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1625973.3333333333, ans=0.0 2023-10-04 10:42:40,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 10:42:42,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:47,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:42:47,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:42:47,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1626040.0, ans=0.07 2023-10-04 10:42:48,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:52,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:53,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:42:55,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:42:55,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. 
Duration: 0.98 2023-10-04 10:42:58,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:43:00,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:43:01,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:43:01,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:43:01,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 10:43:06,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:43:06,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:09,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1626106.6666666667, ans=0.0 2023-10-04 10:43:10,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:10,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 10:43:11,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1626106.6666666667, ans=0.125 2023-10-04 10:43:12,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. 
Duration: 0.67 2023-10-04 10:43:12,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:43:17,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:43:18,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 10:43:19,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:43:19,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:43:20,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:43:22,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 10:43:22,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:22,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1626173.3333333333, ans=0.125 2023-10-04 10:43:24,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 10:43:24,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:43:26,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:43:26,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 10:43:34,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:39,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:43:39,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:43:43,385 INFO [train.py:1046] (1/4) Epoch 46, batch 4900, loss[loss=0.1523, simple_loss=0.2244, pruned_loss=0.04014, over 23645.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2326, pruned_loss=0.03634, over 4716840.75 frames. ], batch size: 149, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:43:46,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 10:43:46,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:43:50,262 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.004e+02 2.251e+02 2.578e+02 5.064e+02, threshold=4.503e+02, percent-clipped=3.0 2023-10-04 10:43:50,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:43:51,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:43:51,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 10:43:56,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 10:44:01,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 10:44:04,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 10:44:04,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1626373.3333333333, ans=0.0 2023-10-04 10:44:05,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 10:44:07,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:44:07,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:44:07,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:44:07,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:44:07,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:44:07,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 10:44:11,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-10-04 10:44:11,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:44:13,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:44:14,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:44:14,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:44:16,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:44:17,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:17,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 10:44:18,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:44:18,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:44:19,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1626440.0, ans=0.125 2023-10-04 10:44:20,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 10:44:20,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-10-04 10:44:24,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 10:44:26,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:44:27,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:44:27,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:44:27,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:44:29,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 10:44:29,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:44:29,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 10:44:32,400 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:35,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:44:36,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:44:39,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. 
Duration: 0.74 2023-10-04 10:44:39,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:44:39,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1626506.6666666667, ans=0.125 2023-10-04 10:44:40,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 10:44:42,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 10:44:47,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:44:49,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:44:49,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 10:44:50,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:44:50,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:44:51,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:55,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:44:55,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 10:44:56,687 INFO [train.py:1046] (1/4) Epoch 46, batch 4950, loss[loss=0.1647, simple_loss=0.2617, pruned_loss=0.0339, over 24319.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2318, pruned_loss=0.03608, over 4716424.97 frames. ], batch size: 74, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:44:56,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:44:56,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 10:44:56,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1626640.0, ans=0.125 2023-10-04 10:44:58,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:45:00,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:45:00,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:45:05,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 10:45:06,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 10:45:06,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:45:08,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. 
Duration: 0.78 2023-10-04 10:45:09,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:09,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:45:09,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:45:10,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:12,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:45:12,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:45:14,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:45:14,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1626706.6666666667, ans=0.125 2023-10-04 10:45:15,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:45:17,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:17,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:45:21,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:45:24,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:25,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:45:29,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:30,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:30,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:45:32,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 10:45:32,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 10:45:34,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:35,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:45:35,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:45:35,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 10:45:35,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:45:37,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:45:39,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:45:42,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:45:44,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:45:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:47,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:48,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 10:45:48,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:45:49,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:45:52,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:45:54,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 10:45:54,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:45:54,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:56,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:45:57,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:45:57,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:45:58,156 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.17 vs. limit=10.0 2023-10-04 10:45:58,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:46:00,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:46:00,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 10:46:05,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:11,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. 
Duration: 0.78 2023-10-04 10:46:11,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:46:14,448 INFO [train.py:1046] (1/4) Epoch 46, batch 5000, loss[loss=0.1543, simple_loss=0.2385, pruned_loss=0.03501, over 24483.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2317, pruned_loss=0.03624, over 4716114.22 frames. ], batch size: 66, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:46:17,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:46:17,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:46:18,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 10:46:18,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 10:46:21,353 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.198e+02 2.645e+02 3.298e+02 6.003e+02, threshold=5.290e+02, percent-clipped=6.0 2023-10-04 10:46:21,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:46:22,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 10:46:22,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 10:46:23,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1626973.3333333333, ans=0.04949747468305833 2023-10-04 10:46:24,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:46:24,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 10:46:24,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:25,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:46:27,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 10:46:27,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:27,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:46:27,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1627040.0, ans=0.125 2023-10-04 10:46:28,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 10:46:30,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 10:46:31,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 10:46:31,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 10:46:31,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:46:33,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:46:33,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 10:46:33,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 10:46:33,625 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:46:34,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 10:46:34,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:36,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:37,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 10:46:37,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 10:46:38,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1627040.0, ans=0.125 2023-10-04 10:46:40,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:41,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:41,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:46:42,017 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.50 vs. limit=22.5 2023-10-04 10:46:43,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 10:46:44,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:46:46,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:46:48,928 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 10:46:50,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1627106.6666666667, ans=0.125 2023-10-04 10:46:52,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:46:54,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:46:54,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:46:57,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 10:46:57,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:57,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:46:59,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:47:01,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 10:47:01,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:47:05,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:47:07,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:11,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 10:47:14,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:47:16,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1627240.0, ans=0.0 2023-10-04 10:47:21,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:47:22,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=15.0 2023-10-04 10:47:23,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:23,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:47:23,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:47:24,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:47:24,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:47:24,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:24,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1627240.0, ans=0.2 2023-10-04 10:47:27,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.60 vs. 
limit=15.0 2023-10-04 10:47:27,698 INFO [train.py:1046] (1/4) Epoch 46, batch 5050, loss[loss=0.1603, simple_loss=0.2379, pruned_loss=0.04131, over 23811.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03619, over 4726872.10 frames. ], batch size: 179, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:47:27,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:27,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 10:47:29,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:47:30,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:47:31,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:47:32,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 10:47:33,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:35,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:47:39,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:47:40,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 10:47:41,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:47:49,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 10:47:51,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:47:51,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:47:52,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 10:47:52,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:47:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:47:53,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:55,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:47:55,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 10:47:56,484 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. 
Duration: 0.86 2023-10-04 10:47:56,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1627440.0, ans=0.1 2023-10-04 10:47:58,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:47:59,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:01,884 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.31 vs. limit=15.0 2023-10-04 10:48:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:48:03,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 10:48:05,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:48:09,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 10:48:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:48:11,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:48:11,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 10:48:11,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:48:12,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:48:14,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:48:15,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:15,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:48:16,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:48:16,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 10:48:18,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:48:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:48:22,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:48:22,937 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 10:48:22,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 10:48:25,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:48:27,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:27,123 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 10:48:29,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:29,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 10:48:29,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:33,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:33,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1627573.3333333333, ans=0.125 2023-10-04 10:48:34,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:34,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 10:48:36,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 10:48:38,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:48:38,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:48:38,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1627573.3333333333, ans=0.05 2023-10-04 10:48:39,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:48:41,339 INFO [train.py:1046] (1/4) Epoch 46, batch 5100, loss[loss=0.1568, simple_loss=0.242, pruned_loss=0.03579, over 24677.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03639, over 4727356.48 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:48:41,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 10:48:43,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1627640.0, ans=0.0 2023-10-04 10:48:44,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:45,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 10:48:46,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 10:48:48,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 10:48:49,336 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.978e+02 2.173e+02 2.426e+02 3.698e+02, threshold=4.345e+02, percent-clipped=0.0 2023-10-04 10:48:49,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:48:51,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:48:52,039 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.84 vs. limit=15.0 2023-10-04 10:48:52,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 10:48:52,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 10:48:58,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:58,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:49:04,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:49:05,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 10:49:05,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:49:07,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:49:07,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:49:09,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:11,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:11,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 10:49:12,623 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 10:49:13,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:13,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 10:49:15,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 10:49:16,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:49:24,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:49:25,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1627840.0, ans=0.2 2023-10-04 10:49:27,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 10:49:27,465 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 10:49:27,473 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 10:49:27,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1627840.0, ans=0.1 2023-10-04 10:49:28,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 10:49:28,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:31,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 10:49:35,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 10:49:39,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:49:40,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:49:42,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. 
Duration: 0.87 2023-10-04 10:49:43,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:49:43,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 10:49:45,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1627906.6666666667, ans=0.2 2023-10-04 10:49:47,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.76 vs. limit=15.0 2023-10-04 10:49:49,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:49:49,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:49:49,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:49:49,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:49:49,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:49:51,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:49:52,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 10:49:52,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. 
Duration: 0.73 2023-10-04 10:49:53,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1627906.6666666667, ans=0.0 2023-10-04 10:49:54,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 10:49:54,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:49:54,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 10:49:55,480 INFO [train.py:1046] (1/4) Epoch 46, batch 5150, loss[loss=0.1544, simple_loss=0.2401, pruned_loss=0.03432, over 24640.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2345, pruned_loss=0.03625, over 4736543.80 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:49:56,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:49:57,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 10:49:59,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:01,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:07,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:50:07,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. 
Duration: 0.8 2023-10-04 10:50:07,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1627973.3333333333, ans=0.1 2023-10-04 10:50:09,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:10,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:50:11,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:50:11,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:50:13,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:50:13,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:50:13,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:50:13,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 10:50:15,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 10:50:15,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1628040.0, ans=0.05 2023-10-04 10:50:16,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:50:16,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1628040.0, ans=0.2 2023-10-04 10:50:17,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:50:18,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1628040.0, ans=0.125 2023-10-04 10:50:19,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 10:50:20,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:50:22,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1628040.0, ans=0.0 2023-10-04 10:50:25,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:50:28,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 10:50:32,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 10:50:37,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1628106.6666666667, ans=0.1 2023-10-04 10:50:38,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:50:40,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:43,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:50:43,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:50:45,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 10:50:49,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:50,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:50:50,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:50:53,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:50:53,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:50:53,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1628240.0, ans=0.0 2023-10-04 10:50:55,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 10:50:58,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:59,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:51:01,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:51:01,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:51:03,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:51:04,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:51:04,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:51:04,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:51:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:51:10,514 INFO [train.py:1046] (1/4) Epoch 46, batch 5200, loss[loss=0.1437, simple_loss=0.2287, pruned_loss=0.0294, over 24680.00 frames. 
], tot_loss[loss=0.1544, simple_loss=0.2353, pruned_loss=0.0368, over 4719978.60 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:51:10,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:51:13,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:17,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 10:51:19,045 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.096e+02 2.291e+02 2.477e+02 4.065e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 10:51:19,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:51:19,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:21,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:23,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:51:23,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:24,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 10:51:26,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 10:51:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:51:27,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1628373.3333333333, ans=0.0 2023-10-04 10:51:30,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 10:51:30,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:51:31,024 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.27 vs. limit=15.0 2023-10-04 10:51:32,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:51:34,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 10:51:34,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 10:51:36,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 10:51:38,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:51:38,109 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 10:51:38,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:51:39,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:51:41,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:51:42,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 10:51:43,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:51:45,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:48,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 10:51:48,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 10:51:48,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 10:51:54,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 10:51:54,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:51:57,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:51:57,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:51:58,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 10:51:58,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:58,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 10:51:58,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:00,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:52:03,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:52:04,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:52:07,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:52:08,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:08,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:15,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:52:15,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-10-04 10:52:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:52:16,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:52:18,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:19,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:52:20,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:52:23,575 INFO [train.py:1046] (1/4) Epoch 46, batch 5250, loss[loss=0.1667, simple_loss=0.2494, pruned_loss=0.04204, over 23852.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2341, pruned_loss=0.03645, over 4724604.48 frames. ], batch size: 86, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:52:23,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:52:27,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:27,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:52:29,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:52:34,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:52:37,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:52:39,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:52:40,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:52:42,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 10:52:42,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:44,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:48,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1628706.6666666667, ans=0.125 2023-10-04 10:52:50,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.57 vs. 
limit=15.0 2023-10-04 10:52:54,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1628773.3333333333, ans=0.1 2023-10-04 10:53:00,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1628773.3333333333, ans=0.125 2023-10-04 10:53:12,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1628840.0, ans=0.0 2023-10-04 10:53:22,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.00 vs. limit=15.0 2023-10-04 10:53:32,597 INFO [train.py:1046] (1/4) Epoch 46, batch 5300, loss[loss=0.1588, simple_loss=0.224, pruned_loss=0.04683, over 23838.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2336, pruned_loss=0.0364, over 4734461.01 frames. ], batch size: 195, lr: 2.19e-03, grad_scale: 16.0 2023-10-04 10:53:40,495 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.050e+02 2.264e+02 2.625e+02 3.522e+02, threshold=4.529e+02, percent-clipped=0.0 2023-10-04 10:53:44,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1629040.0, ans=0.0 2023-10-04 10:53:46,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:53:46,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 10:53:46,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 10:53:46,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 10:53:47,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:47,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:47,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:47,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:47,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:53:47,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:47,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:53:47,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:53:47,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 10:53:48,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 10:53:48,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 10:53:48,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. 
Duration: 0.82225 2023-10-04 10:53:48,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 10:53:48,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 10:53:48,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:48,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:48,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:53:48,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:53:48,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:53:49,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:53:49,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:49,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:49,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:53:49,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 10:53:49,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:53:49,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:49,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:53:50,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 10:53:50,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:53:50,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:50,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 10:53:50,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 10:53:50,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:53:50,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:53:50,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 10:53:51,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. 
Duration: 0.71 2023-10-04 10:53:51,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:53:51,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:53:51,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:53:52,118 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 10:53:52,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 10:53:52,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:53:52,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:52,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 10:53:52,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 10:53:52,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 10:53:52,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:53:58,617 INFO [train.py:1046] (1/4) Epoch 47, batch 0, loss[loss=0.1525, simple_loss=0.2273, pruned_loss=0.03886, over 23521.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2273, pruned_loss=0.03886, over 23521.00 frames. 
], batch size: 285, lr: 2.17e-03, grad_scale: 32.0 2023-10-04 10:53:58,617 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 10:54:09,147 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.8017, 3.5521, 3.5054, 3.4232], device='cuda:1') 2023-10-04 10:54:11,064 INFO [train.py:1078] (1/4) Epoch 47, validation: loss=0.3566, simple_loss=0.2776, pruned_loss=0.2178, over 1125622.00 frames. 2023-10-04 10:54:11,064 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 10:54:12,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 10:54:13,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:54:14,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:54:14,974 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.75 vs. limit=10.0 2023-10-04 10:54:16,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1629053.3333333333, ans=0.0 2023-10-04 10:54:19,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:19,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:54:20,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.50 vs. 
limit=15.0 2023-10-04 10:54:21,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:21,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 10:54:23,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 10:54:25,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:25,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:29,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:30,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:30,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:54:30,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:54:31,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1629120.0, ans=0.125 2023-10-04 10:54:33,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 10:54:35,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 10:54:40,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1629186.6666666667, ans=0.1 2023-10-04 10:54:42,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:54:42,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:46,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 10:54:50,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:54:50,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:54:52,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:54:56,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:54:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:55:02,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1629253.3333333333, ans=0.0 2023-10-04 10:55:02,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1629253.3333333333, ans=0.2 2023-10-04 10:55:03,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 10:55:08,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 10:55:09,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:55:09,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:11,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:55:11,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:55:12,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 10:55:15,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:16,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:19,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 10:55:24,367 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 10:55:24,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:55:25,783 INFO [train.py:1046] (1/4) Epoch 47, batch 50, loss[loss=0.1588, simple_loss=0.2461, pruned_loss=0.03575, over 24548.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2355, pruned_loss=0.03589, over 1072525.67 frames. ], batch size: 71, lr: 2.17e-03, grad_scale: 32.0 2023-10-04 10:55:27,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:55:30,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:55:30,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 10:55:30,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:55:30,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:55:31,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1629386.6666666667, ans=0.125 2023-10-04 10:55:32,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:55:34,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 10:55:38,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:55:41,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 10:55:41,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:47,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:55:47,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 10:55:50,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 10:55:51,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:55:54,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:55:54,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:54,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1629520.0, ans=0.0 2023-10-04 10:55:55,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:55:56,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 10:55:56,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:55:56,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:56:02,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:56:02,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:03,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:56:03,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 10:56:05,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:56:06,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:56:06,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 10:56:07,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:56:11,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 10:56:17,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:56:17,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:56:17,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1629586.6666666667, ans=0.025 2023-10-04 10:56:18,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:19,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:56:19,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:56:21,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 10:56:23,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 10:56:24,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:24,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:56:27,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:56:28,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:56:28,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. 
Duration: 0.49 2023-10-04 10:56:28,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 10:56:30,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:56:31,480 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 2.113e+02 2.383e+02 2.822e+02 6.328e+02, threshold=4.766e+02, percent-clipped=7.0 2023-10-04 10:56:31,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:56:31,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:56:31,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 10:56:31,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 10:56:33,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:56:33,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:35,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:56:35,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:56:38,365 INFO [train.py:1046] (1/4) Epoch 47, batch 100, loss[loss=0.1541, simple_loss=0.2347, pruned_loss=0.03672, over 23254.00 frames. 
], tot_loss[loss=0.1543, simple_loss=0.2359, pruned_loss=0.03632, over 1879165.57 frames. ], batch size: 105, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:56:40,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:56:42,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:56:47,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:56:48,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 10:56:48,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:53,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:56:53,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:56:54,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:54,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:56:54,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:56:55,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. 
Duration: 0.73 2023-10-04 10:56:59,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:56:59,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:00,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:57:04,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 10:57:04,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:06,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:07,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:57:09,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:57:13,173 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 10:57:13,187 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 10:57:14,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:57:14,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:57:17,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:57:19,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:19,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1629853.3333333333, ans=0.2 2023-10-04 10:57:20,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:23,321 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.24 vs. limit=10.0 2023-10-04 10:57:24,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1629920.0, ans=0.125 2023-10-04 10:57:24,829 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.91 vs. limit=15.0 2023-10-04 10:57:25,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:26,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1629920.0, ans=0.1 2023-10-04 10:57:26,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1629920.0, ans=0.125 2023-10-04 10:57:27,209 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-10-04 10:57:28,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 10:57:30,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1629920.0, ans=0.125 2023-10-04 10:57:31,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:57:33,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:57:34,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1629920.0, ans=0.125 2023-10-04 10:57:35,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:38,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:40,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:57:41,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:57:44,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:44,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 10:57:44,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1629986.6666666667, ans=0.0 2023-10-04 10:57:47,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:47,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:57:47,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:49,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 10:57:49,291 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 10:57:49,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:50,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:57:50,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:57:50,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:50,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:57:50,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-04 10:57:50,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:57:50,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:57:52,187 INFO [train.py:1046] (1/4) Epoch 47, batch 150, loss[loss=0.1399, simple_loss=0.2227, pruned_loss=0.02857, over 24598.00 frames. ], tot_loss[loss=0.156, simple_loss=0.237, pruned_loss=0.03748, over 2503912.13 frames. ], batch size: 60, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:57:52,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:53,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:53,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:57:55,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:57:57,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:58,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:57:58,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:57:58,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 10:57:59,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1630053.3333333333, ans=0.0 2023-10-04 10:58:03,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:58:04,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:08,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.32 vs. limit=15.0 2023-10-04 10:58:08,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:58:08,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:11,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 10:58:11,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 10:58:11,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 10:58:14,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:58:14,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:58:15,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 10:58:17,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:58:17,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:58:17,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:19,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:20,523 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 10:58:23,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:58:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:58:33,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.68 vs. limit=10.0 2023-10-04 10:58:33,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:58:33,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 10:58:38,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:58:38,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 10:58:38,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:58:40,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1630253.3333333333, ans=0.0 2023-10-04 10:58:41,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1630253.3333333333, ans=0.125 2023-10-04 10:58:42,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:58:44,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:58:45,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:58:45,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:45,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1630253.3333333333, ans=0.0 2023-10-04 10:58:46,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 10:58:51,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:53,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 10:58:53,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:58:53,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:58:54,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:57,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 10:58:58,420 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.983e+02 2.136e+02 2.354e+02 3.118e+02, threshold=4.272e+02, percent-clipped=0.0 2023-10-04 10:58:58,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:59:01,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:59:03,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:04,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:59:04,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 10:59:04,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:59:04,830 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. 
Duration: 0.541 2023-10-04 10:59:06,076 INFO [train.py:1046] (1/4) Epoch 47, batch 200, loss[loss=0.1408, simple_loss=0.2278, pruned_loss=0.02689, over 24334.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2383, pruned_loss=0.03745, over 2992966.55 frames. ], batch size: 61, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:59:09,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:59:12,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:59:12,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:59:14,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 10:59:16,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:16,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:19,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 10:59:20,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:59:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:59:25,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:59:25,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:59:25,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:44,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:59:44,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:59:46,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:59:47,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:59:47,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:59:47,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:59:49,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.82 vs. limit=15.0 2023-10-04 10:59:50,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 10:59:50,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:59:51,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:51,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:59:53,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 10:59:54,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:59:54,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:57,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:00:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:00:10,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:11,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=15.0 2023-10-04 11:00:12,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 11:00:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:19,539 INFO [train.py:1046] (1/4) Epoch 47, batch 250, loss[loss=0.1423, simple_loss=0.2343, pruned_loss=0.02511, over 24364.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2382, pruned_loss=0.03742, over 3375272.18 frames. ], batch size: 74, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:00:21,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 11:00:21,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:00:21,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:00:21,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:00:21,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:00:21,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1630720.0, ans=0.125 2023-10-04 11:00:22,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 11:00:24,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:00:24,399 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-04 11:00:25,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:28,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:00:29,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:00:29,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1630720.0, ans=0.125 2023-10-04 11:00:31,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:00:31,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:32,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1630720.0, ans=0.0 2023-10-04 11:00:33,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:00:35,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:00:37,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.83 vs. 
limit=15.0 2023-10-04 11:00:39,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1630786.6666666667, ans=0.125 2023-10-04 11:00:46,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:00:49,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:00:49,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:00:56,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:00:56,897 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:00:57,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:00:57,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:00:59,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:01:00,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:01:00,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:01:00,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:01:02,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:01:06,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 11:01:06,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:01:07,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:01:07,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:01:07,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:01:07,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1630920.0, ans=0.125 2023-10-04 11:01:08,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:01:09,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1630920.0, ans=0.125 2023-10-04 11:01:10,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:01:10,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:01:11,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:01:13,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:01:14,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:01:18,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1630986.6666666667, ans=0.125 2023-10-04 11:01:21,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:21,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1630986.6666666667, ans=0.025 2023-10-04 11:01:25,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:01:28,128 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.109e+02 2.410e+02 2.789e+02 4.244e+02, threshold=4.821e+02, percent-clipped=0.0 2023-10-04 11:01:29,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:30,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 11:01:31,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1630986.6666666667, ans=0.2 2023-10-04 11:01:31,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.82 vs. limit=12.0 2023-10-04 11:01:33,984 INFO [train.py:1046] (1/4) Epoch 47, batch 300, loss[loss=0.1518, simple_loss=0.2205, pruned_loss=0.04155, over 23789.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2367, pruned_loss=0.03722, over 3675409.26 frames. ], batch size: 179, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:01:34,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 11:01:34,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:01:34,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:01:35,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 11:01:35,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:01:37,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:01:37,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 11:01:42,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:01:43,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:01:43,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1631053.3333333333, ans=0.0 2023-10-04 11:01:45,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:01:45,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 11:01:49,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:50,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:01:50,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 11:01:50,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:01:52,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1631120.0, ans=0.0 2023-10-04 11:01:54,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:01:59,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:02:01,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. 
Duration: 0.88 2023-10-04 11:02:03,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 11:02:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:05,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1631186.6666666667, ans=0.2 2023-10-04 11:02:06,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:02:08,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:08,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 11:02:08,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:02:11,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:02:14,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:02:14,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:02:16,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:02:16,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-10-04 11:02:18,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:02:21,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:22,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 11:02:24,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:02:27,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:02:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:02:29,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 11:02:34,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:34,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:02:37,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.11 vs. limit=6.0 2023-10-04 11:02:38,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:39,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 11:02:39,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 11:02:39,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:02:40,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:02:42,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 11:02:42,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:42,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1631320.0, ans=0.125 2023-10-04 11:02:43,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:43,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:02:45,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:02:46,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:48,374 INFO [train.py:1046] (1/4) Epoch 47, batch 350, loss[loss=0.1505, simple_loss=0.2179, pruned_loss=0.04155, over 23717.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2348, pruned_loss=0.03685, over 3911113.63 frames. 
], batch size: 232, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:02:48,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:02:48,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 11:02:51,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:53,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1631386.6666666667, ans=10.0 2023-10-04 11:02:57,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:03:02,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:02,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:04,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 11:03:06,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:03:06,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 11:03:08,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:03:09,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 11:03:09,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:03:11,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1631453.3333333333, ans=0.125 2023-10-04 11:03:12,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 11:03:13,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:03:15,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:03:15,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:03:16,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:16,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:18,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:03:18,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:18,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 11:03:20,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1631520.0, ans=0.1 2023-10-04 11:03:21,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:03:21,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:27,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:03:27,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:03:29,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:03:29,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:33,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 11:03:33,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:39,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:39,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:39,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:03:40,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 11:03:41,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1631586.6666666667, ans=0.125 2023-10-04 11:03:43,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:43,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 11:03:43,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 11:03:45,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:46,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:03:48,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 11:03:48,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:48,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1631653.3333333333, ans=0.125 2023-10-04 11:03:50,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.29 vs. limit=6.0 2023-10-04 11:03:51,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 11:03:53,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:53,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:53,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:54,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:58,117 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.078e+02 2.298e+02 2.651e+02 3.732e+02, threshold=4.596e+02, percent-clipped=0.0 2023-10-04 11:03:58,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:04:00,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:04:02,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 11:04:02,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:04:03,541 INFO [train.py:1046] (1/4) Epoch 47, batch 400, loss[loss=0.1543, simple_loss=0.2436, pruned_loss=0.03254, over 24629.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03615, over 4080745.10 frames. ], batch size: 73, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:04:03,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:04:04,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:04:06,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:07,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1631720.0, ans=0.0 2023-10-04 11:04:08,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:04:09,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:12,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 11:04:14,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 11:04:14,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:15,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 11:04:16,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:04:17,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1631786.6666666667, ans=0.2 2023-10-04 11:04:17,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1631786.6666666667, ans=0.0 2023-10-04 11:04:21,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:04:21,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:21,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 11:04:21,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:04:22,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:22,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:22,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:04:24,836 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 11:04:24,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 11:04:30,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:04:32,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:04:32,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 11:04:33,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 11:04:37,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:04:39,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:04:44,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 11:04:46,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:04:48,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 11:04:51,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:51,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:04:53,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 11:04:57,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 11:04:58,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.07 vs. limit=15.0 2023-10-04 11:04:59,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:05:01,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:05:04,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:04,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 11:05:07,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 11:05:08,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 11:05:09,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1631986.6666666667, ans=0.2 2023-10-04 11:05:09,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1631986.6666666667, ans=0.125 2023-10-04 11:05:10,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:05:10,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:05:12,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. 
Duration: 0.89 2023-10-04 11:05:13,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:05:14,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:05:14,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:05:17,427 INFO [train.py:1046] (1/4) Epoch 47, batch 450, loss[loss=0.1665, simple_loss=0.2522, pruned_loss=0.04045, over 24606.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2332, pruned_loss=0.03636, over 4201936.06 frames. ], batch size: 71, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:05:17,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 11:05:17,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:05:17,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:05:17,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1632053.3333333333, ans=0.125 2023-10-04 11:05:18,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:05:18,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 11:05:19,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 11:05:20,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:05:21,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:05:31,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:32,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:05:33,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1632120.0, ans=0.05 2023-10-04 11:05:34,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 11:05:35,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 11:05:37,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:05:40,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:41,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:05:44,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:05:45,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 11:05:47,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 11:05:48,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 11:05:49,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 11:05:49,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1632186.6666666667, ans=0.0 2023-10-04 11:05:50,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:05:50,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:05:51,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:05:53,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 11:05:53,980 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 11:05:54,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:56,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:05:56,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-10-04 11:05:58,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:05:58,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:06:00,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:06:01,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 11:06:03,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:06:04,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:06:04,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:06:06,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 11:06:07,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1632253.3333333333, ans=0.0 2023-10-04 11:06:10,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1632253.3333333333, ans=0.0 2023-10-04 11:06:11,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:06:12,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-10-04 11:06:13,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 11:06:14,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:06:14,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1632253.3333333333, ans=0.125 2023-10-04 11:06:18,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:06:21,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:06:23,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:06:23,118 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 11:06:27,508 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.974e+02 2.147e+02 2.474e+02 4.734e+02, threshold=4.294e+02, percent-clipped=1.0 2023-10-04 11:06:27,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:06:27,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1632320.0, ans=0.0 2023-10-04 11:06:28,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 11:06:29,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1632320.0, ans=0.125 2023-10-04 11:06:30,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:06:30,659 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 11:06:32,005 INFO [train.py:1046] (1/4) Epoch 47, batch 500, loss[loss=0.1477, simple_loss=0.2347, pruned_loss=0.03032, over 24500.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2344, pruned_loss=0.03684, over 4313186.72 frames. ], batch size: 66, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:06:32,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 11:06:32,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:06:32,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1632386.6666666667, ans=0.125 2023-10-04 11:06:33,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:06:36,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1632386.6666666667, ans=0.2 2023-10-04 11:06:37,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:06:39,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 11:06:41,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:06:41,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:06:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:06:52,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:53,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:06:53,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:06:55,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:55,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 11:06:55,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:06:55,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1632453.3333333333, ans=0.1 2023-10-04 11:06:58,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 11:06:58,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:06:58,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:06:58,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:59,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 11:07:03,824 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.03 vs. limit=6.0 2023-10-04 11:07:04,518 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 11:07:07,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:07,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:08,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:08,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:10,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:07:12,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. 
Duration: 0.62 2023-10-04 11:07:12,893 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.66 vs. limit=15.0 2023-10-04 11:07:14,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:07:14,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:15,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1632586.6666666667, ans=0.2 2023-10-04 11:07:20,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:21,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:27,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:28,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1632586.6666666667, ans=0.125 2023-10-04 11:07:30,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 11:07:30,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:30,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 11:07:34,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 11:07:34,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:07:35,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:40,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 11:07:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 11:07:41,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:41,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 11:07:42,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:07:42,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:43,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:44,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:44,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 11:07:45,988 INFO [train.py:1046] (1/4) Epoch 47, batch 550, loss[loss=0.156, simple_loss=0.2467, pruned_loss=0.03264, over 24346.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2357, pruned_loss=0.03732, over 4408045.61 frames. ], batch size: 77, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:07:46,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:07:48,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:48,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 11:07:50,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:07:50,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1632720.0, ans=0.1 2023-10-04 11:07:56,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:07:56,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:58,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:08:00,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:08:04,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.84 vs. 
limit=15.0 2023-10-04 11:08:04,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 11:08:06,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 11:08:07,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1632786.6666666667, ans=0.125 2023-10-04 11:08:07,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1632786.6666666667, ans=0.125 2023-10-04 11:08:09,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:08:15,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:08:15,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:08:17,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:08:19,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:19,950 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 11:08:20,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:08:21,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-04 11:08:24,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:08:25,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:08:25,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:08:26,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:28,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 11:08:30,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 11:08:30,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:30,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:08:31,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:08:31,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:08:34,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:08:35,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 11:08:38,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:08:38,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:40,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 11:08:40,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:08:42,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:44,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:08:44,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:46,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:08:46,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 11:08:52,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 11:08:55,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.110e+02 2.356e+02 2.673e+02 4.561e+02, threshold=4.712e+02, percent-clipped=1.0 2023-10-04 11:08:55,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. 
Duration: 0.46 2023-10-04 11:08:56,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:08:57,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:08:57,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:59,190 INFO [train.py:1046] (1/4) Epoch 47, batch 600, loss[loss=0.152, simple_loss=0.2422, pruned_loss=0.03088, over 24480.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2354, pruned_loss=0.03723, over 4473420.96 frames. ], batch size: 69, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:09:04,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:09:07,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:09:08,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 11:09:11,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:09:13,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:09:14,933 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:17,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. 
Duration: 0.87 2023-10-04 11:09:17,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:09:17,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1633120.0, ans=0.1 2023-10-04 11:09:22,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.00 vs. limit=8.0 2023-10-04 11:09:23,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 11:09:26,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1633120.0, ans=0.125 2023-10-04 11:09:27,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:09:27,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:27,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:09:30,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1633186.6666666667, ans=0.0 2023-10-04 11:09:33,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:09:33,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:09:34,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 11:09:40,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:09:45,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:09:45,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:09:45,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:54,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 11:09:58,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:09:58,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:10:01,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 11:10:03,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:10:05,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 11:10:05,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 11:10:05,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-10-04 11:10:06,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:10:12,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 11:10:13,953 INFO [train.py:1046] (1/4) Epoch 47, batch 650, loss[loss=0.1527, simple_loss=0.2452, pruned_loss=0.0301, over 24668.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2339, pruned_loss=0.03671, over 4524339.80 frames. ], batch size: 73, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:10:14,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:10:16,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:10:17,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:10:20,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:22,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 11:10:24,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 11:10:25,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1633386.6666666667, ans=0.025 2023-10-04 11:10:29,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:10:29,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:10:32,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:36,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 11:10:36,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=1633453.3333333333, ans=15.0 2023-10-04 11:10:37,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1633453.3333333333, ans=0.125 2023-10-04 11:10:38,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:10:38,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:10:43,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:10:43,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-04 11:10:43,645 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:10:45,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:46,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:46,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:10:48,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:49,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:10:51,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:10:51,715 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 11:10:51,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:51,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:10:54,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:56,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:10:56,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:10:57,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:10:58,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 11:11:00,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:11:00,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:11:01,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:11:01,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:11:02,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.65 vs. limit=15.0 2023-10-04 11:11:02,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:11:03,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1633586.6666666667, ans=15.0 2023-10-04 11:11:04,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. 
Duration: 0.83 2023-10-04 11:11:05,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 11:11:05,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:05,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:11:05,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:11:07,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:11:09,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:11:09,185 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:11:11,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1633586.6666666667, ans=0.1 2023-10-04 11:11:16,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:16,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:11:17,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 11:11:18,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1633653.3333333333, ans=0.0 2023-10-04 11:11:20,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:11:21,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:11:21,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:11:24,156 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.080e+02 2.396e+02 2.903e+02 4.504e+02, threshold=4.792e+02, percent-clipped=0.0 2023-10-04 11:11:27,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:11:27,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:11:28,585 INFO [train.py:1046] (1/4) Epoch 47, batch 700, loss[loss=0.1454, simple_loss=0.2317, pruned_loss=0.0296, over 24307.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2325, pruned_loss=0.03637, over 4566056.01 frames. ], batch size: 61, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:11:28,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:11:28,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:11:28,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1633720.0, ans=0.125 2023-10-04 11:11:30,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.89 vs. limit=22.5 2023-10-04 11:11:32,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 11:11:33,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.31 vs. limit=12.0 2023-10-04 11:11:34,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 11:11:35,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 11:11:35,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:37,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:11:38,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 11:11:43,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:11:46,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:11:48,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:11:48,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:11:48,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:11:51,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 11:11:54,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:11:54,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 11:11:57,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 11:12:02,382 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:12:02,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:12:03,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 11:12:04,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1633853.3333333333, ans=0.0 2023-10-04 11:12:05,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1633853.3333333333, ans=0.125 2023-10-04 11:12:05,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1633853.3333333333, ans=0.0 2023-10-04 11:12:06,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:12:08,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 11:12:12,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:13,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:12:13,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 11:12:16,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1633920.0, ans=15.0 2023-10-04 11:12:18,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:12:18,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:12:20,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:12:24,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.90 vs. limit=15.0 2023-10-04 11:12:27,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:12:27,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 11:12:30,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 11:12:30,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 11:12:32,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:33,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:12:35,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:12:38,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:38,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 11:12:43,406 INFO [train.py:1046] (1/4) Epoch 47, batch 750, loss[loss=0.1388, simple_loss=0.2197, pruned_loss=0.02901, over 23628.00 frames. 
], tot_loss[loss=0.1525, simple_loss=0.2324, pruned_loss=0.03636, over 4596027.38 frames. ], batch size: 135, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:12:44,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 11:12:44,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 11:12:45,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 11:12:46,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 11:12:46,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 11:12:47,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:12:47,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 11:12:49,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:49,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:12:51,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:12:52,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:12:52,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:12:52,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:12:56,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:12:56,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:12:59,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:13:03,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:13:03,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:13:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 11:13:04,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:13:06,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:13:07,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:13:07,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 11:13:09,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 11:13:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:13:12,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 11:13:12,325 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 11:13:14,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 11:13:14,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:13:14,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:13:15,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:13:24,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:13:24,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:24,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:13:25,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:13:27,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:13:28,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 11:13:28,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:13:28,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1634253.3333333333, ans=0.1 2023-10-04 11:13:29,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 11:13:29,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:13:32,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:13:34,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 11:13:34,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:34,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1634253.3333333333, ans=0.2 2023-10-04 11:13:39,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:13:41,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 11:13:42,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:13:44,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:13:47,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 11:13:47,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:13:49,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:13:52,914 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.939e+02 2.140e+02 2.417e+02 3.613e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-04 11:13:53,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:13:53,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:13:55,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1634386.6666666667, ans=0.1 2023-10-04 11:13:56,952 INFO [train.py:1046] (1/4) Epoch 47, batch 800, loss[loss=0.1668, simple_loss=0.2388, pruned_loss=0.04741, over 23832.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2331, pruned_loss=0.0363, over 4621004.45 frames. ], batch size: 179, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:13:57,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 11:13:58,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:14:05,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:14:05,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:06,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:14:06,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:14:08,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:08,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:09,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:12,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:14,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:14:15,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 11:14:15,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:14:17,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:14:18,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:14:19,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:14:20,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 11:14:20,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:20,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 11:14:23,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:26,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:27,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:14:27,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:14:30,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:31,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:14:35,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:14:35,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:14:35,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 11:14:37,304 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 11:14:38,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 11:14:38,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:14:38,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:14:39,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.75 vs. limit=6.0 2023-10-04 11:14:41,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:41,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:14:47,374 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 11:14:47,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. 
Duration: 0.4 2023-10-04 11:14:47,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:14:49,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:14:55,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:14:58,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:00,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 11:15:00,249 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:15:04,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 11:15:08,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:15:09,817 INFO [train.py:1046] (1/4) Epoch 47, batch 850, loss[loss=0.1579, simple_loss=0.2544, pruned_loss=0.03067, over 24585.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2345, pruned_loss=0.0364, over 4654635.40 frames. ], batch size: 71, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:15:11,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:15:12,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-10-04 11:15:12,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:15:14,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:15:14,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 11:15:14,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:14,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1634720.0, ans=0.1 2023-10-04 11:15:15,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:15:17,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:20,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:15:22,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:15:23,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 11:15:23,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 11:15:24,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. 
Duration: 0.76 2023-10-04 11:15:26,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:15:26,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:15:27,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:27,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:15:27,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:15:33,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:34,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:35,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 11:15:36,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 11:15:40,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:41,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 11:15:44,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. 
Duration: 0.62 2023-10-04 11:15:46,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 11:15:46,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 11:15:46,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:15:46,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:15:48,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:15:49,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:50,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.68 vs. limit=10.0 2023-10-04 11:15:50,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=15.0 2023-10-04 11:15:52,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:52,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 11:15:54,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:15:54,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:15:54,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:15:56,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:15:56,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:15:57,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:15:58,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 11:16:04,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:16:04,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:16:04,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:16:06,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:16:06,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1634920.0, ans=0.0 2023-10-04 11:16:06,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1634920.0, ans=0.0 2023-10-04 11:16:07,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:16:10,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:16:11,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:16:11,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:16:13,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:13,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:16:19,635 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.035e+02 2.355e+02 2.898e+02 4.168e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 11:16:21,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:16:21,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:16:23,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 11:16:23,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:16:23,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:16:24,532 INFO [train.py:1046] (1/4) Epoch 47, batch 900, loss[loss=0.1533, simple_loss=0.24, pruned_loss=0.03328, over 24383.00 frames. 
], tot_loss[loss=0.1541, simple_loss=0.2351, pruned_loss=0.03658, over 4677037.27 frames. ], batch size: 77, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:16:25,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 11:16:31,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:16:34,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:34,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 11:16:36,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1635053.3333333333, ans=0.125 2023-10-04 11:16:37,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:16:37,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 11:16:39,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 11:16:40,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:16:40,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:16:40,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 11:16:40,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:16:48,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1635120.0, ans=0.125 2023-10-04 11:16:48,704 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.09 vs. limit=15.0 2023-10-04 11:16:50,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:16:50,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:50,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:16:55,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:16:59,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 11:17:02,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:17:07,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:17:07,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:17:08,552 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. 
Duration: 0.958 2023-10-04 11:17:08,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 11:17:08,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1635253.3333333333, ans=0.125 2023-10-04 11:17:14,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:17:15,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:17:15,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:17:21,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:21,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:17:23,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 11:17:23,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:17:25,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 11:17:27,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:17:27,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:17:28,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1635320.0, ans=0.5 2023-10-04 11:17:29,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:17:29,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:17:32,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1635320.0, ans=0.125 2023-10-04 11:17:34,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 11:17:34,981 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 11:17:35,080 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:17:35,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 11:17:37,680 INFO [train.py:1046] (1/4) Epoch 47, batch 950, loss[loss=0.1431, simple_loss=0.2079, pruned_loss=0.03911, over 22697.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.0363, over 4691018.55 frames. ], batch size: 322, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:17:37,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:41,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. 
Duration: 0.68 2023-10-04 11:17:46,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:17:49,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:49,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:51,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:17:54,622 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 11:17:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:58,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:17:59,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:17:59,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:17:59,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 11:18:00,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:18:03,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 11:18:03,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 11:18:04,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:18:08,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:08,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:18:08,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:18:09,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 11:18:11,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 11:18:11,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1635520.0, ans=0.0 2023-10-04 11:18:12,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:18:13,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1635520.0, ans=15.0 2023-10-04 11:18:14,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:18:21,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:18:21,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:18:24,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 11:18:26,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 11:18:26,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:18:26,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:18:26,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:26,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:18:32,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 11:18:33,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:18:35,057 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:18:35,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:35,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. 
Duration: 0.8 2023-10-04 11:18:35,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:18:35,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:18:36,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 11:18:39,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:18:41,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1635653.3333333333, ans=0.2 2023-10-04 11:18:42,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:18:46,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:18:48,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 11:18:48,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 11:18:50,044 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.047e+02 2.206e+02 2.421e+02 3.668e+02, threshold=4.411e+02, percent-clipped=0.0 2023-10-04 11:18:53,444 INFO [train.py:1046] (1/4) Epoch 47, batch 1000, loss[loss=0.1421, simple_loss=0.2266, pruned_loss=0.0288, over 24459.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2337, pruned_loss=0.03642, over 4691377.97 frames. 
], batch size: 63, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:18:53,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:55,587 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.88 vs. limit=22.5 2023-10-04 11:18:57,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 11:18:57,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:03,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:19:04,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 11:19:04,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 11:19:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:09,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:19:10,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:13,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. 
Duration: 0.78 2023-10-04 11:19:16,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 11:19:17,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 11:19:19,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:19:19,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 11:19:22,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 11:19:22,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 11:19:24,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:25,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:32,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:33,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:19:34,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:34,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:19:34,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 11:19:36,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:19:36,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:19:36,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:37,619 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 11:19:40,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 11:19:42,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 11:19:43,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 11:19:45,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:19:47,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.06 vs. limit=6.0 2023-10-04 11:19:50,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:50,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 11:19:50,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:51,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:19:53,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 11:19:54,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:19:56,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 11:19:58,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 11:19:58,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:19:58,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:20:00,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:20:03,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:20:05,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:20:07,908 INFO [train.py:1046] (1/4) Epoch 47, batch 1050, loss[loss=0.1492, simple_loss=0.2383, pruned_loss=0.03006, over 24309.00 frames. 
], tot_loss[loss=0.1525, simple_loss=0.2326, pruned_loss=0.03624, over 4694892.24 frames. ], batch size: 74, lr: 2.17e-03, grad_scale: 4.0 2023-10-04 11:20:09,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:20:10,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:20:13,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:20:13,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:20:15,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:20:18,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:20:18,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1636053.3333333333, ans=0.05 2023-10-04 11:20:19,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:20:22,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:20:23,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:20:23,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 11:20:24,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:20:25,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 11:20:27,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:20:28,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 11:20:30,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:20:30,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 11:20:30,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:20:37,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:20:37,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:20:37,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:20:37,892 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:20:39,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. 
Duration: 0.81 2023-10-04 11:20:39,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 11:20:39,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:20:42,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 11:20:42,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=1636186.6666666667, ans=0.02 2023-10-04 11:20:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 11:20:46,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1636186.6666666667, ans=0.05 2023-10-04 11:20:47,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:20:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:20:52,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:20:52,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:20:52,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 11:20:53,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1636253.3333333333, ans=0.125 2023-10-04 11:20:58,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:21:01,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 11:21:01,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 11:21:03,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 11:21:03,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:21:03,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:21:06,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 11:21:08,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:21:09,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1636320.0, ans=0.1 2023-10-04 11:21:11,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:21:11,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 11:21:11,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:21:11,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:21:16,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:21:16,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 11:21:16,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:21:16,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 11:21:17,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 11:21:17,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:21:20,775 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.061e+02 2.280e+02 2.915e+02 5.230e+02, threshold=4.560e+02, percent-clipped=2.0 2023-10-04 11:21:22,025 INFO [train.py:1046] (1/4) Epoch 47, batch 1100, loss[loss=0.1561, simple_loss=0.2338, pruned_loss=0.03919, over 23889.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2327, pruned_loss=0.0362, over 4707513.45 frames. ], batch size: 212, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:21:22,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:21:26,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:21:29,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:21:32,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:21:32,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:21:32,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 11:21:33,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:21:36,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:21:39,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:21:42,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:21:42,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 11:21:44,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:21:45,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:21:45,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:21:45,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1636453.3333333333, ans=0.0 2023-10-04 11:21:48,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:21:49,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:21:54,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:21:56,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=12.0 2023-10-04 11:21:57,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 11:21:58,541 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 11:21:58,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:00,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:03,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:22:03,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:22:05,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 11:22:05,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1636586.6666666667, ans=0.0 2023-10-04 11:22:06,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:22:06,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:22:06,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:22:07,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:07,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 11:22:09,384 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:22:11,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:22:11,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 11:22:12,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1636586.6666666667, ans=0.2 2023-10-04 11:22:14,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 11:22:19,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:22:22,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 11:22:24,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 11:22:24,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:27,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:22:27,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:22:27,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 11:22:27,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1636653.3333333333, ans=0.0 2023-10-04 11:22:29,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:22:29,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:22:30,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 11:22:30,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 11:22:31,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 11:22:33,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:22:33,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:22:34,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:22:36,345 INFO [train.py:1046] (1/4) Epoch 47, batch 1150, loss[loss=0.1765, simple_loss=0.2484, pruned_loss=0.05231, over 23825.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2338, pruned_loss=0.03647, over 4700062.36 frames. ], batch size: 164, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:22:39,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:22:41,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:22:43,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:22:43,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:22:43,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 11:22:45,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 11:22:47,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 11:22:49,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:22:49,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:22:54,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1636786.6666666667, ans=0.125 2023-10-04 11:22:55,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 11:22:59,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:23:02,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:23:02,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:02,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 11:23:02,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:23:02,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:23:07,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-10-04 11:23:07,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:23:10,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:23:19,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:19,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1636920.0, ans=0.125 2023-10-04 11:23:27,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:27,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 11:23:27,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:29,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:29,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1636920.0, ans=0.125 2023-10-04 11:23:34,993 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 11:23:36,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:23:40,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1636986.6666666667, ans=0.1 2023-10-04 11:23:44,115 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 11:23:45,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:23:47,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:23:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:23:48,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.016e+02 2.247e+02 2.696e+02 4.163e+02, threshold=4.494e+02, percent-clipped=0.0 2023-10-04 11:23:48,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:23:50,024 INFO [train.py:1046] (1/4) Epoch 47, batch 1200, loss[loss=0.1484, simple_loss=0.2295, pruned_loss=0.03371, over 23640.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2344, pruned_loss=0.03612, over 4718492.42 frames. ], batch size: 149, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:23:52,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:23:57,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:23:57,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 11:24:00,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:00,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:00,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:24:03,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:24:04,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:24:06,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:24:06,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:24:07,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1637120.0, ans=0.1 2023-10-04 11:24:10,385 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 11:24:11,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 11:24:14,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:24:16,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 11:24:17,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.03 vs. limit=22.5 2023-10-04 11:24:19,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:21,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:24:21,253 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 11:24:22,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:26,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1637186.6666666667, ans=0.125 2023-10-04 11:24:27,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1637186.6666666667, ans=0.125 2023-10-04 11:24:28,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:24:28,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:24:28,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 11:24:31,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:24:34,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. 
Duration: 0.72 2023-10-04 11:24:37,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 11:24:37,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:38,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:24:40,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:24:40,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:24:41,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:41,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:24:41,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:24:42,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 11:24:43,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:24:43,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:24:43,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 11:24:44,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1637253.3333333333, ans=0.0 2023-10-04 11:24:46,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:24:46,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:24:49,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:24:51,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:24:54,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 11:24:54,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1637320.0, ans=0.1 2023-10-04 11:24:58,829 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 11:25:00,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:25:03,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:25:04,353 INFO [train.py:1046] (1/4) Epoch 47, batch 1250, loss[loss=0.1337, simple_loss=0.2177, pruned_loss=0.02486, over 24598.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2347, pruned_loss=0.03625, over 4725693.33 frames. 
], batch size: 60, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:25:04,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:25:06,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:25:07,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 11:25:09,221 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:25:11,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:25:11,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:13,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 11:25:16,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:25:16,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:25:19,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:25:19,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:21,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 11:25:21,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:25:23,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:25:23,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1637453.3333333333, ans=0.125 2023-10-04 11:25:29,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 11:25:29,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:25:29,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:25:32,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:25:32,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:36,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:37,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:25:38,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1637520.0, ans=0.0 2023-10-04 11:25:42,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-10-04 11:25:42,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:25:45,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:25:46,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 11:25:46,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:47,004 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 11:25:48,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:48,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:53,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:56,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:56,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:25:56,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 11:25:56,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. 
Duration: 0.77 2023-10-04 11:25:57,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 11:25:59,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:02,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 11:26:02,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:26:03,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.42 vs. limit=6.0 2023-10-04 11:26:03,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 11:26:03,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:26:06,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 11:26:06,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:26:06,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:26:06,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-10-04 11:26:07,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1637653.3333333333, ans=0.0 2023-10-04 11:26:08,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:26:08,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 11:26:11,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:26:12,473 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.02 vs. limit=15.0 2023-10-04 11:26:12,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:26:14,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:26:16,845 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.071e+02 2.257e+02 2.568e+02 3.764e+02, threshold=4.514e+02, percent-clipped=0.0 2023-10-04 11:26:17,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:26:18,273 INFO [train.py:1046] (1/4) Epoch 47, batch 1300, loss[loss=0.2028, simple_loss=0.2764, pruned_loss=0.06461, over 19448.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2353, pruned_loss=0.03664, over 4720995.04 frames. ], batch size: 388, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:26:20,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:26:20,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 11:26:20,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1637720.0, ans=0.5 2023-10-04 11:26:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:26,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:26:28,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:26:28,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:26:29,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1637720.0, ans=0.125 2023-10-04 11:26:30,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:26:30,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1637720.0, ans=0.0 2023-10-04 11:26:31,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 11:26:34,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:26:35,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 11:26:37,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 11:26:40,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:26:40,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1637786.6666666667, ans=0.125 2023-10-04 11:26:44,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:26:44,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:26:46,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:48,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:26:48,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:26:50,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:26:50,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1637853.3333333333, ans=0.125 2023-10-04 11:26:51,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-10-04 11:26:55,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:26:55,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:26:58,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 11:27:00,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:27:01,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:27:03,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:27:04,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 11:27:04,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:04,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 11:27:05,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:09,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:27:09,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 11:27:13,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 11:27:14,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 11:27:14,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 11:27:18,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1637986.6666666667, ans=0.1 2023-10-04 11:27:19,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:27:20,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1637986.6666666667, ans=0.2 2023-10-04 11:27:22,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 11:27:23,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:27:29,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 11:27:29,523 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:27:31,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.39 vs. limit=15.0 2023-10-04 11:27:32,265 INFO [train.py:1046] (1/4) Epoch 47, batch 1350, loss[loss=0.148, simple_loss=0.2109, pruned_loss=0.04256, over 23573.00 frames. 
], tot_loss[loss=0.1532, simple_loss=0.2339, pruned_loss=0.03628, over 4725044.26 frames. ], batch size: 256, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:27:32,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:27:35,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:27:39,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:27:39,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:27:41,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:27:41,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:27:45,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:27:48,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 11:27:50,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:27:51,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:27:53,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. 
Duration: 0.39 2023-10-04 11:27:54,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:56,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:27:56,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 11:27:57,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 11:27:58,391 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.65 vs. limit=15.0 2023-10-04 11:27:58,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 11:28:00,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:00,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 11:28:00,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1638186.6666666667, ans=0.2 2023-10-04 11:28:10,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:17,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.34 vs. limit=15.0 2023-10-04 11:28:20,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 11:28:20,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:22,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 11:28:25,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:25,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1638253.3333333333, ans=0.0 2023-10-04 11:28:25,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1638253.3333333333, ans=0.125 2023-10-04 11:28:26,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 11:28:26,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:28:26,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:28:29,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:28:30,488 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.45 vs. limit=15.0 2023-10-04 11:28:31,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.40 vs. 
limit=15.0 2023-10-04 11:28:33,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 11:28:33,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:28:35,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 11:28:39,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 11:28:44,664 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 2.056e+02 2.395e+02 2.955e+02 4.262e+02, threshold=4.789e+02, percent-clipped=0.0 2023-10-04 11:28:44,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 11:28:44,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:45,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1638386.6666666667, ans=0.1 2023-10-04 11:28:46,166 INFO [train.py:1046] (1/4) Epoch 47, batch 1400, loss[loss=0.1578, simple_loss=0.2463, pruned_loss=0.03464, over 24357.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2327, pruned_loss=0.03618, over 4707275.96 frames. ], batch size: 77, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:28:49,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:28:51,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 11:28:56,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 11:28:58,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 11:28:58,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1638386.6666666667, ans=0.2 2023-10-04 11:29:08,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:29:10,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:29:13,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:29:13,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:29:17,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:29:17,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 11:29:25,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:26,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:27,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.10 vs. 
limit=12.0 2023-10-04 11:29:31,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 11:29:31,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:29:31,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.11 vs. limit=15.0 2023-10-04 11:29:32,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:29:33,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:29:35,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:29:35,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:29:36,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:29:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:29:36,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 11:29:38,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:29:39,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:29:42,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1638586.6666666667, ans=0.125 2023-10-04 11:29:44,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:29:53,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 11:29:53,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:29:53,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:29:54,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1638653.3333333333, ans=0.125 2023-10-04 11:29:55,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 11:29:56,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:29:58,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:30:01,009 INFO [train.py:1046] (1/4) Epoch 47, batch 1450, loss[loss=0.1617, simple_loss=0.2328, pruned_loss=0.04535, over 22675.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.03602, over 4721245.46 frames. ], batch size: 322, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:30:01,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. 
Duration: 0.82225 2023-10-04 11:30:02,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:30:02,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1638720.0, ans=0.0 2023-10-04 11:30:03,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:03,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 11:30:04,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1638720.0, ans=0.125 2023-10-04 11:30:05,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1638720.0, ans=0.0 2023-10-04 11:30:08,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:30:09,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:30:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:30:11,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 11:30:12,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:30:14,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. 
Duration: 0.9 2023-10-04 11:30:14,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:14,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1638786.6666666667, ans=0.125 2023-10-04 11:30:15,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:15,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 11:30:17,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:30:18,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:30:18,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 11:30:18,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:19,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:30:21,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:25,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:30,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 11:30:30,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:30:31,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:30:31,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:34,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:34,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:30:35,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:35,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:30:38,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 11:30:40,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:30:41,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1638853.3333333333, ans=0.125 2023-10-04 11:30:45,131 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 11:30:46,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:30:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:30:49,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:30:50,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 11:30:53,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:30:53,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1638920.0, ans=0.2 2023-10-04 11:30:55,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 11:30:55,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 11:30:58,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:30:59,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:31:01,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:31:02,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 11:31:03,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. 
Duration: 0.98 2023-10-04 11:31:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 11:31:06,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:08,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:31:12,107 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.70 vs. limit=10.0 2023-10-04 11:31:13,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.05 vs. limit=22.5 2023-10-04 11:31:14,655 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 2.021e+02 2.399e+02 2.893e+02 4.968e+02, threshold=4.797e+02, percent-clipped=1.0 2023-10-04 11:31:14,682 INFO [train.py:1046] (1/4) Epoch 47, batch 1500, loss[loss=0.155, simple_loss=0.2407, pruned_loss=0.03465, over 24460.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2334, pruned_loss=0.03641, over 4727729.69 frames. ], batch size: 66, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:31:17,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1639053.3333333333, ans=0.125 2023-10-04 11:31:18,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 11:31:18,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 11:31:18,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:31:19,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:31:21,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:31:22,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:31:24,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 11:31:24,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:31:24,677 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:31:26,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:31:26,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:31:26,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:31:29,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:31:29,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:31:29,984 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-10-04 11:31:30,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1639120.0, ans=0.125 2023-10-04 11:31:34,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:31:34,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 11:31:35,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:31:35,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:31:37,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:41,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 11:31:46,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 11:31:46,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:31:48,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 11:31:50,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 11:31:50,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:31:52,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:52,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:31:53,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 11:31:53,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:31:53,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:31:53,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 11:31:53,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:31:54,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1639186.6666666667, ans=0.0 2023-10-04 11:32:01,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:32:01,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 11:32:05,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 11:32:05,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:32:09,638 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 11:32:10,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:10,375 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 11:32:11,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:11,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:32:13,701 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 11:32:15,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:32:17,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 11:32:19,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:20,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:32:21,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:32:21,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:32:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:23,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:32:24,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 11:32:24,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 11:32:24,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:32:26,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 11:32:28,092 INFO [train.py:1046] (1/4) Epoch 47, batch 1550, loss[loss=0.1382, simple_loss=0.2258, pruned_loss=0.02532, over 24459.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.234, pruned_loss=0.03657, over 4719413.42 frames. ], batch size: 63, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:32:28,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 11:32:30,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:32:32,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 11:32:32,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:32:32,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:32:33,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:35,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:39,076 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 11:32:39,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:39,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:32:40,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:32:41,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1639453.3333333333, ans=0.125 2023-10-04 11:32:42,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:32:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 11:32:44,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:32:44,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 11:32:44,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=1639453.3333333333, ans=0.05 2023-10-04 11:32:46,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1639453.3333333333, ans=0.125 2023-10-04 11:32:47,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 11:32:47,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 11:32:47,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:47,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:32:51,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:32:54,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 11:32:54,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 11:32:59,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1639520.0, ans=0.1 2023-10-04 11:33:03,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 11:33:05,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:33:05,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:33:06,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:33:07,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 11:33:13,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:33:15,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:17,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:33:20,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:33:20,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:33:20,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 11:33:20,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:33:23,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 11:33:23,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:23,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 11:33:23,537 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 11:33:26,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:33:32,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 11:33:32,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1639653.3333333333, ans=0.1 2023-10-04 11:33:35,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:33:35,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:36,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.33 vs. limit=12.0 2023-10-04 11:33:37,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 11:33:38,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:33:38,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 11:33:38,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:33:39,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:33:40,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:33:41,232 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.799e+02 2.051e+02 2.234e+02 2.804e+02 4.353e+02, threshold=4.467e+02, percent-clipped=0.0 2023-10-04 11:33:41,259 INFO [train.py:1046] (1/4) Epoch 47, batch 1600, loss[loss=0.168, simple_loss=0.251, pruned_loss=0.04251, over 24025.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2345, pruned_loss=0.03657, over 4724872.88 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:33:44,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:33:45,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 11:33:45,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 11:33:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 11:33:50,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:33:52,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. 
Duration: 0.99 2023-10-04 11:33:53,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:33:55,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1639786.6666666667, ans=0.1 2023-10-04 11:33:56,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:33:59,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:33:59,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1639786.6666666667, ans=10.0 2023-10-04 11:34:04,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 11:34:04,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1639786.6666666667, ans=0.125 2023-10-04 11:34:06,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:34:08,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 11:34:08,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:08,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 11:34:13,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-04 11:34:21,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:34:22,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 11:34:22,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:34:22,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:34:22,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:34:25,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 11:34:29,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 11:34:29,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:34:31,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:32,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:32,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:34:36,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 11:34:38,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:34:38,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:34:41,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1639986.6666666667, ans=0.125 2023-10-04 11:34:44,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:46,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:34:47,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 11:34:47,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:34:49,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 11:34:54,407 INFO [train.py:1046] (1/4) Epoch 47, batch 1650, loss[loss=0.1678, simple_loss=0.2502, pruned_loss=0.04268, over 23911.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2351, pruned_loss=0.0369, over 4716935.80 frames. ], batch size: 86, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:34:54,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:34:54,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1640053.3333333333, ans=0.0 2023-10-04 11:34:55,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:34:57,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:34:57,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 11:34:57,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 11:34:57,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 11:34:57,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 11:35:02,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:35:02,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:35:04,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:04,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:35:08,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:35:10,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 11:35:12,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:35:12,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:35:12,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:35:12,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:35:13,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 11:35:13,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 11:35:19,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:35:20,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:35:24,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640186.6666666667, ans=0.1 2023-10-04 11:35:25,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640186.6666666667, ans=0.1 2023-10-04 11:35:27,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. 
Duration: 0.93 2023-10-04 11:35:28,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:30,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1640186.6666666667, ans=0.1 2023-10-04 11:35:31,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 11:35:31,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640186.6666666667, ans=0.1 2023-10-04 11:35:33,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:35:36,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:35:36,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:35:37,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:35:39,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:35:39,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:43,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:35:44,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:44,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:35:44,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:35:44,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:46,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:35:49,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:35:50,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 11:35:52,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:35:52,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 11:35:53,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 11:35:55,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 11:35:55,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:35:55,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:35:56,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:35:56,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:56,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 11:35:59,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:36:01,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:36:01,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:36:03,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 11:36:07,885 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.048e+02 2.297e+02 2.769e+02 5.278e+02, threshold=4.594e+02, percent-clipped=2.0 2023-10-04 11:36:07,912 INFO [train.py:1046] (1/4) Epoch 47, batch 1700, loss[loss=0.1449, simple_loss=0.2261, pruned_loss=0.03187, over 24580.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2343, pruned_loss=0.03701, over 4709737.51 frames. ], batch size: 60, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:36:08,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:36:08,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:36:09,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 11:36:11,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:36:11,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:36:11,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:36:11,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640386.6666666667, ans=0.1 2023-10-04 11:36:12,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:36:12,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:36:12,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 11:36:14,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1640386.6666666667, ans=0.0 2023-10-04 11:36:15,531 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 11:36:18,138 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.58 vs. limit=15.0 2023-10-04 11:36:24,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:36:26,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:36:27,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1640453.3333333333, ans=0.0 2023-10-04 11:36:31,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:36:31,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:36:33,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:36:33,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:36:36,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 11:36:39,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:36:39,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:36:40,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:36:41,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1640520.0, ans=0.125 2023-10-04 11:36:42,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:36:43,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.82 vs. limit=15.0 2023-10-04 11:36:43,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 11:36:45,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 11:36:46,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:36:46,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 11:36:48,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:36:57,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:36:57,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:36:58,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 11:37:00,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:37:00,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 11:37:00,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:37:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:01,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 11:37:03,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:37:03,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:03,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:03,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:04,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:04,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:37:06,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 11:37:07,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:37:07,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:37:11,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:37:12,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 11:37:15,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:37:17,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:37:18,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 11:37:22,924 INFO [train.py:1046] (1/4) Epoch 47, batch 1750, loss[loss=0.168, simple_loss=0.2362, pruned_loss=0.04988, over 23773.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2329, pruned_loss=0.03668, over 4706396.84 frames. ], batch size: 164, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:37:24,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:25,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:27,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 11:37:28,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 11:37:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:30,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1640720.0, ans=0.125 2023-10-04 11:37:31,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:37:31,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:33,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640720.0, ans=0.1 2023-10-04 11:37:34,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1640720.0, ans=0.1 2023-10-04 11:37:36,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 11:37:37,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:40,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 11:37:40,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:42,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 11:37:44,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:37:46,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 11:37:48,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:37:49,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 11:37:52,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1640853.3333333333, ans=0.125 2023-10-04 11:37:57,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:37:59,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:00,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:38:03,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:03,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:38:04,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:38:06,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 11:38:08,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:38:09,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:38:09,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 11:38:11,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:38:14,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 11:38:15,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:38:17,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:38:17,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:38:21,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:38:22,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:38:22,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 11:38:22,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1640986.6666666667, ans=0.0 2023-10-04 11:38:23,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:38:26,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:38:29,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:38:31,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:38:31,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 11:38:31,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:32,394 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-10-04 11:38:33,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:38:33,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:33,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 11:38:34,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:38:36,290 INFO [train.py:1046] (1/4) Epoch 47, batch 1800, loss[loss=0.1485, simple_loss=0.2384, pruned_loss=0.02927, over 24680.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2323, pruned_loss=0.03631, over 4699953.00 frames. ], batch size: 65, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:38:36,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:38:37,620 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.081e+02 2.379e+02 2.770e+02 6.213e+02, threshold=4.757e+02, percent-clipped=1.0 2023-10-04 11:38:39,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:38:39,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:42,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:38:44,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:45,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1641053.3333333333, ans=0.1 2023-10-04 11:38:47,336 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.93 vs. 
limit=15.0 2023-10-04 11:38:48,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:38:50,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:38:53,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:38:54,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:56,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:57,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:38:58,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:38:58,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 11:39:00,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:01,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:04,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1641186.6666666667, ans=0.125 2023-10-04 11:39:05,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. 
Duration: 0.83 2023-10-04 11:39:07,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 11:39:07,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 11:39:09,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:09,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:39:09,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:39:10,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:39:16,994 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 11:39:18,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:39:21,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:23,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 11:39:23,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 11:39:24,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 11:39:24,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:39:26,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:39:30,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 11:39:34,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:39:36,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 11:39:37,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:39:37,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:37,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:39:37,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 11:39:40,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:39:40,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:39:43,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-10-04 11:39:43,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:43,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1641320.0, ans=0.125 2023-10-04 11:39:45,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:39:45,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:39:45,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:48,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:48,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:39:50,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:39:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:39:51,557 INFO [train.py:1046] (1/4) Epoch 47, batch 1850, loss[loss=0.1459, simple_loss=0.2274, pruned_loss=0.03222, over 23681.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2323, pruned_loss=0.03623, over 4693282.89 frames. ], batch size: 149, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:39:53,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 11:39:53,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:39:58,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:39:58,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 11:40:03,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 11:40:06,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 11:40:10,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:40:10,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 11:40:10,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 11:40:21,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:40:23,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 11:40:27,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:40:27,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:40:30,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 11:40:31,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:40:31,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:40:33,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:40:35,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:40:39,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:40:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:40:40,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:40:40,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:40:40,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:40:43,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:40:43,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1641586.6666666667, ans=0.0 2023-10-04 11:40:44,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:40:46,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 11:40:47,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:40:52,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:40:52,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:40:52,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 11:40:52,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 11:40:54,422 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 11:40:55,922 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 11:40:59,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:40:59,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:40:59,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:40:59,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:01,284 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 11:41:01,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:41:01,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:04,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:41:04,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:41:05,481 INFO [train.py:1046] (1/4) Epoch 47, batch 1900, loss[loss=0.1514, simple_loss=0.241, pruned_loss=0.03089, over 24685.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2338, pruned_loss=0.03675, over 4699923.93 frames. ], batch size: 73, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:41:05,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:41:06,791 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.127e+02 2.369e+02 2.756e+02 3.360e+02, threshold=4.738e+02, percent-clipped=0.0 2023-10-04 11:41:06,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. 
Duration: 0.99 2023-10-04 11:41:08,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:10,249 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 11:41:10,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:41:10,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1641720.0, ans=0.0 2023-10-04 11:41:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:41:13,607 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.54 vs. limit=12.0 2023-10-04 11:41:15,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:41:17,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:41:18,600 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 11:41:18,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 11:41:21,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:41:21,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 11:41:21,426 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 11:41:22,659 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 11:41:22,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1641786.6666666667, ans=0.0 2023-10-04 11:41:27,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 11:41:29,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:41:31,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 11:41:33,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 11:41:40,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.49 vs. limit=22.5 2023-10-04 11:41:43,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 11:41:46,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 11:41:46,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:47,537 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 11:41:47,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-10-04 11:41:47,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 11:41:47,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 11:41:47,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:41:51,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 11:41:54,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:41:57,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:41:57,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 11:41:59,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:42:03,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 11:42:03,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 11:42:05,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1641986.6666666667, ans=0.125 2023-10-04 11:42:05,658 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.642e-02 2023-10-04 11:42:09,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:42:09,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:42:09,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:42:10,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:42:12,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:42:12,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1641986.6666666667, ans=0.0 2023-10-04 11:42:13,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:42:13,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:42:15,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:42:15,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 11:42:18,181 INFO [train.py:1046] (1/4) Epoch 47, batch 1950, loss[loss=0.1363, simple_loss=0.2167, pruned_loss=0.02792, over 24234.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03666, over 4713954.12 frames. ], batch size: 56, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:42:18,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:42:18,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:42:18,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:42:19,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:42:22,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:42:26,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:42:26,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:26,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:42:27,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 11:42:28,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-04 11:42:28,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:30,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:34,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:42:34,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:42:34,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:36,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1642120.0, ans=0.1 2023-10-04 11:42:37,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:42:39,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:42:40,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:42:40,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:42:40,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:43,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 11:42:46,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:42:46,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:42:46,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:42:46,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 11:42:47,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:42:47,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:42:47,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:52,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:54,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:42:58,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:43:01,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:43:01,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 11:43:01,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 11:43:03,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:43:06,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:43:07,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:43:07,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:43:16,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:18,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:21,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:23,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:43:24,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:43:24,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:43:26,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. 
Duration: 0.86 2023-10-04 11:43:26,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:43:26,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:43:28,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 11:43:30,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:43:33,283 INFO [train.py:1046] (1/4) Epoch 47, batch 2000, loss[loss=0.1585, simple_loss=0.2449, pruned_loss=0.03605, over 23356.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2343, pruned_loss=0.03639, over 4722402.19 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:43:34,756 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.107e+02 2.252e+02 2.646e+02 4.173e+02, threshold=4.505e+02, percent-clipped=0.0 2023-10-04 11:43:36,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:43:37,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:43:37,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:43:38,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:43:40,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 11:43:41,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 11:43:41,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:43:46,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:43:47,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 11:43:49,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:43:49,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:43:52,753 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.84 vs. limit=15.0 2023-10-04 11:43:53,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:43:53,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 11:43:54,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:56,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:56,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:43:58,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 11:43:58,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:43:59,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 11:43:59,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:44:01,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1642520.0, ans=0.5 2023-10-04 11:44:04,727 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:04,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:44:04,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:05,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:07,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:44:07,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 11:44:11,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. 
Duration: 0.97 2023-10-04 11:44:11,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:44:11,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:17,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:17,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:44:17,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:44:18,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:44:20,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:20,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:21,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:44:21,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:24,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:25,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 11:44:25,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 11:44:32,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:44:33,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1642653.3333333333, ans=0.125 2023-10-04 11:44:34,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:37,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:37,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:44:39,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1642653.3333333333, ans=0.0 2023-10-04 11:44:40,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:43,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:43,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:44,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 11:44:44,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:44:47,638 INFO [train.py:1046] (1/4) Epoch 47, batch 2050, loss[loss=0.1445, simple_loss=0.2192, pruned_loss=0.03488, over 23812.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.234, pruned_loss=0.03685, over 4704779.79 frames. ], batch size: 212, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:44:47,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:49,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:50,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:51,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:56,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:59,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:45:00,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:45:02,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:45:02,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-10-04 11:45:02,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:45:04,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:45:04,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:45:04,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1642786.6666666667, ans=0.125 2023-10-04 11:45:13,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:45:13,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:45:14,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 11:45:16,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:45:19,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 11:45:19,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:45:23,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:45:24,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:45:25,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:45:25,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:45:27,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:45:29,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:45:29,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:45:33,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:45:35,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:45:38,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:45:39,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:45:44,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:45:47,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:45:48,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-10-04 11:45:53,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:45:53,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:45:56,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:45:57,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 11:45:59,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1642986.6666666667, ans=0.2 2023-10-04 11:46:00,909 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 11:46:00,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:02,624 INFO [train.py:1046] (1/4) Epoch 47, batch 2100, loss[loss=0.1335, simple_loss=0.2189, pruned_loss=0.02404, over 24436.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2333, pruned_loss=0.03629, over 4709608.60 frames. ], batch size: 58, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:46:02,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:46:02,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 11:46:05,229 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.063e+02 2.330e+02 2.672e+02 3.956e+02, threshold=4.660e+02, percent-clipped=0.0 2023-10-04 11:46:05,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:46:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 11:46:05,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 11:46:06,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.48 vs. limit=22.5 2023-10-04 11:46:06,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:46:10,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:46:11,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:46:14,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:15,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:46:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. 
Duration: 0.85 2023-10-04 11:46:18,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:46:18,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 11:46:18,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 11:46:18,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.78 vs. limit=15.0 2023-10-04 11:46:19,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:21,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:46:21,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 11:46:21,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 11:46:24,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 11:46:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:46:27,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:46:27,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:46:29,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:46:31,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 11:46:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:31,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:46:32,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1643186.6666666667, ans=0.125 2023-10-04 11:46:33,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1643186.6666666667, ans=0.2 2023-10-04 11:46:34,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 11:46:34,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:34,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 11:46:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 11:46:35,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 11:46:37,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 11:46:39,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:46:42,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:46:43,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:46:45,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:46,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:46,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 11:46:47,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:47,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:47,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:48,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 11:46:50,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 11:46:52,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. 
Duration: 0.8 2023-10-04 11:46:54,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:46:58,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:46:59,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 11:47:03,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:06,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:47:06,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:47:06,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:47:08,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 11:47:08,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:47:10,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:10,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:47:11,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 11:47:12,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:14,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 11:47:15,700 INFO [train.py:1046] (1/4) Epoch 47, batch 2150, loss[loss=0.1585, simple_loss=0.2317, pruned_loss=0.04265, over 23777.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2324, pruned_loss=0.03602, over 4717975.98 frames. ], batch size: 195, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:47:15,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 11:47:15,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:18,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:47:18,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:47:18,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:47:18,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:47:18,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1643386.6666666667, ans=0.0 2023-10-04 11:47:23,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-04 11:47:24,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:26,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:28,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:47:28,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:28,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:47:30,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:32,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:47:32,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:47:35,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:36,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 11:47:39,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:41,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 11:47:43,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:43,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:47:43,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:43,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:47:44,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:44,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:47:44,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:46,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 11:47:48,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:47:50,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:50,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:52,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 11:47:53,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:47:55,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1643520.0, ans=0.2 2023-10-04 11:47:56,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:56,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:47:58,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:58,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 11:47:58,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:48:02,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:48:03,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:05,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:48:05,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:48:06,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:48:08,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:08,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 11:48:09,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 11:48:09,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:48:09,546 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 11:48:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:10,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:48:12,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 11:48:12,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:48:12,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 11:48:12,891 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 11:48:12,891 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 11:48:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. 
Duration: 0.61 2023-10-04 11:48:14,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:15,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:48:16,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:48:17,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:17,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:48:18,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:18,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:27,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:48:28,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 11:48:30,051 INFO [train.py:1046] (1/4) Epoch 47, batch 2200, loss[loss=0.1582, simple_loss=0.239, pruned_loss=0.03865, over 23338.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2331, pruned_loss=0.03603, over 4728814.87 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:48:31,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 11:48:34,221 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.021e+02 2.246e+02 2.617e+02 4.385e+02, threshold=4.493e+02, percent-clipped=0.0 2023-10-04 11:48:36,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:36,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:48:36,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:48:37,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:48:39,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1643720.0, ans=0.0 2023-10-04 11:48:40,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:40,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:48:40,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 11:48:46,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 11:48:47,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:48:54,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. 
Duration: 0.77 2023-10-04 11:48:55,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:55,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:48:56,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:49:00,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:49:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 11:49:05,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:49:05,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:07,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 11:49:10,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:49:13,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:49:14,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:49:16,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:49:16,780 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=15.0 2023-10-04 11:49:20,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 11:49:20,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:21,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 11:49:24,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:24,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:49:24,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:26,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:49:26,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:49:26,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:26,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:27,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 11:49:27,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:49:29,264 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:49:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:49:33,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:49:34,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:49:36,730 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 11:49:39,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:49:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 11:49:40,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:49:42,110 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 11:49:43,436 INFO [train.py:1046] (1/4) Epoch 47, batch 2250, loss[loss=0.1453, simple_loss=0.2263, pruned_loss=0.03219, over 23455.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03633, over 4726145.20 frames. ], batch size: 285, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:49:43,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:49:43,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:49:45,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:47,018 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 11:49:48,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:49:49,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:49:55,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:49:56,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1644053.3333333333, ans=0.0 2023-10-04 11:49:57,188 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:49:59,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:00,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:50:01,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:50:05,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. 
Duration: 0.93 2023-10-04 11:50:05,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:50:05,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:50:06,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 11:50:06,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:50:06,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:08,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:50:12,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.34 vs. limit=22.5 2023-10-04 11:50:14,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:50:16,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 11:50:17,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:50:17,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1644186.6666666667, ans=0.1 2023-10-04 11:50:18,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. 
Duration: 0.98 2023-10-04 11:50:20,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:22,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:50:27,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:50:28,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1644253.3333333333, ans=0.125 2023-10-04 11:50:30,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:50:30,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1644253.3333333333, ans=0.1 2023-10-04 11:50:31,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:50:31,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:50:34,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:50:35,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:50:38,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 11:50:40,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:50:43,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1644320.0, ans=0.0 2023-10-04 11:50:45,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:50:45,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:50:45,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1644320.0, ans=0.125 2023-10-04 11:50:46,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:50:52,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 11:50:54,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:50:54,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 11:50:54,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:50:54,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:50:57,184 INFO [train.py:1046] (1/4) Epoch 47, batch 2300, loss[loss=0.1426, simple_loss=0.2306, pruned_loss=0.02731, over 24460.00 frames. 
], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03634, over 4733967.14 frames. ], batch size: 66, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:50:57,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 11:51:00,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:51:00,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:01,759 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.249e+02 2.496e+02 2.917e+02 4.902e+02, threshold=4.992e+02, percent-clipped=2.0 2023-10-04 11:51:06,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1644386.6666666667, ans=0.1 2023-10-04 11:51:07,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:07,431 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:51:09,536 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 11:51:10,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:10,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1644453.3333333333, ans=0.05 2023-10-04 11:51:16,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 11:51:16,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:51:16,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:18,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:18,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 11:51:19,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:51:21,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:51:21,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:51:25,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:51:28,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:51:30,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:51:35,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:51:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:51:39,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:51:41,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:44,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:51:46,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:51:46,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:51:47,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 11:51:50,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 11:51:50,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:50,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:51:50,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:51:50,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:51:52,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. 
Duration: 0.4545625 2023-10-04 11:51:52,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:51:52,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 11:51:52,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:51:52,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:53,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 11:51:53,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1644586.6666666667, ans=0.125 2023-10-04 11:51:59,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:52:02,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1644653.3333333333, ans=0.125 2023-10-04 11:52:05,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:52:07,940 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:52:09,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 11:52:09,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:52:10,661 INFO [train.py:1046] (1/4) Epoch 47, batch 2350, loss[loss=0.1311, simple_loss=0.2106, pruned_loss=0.02584, over 24460.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2353, pruned_loss=0.03667, over 4729253.25 frames. ], batch size: 58, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:52:10,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:52:10,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:52:12,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:52:12,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 11:52:17,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:52:18,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 11:52:23,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 11:52:24,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 11:52:26,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1644786.6666666667, ans=0.0 2023-10-04 11:52:26,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.10 vs. limit=15.0 2023-10-04 11:52:28,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:28,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:28,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:52:28,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:52:30,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 11:52:33,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:52:38,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 11:52:40,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:52:42,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 11:52:42,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:52:43,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1644853.3333333333, ans=0.0 2023-10-04 11:52:45,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:52:45,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 11:52:47,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:52:50,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:52:50,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:52:50,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:52:51,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:52:53,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 11:52:54,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:52:56,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:52:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:52:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 11:52:59,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:53:01,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 11:53:01,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:53:06,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 11:53:11,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 11:53:13,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:53:13,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:53:13,178 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 11:53:13,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 11:53:15,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-10-04 11:53:18,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:53:20,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1644986.6666666667, ans=0.0 2023-10-04 11:53:22,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:53:23,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:53:23,830 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:53:23,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1644986.6666666667, ans=0.125 2023-10-04 11:53:25,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:53:25,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 11:53:25,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 11:53:26,466 INFO [train.py:1046] (1/4) Epoch 47, batch 2400, loss[loss=0.1433, simple_loss=0.2006, pruned_loss=0.04302, over 19144.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03659, over 4727981.12 frames. 
], batch size: 388, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:53:30,967 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.104e+02 2.348e+02 2.668e+02 4.204e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-04 11:53:32,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:53:32,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:53:35,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 11:53:35,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:53:36,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:53:38,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 11:53:43,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:53:46,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 11:53:50,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 11:53:50,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1645120.0, ans=0.0 2023-10-04 11:53:52,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1645120.0, ans=0.0 2023-10-04 11:53:55,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 11:53:55,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1645186.6666666667, ans=0.025 2023-10-04 11:53:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:53:59,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:05,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:54:05,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 11:54:06,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:54:15,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:17,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:54:19,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 11:54:21,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:54:21,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:54:21,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:54:21,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:21,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1645253.3333333333, ans=0.0 2023-10-04 11:54:22,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:54:22,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:54:27,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:54:27,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:54:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 11:54:29,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 11:54:30,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 11:54:30,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:31,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 11:54:33,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 11:54:33,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 11:54:33,211 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 11:54:35,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 11:54:36,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:54:37,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:37,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:54:39,220 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 11:54:39,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:40,510 INFO [train.py:1046] (1/4) Epoch 47, batch 2450, loss[loss=0.1496, simple_loss=0.2308, pruned_loss=0.03416, over 23319.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2329, pruned_loss=0.03616, over 4717161.55 frames. 
], batch size: 105, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:54:40,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:54:43,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:54:43,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:54:46,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:46,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:54:47,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1645386.6666666667, ans=0.1 2023-10-04 11:54:48,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 11:54:53,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:54:53,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:58,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:54:58,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 11:54:58,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:54:58,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 11:55:01,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:55:04,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:55:04,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:55:08,188 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-10-04 11:55:10,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:55:10,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:11,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:11,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:55:13,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 11:55:14,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 11:55:22,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:23,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:55:24,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:55:25,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:55:25,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:26,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:55:26,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 11:55:29,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:31,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:55:34,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:55:34,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:55:39,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 11:55:39,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 11:55:40,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:55:41,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:55:41,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 11:55:43,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:55:43,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:55:47,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:55:51,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:52,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:55:54,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 11:55:55,343 INFO [train.py:1046] (1/4) Epoch 47, batch 2500, loss[loss=0.1594, simple_loss=0.2495, pruned_loss=0.03465, over 24658.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2319, pruned_loss=0.03578, over 4718777.80 frames. 
], batch size: 73, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:55:55,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:55:59,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:56:01,445 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.014e+02 2.322e+02 2.787e+02 4.599e+02, threshold=4.643e+02, percent-clipped=0.0 2023-10-04 11:56:09,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:56:09,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:56:10,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:56:10,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 11:56:16,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:56:17,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:56:19,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:56:19,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 11:56:19,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 11:56:21,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:21,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:56:23,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 11:56:23,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:23,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 11:56:23,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:27,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:56:27,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:56:32,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:56:32,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 11:56:32,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:56:33,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:37,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:41,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:44,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:56:44,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1645920.0, ans=0.1 2023-10-04 11:56:50,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:56:54,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 11:56:54,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:56:54,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:56:56,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:56:56,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:56:57,328 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. 
Duration: 0.896 2023-10-04 11:56:57,329 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 11:56:57,342 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 11:57:01,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:01,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 11:57:01,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 11:57:02,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:57:02,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 11:57:07,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 11:57:08,884 INFO [train.py:1046] (1/4) Epoch 47, batch 2550, loss[loss=0.1588, simple_loss=0.2365, pruned_loss=0.04061, over 23781.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2324, pruned_loss=0.03613, over 4715793.67 frames. ], batch size: 212, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:57:10,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 11:57:10,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1646053.3333333333, ans=0.5 2023-10-04 11:57:12,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:57:13,582 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:57:16,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:57:16,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 11:57:17,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:57:21,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1646053.3333333333, ans=0.125 2023-10-04 11:57:22,491 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 11:57:23,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:57:24,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1646120.0, ans=0.1 2023-10-04 11:57:25,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 11:57:25,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1646120.0, ans=0.05 2023-10-04 11:57:28,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:57:28,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 11:57:28,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:57:29,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:57:29,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:30,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:57:31,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1646120.0, ans=0.1 2023-10-04 11:57:32,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 11:57:32,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:57:32,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:32,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-04 11:57:42,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:57:43,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.09 vs. limit=15.0 2023-10-04 11:57:46,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:57:46,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:46,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:57:47,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:57:53,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:56,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:57:56,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:57:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:57:57,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 11:57:57,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:58:00,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:58:00,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:58:03,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1646253.3333333333, ans=0.0 2023-10-04 11:58:05,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:58:05,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 11:58:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:58:06,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:58:06,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:58:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:58:11,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:17,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 11:58:20,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:20,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1646320.0, ans=0.125 2023-10-04 11:58:23,342 INFO [train.py:1046] (1/4) Epoch 47, batch 2600, loss[loss=0.144, simple_loss=0.2224, pruned_loss=0.0328, over 24425.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2333, pruned_loss=0.03658, over 4700330.95 frames. ], batch size: 58, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:58:23,389 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 11:58:23,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1646386.6666666667, ans=0.1 2023-10-04 11:58:26,111 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 11:58:26,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:58:26,193 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 11:58:27,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 11:58:27,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. 
Duration: 0.768 2023-10-04 11:58:28,871 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.029e+02 2.282e+02 2.767e+02 4.631e+02, threshold=4.564e+02, percent-clipped=0.0 2023-10-04 11:58:29,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1646386.6666666667, ans=0.125 2023-10-04 11:58:30,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:58:30,391 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 11:58:31,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 11:58:33,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 11:58:34,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:58:35,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 11:58:37,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 11:58:39,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:58:39,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 11:58:42,455 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-10-04 11:58:42,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 11:58:49,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:58:49,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:49,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:58:49,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 11:58:51,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1646520.0, ans=0.0 2023-10-04 11:58:52,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:58:55,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1646520.0, ans=0.1 2023-10-04 11:58:55,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1646520.0, ans=0.0 2023-10-04 11:58:56,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 11:59:02,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:59:03,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:59:04,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.63 vs. limit=15.0 2023-10-04 11:59:04,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 11:59:05,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:59:05,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:59:05,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 11:59:05,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1646586.6666666667, ans=0.125 2023-10-04 11:59:08,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:59:08,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:59:10,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:13,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1646586.6666666667, ans=0.0 2023-10-04 11:59:14,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 11:59:14,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 11:59:14,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:59:21,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:59:21,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1646653.3333333333, ans=0.125 2023-10-04 11:59:22,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:59:22,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 11:59:22,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:59:22,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1646653.3333333333, ans=0.1 2023-10-04 11:59:25,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:59:25,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:59:31,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 11:59:32,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:59:32,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1646653.3333333333, ans=0.1 2023-10-04 11:59:32,712 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:59:35,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:59:36,501 INFO [train.py:1046] (1/4) Epoch 47, batch 2650, loss[loss=0.1688, simple_loss=0.241, pruned_loss=0.04832, over 23753.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2341, pruned_loss=0.03682, over 4705013.61 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:59:39,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 11:59:39,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:40,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:59:42,093 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 11:59:42,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:59:44,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 11:59:44,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1646720.0, ans=0.125 2023-10-04 11:59:45,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:59:47,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:59:51,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:52,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 11:59:52,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:59:52,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:59:55,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 11:59:56,566 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 11:59:59,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:00,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 12:00:00,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:00:02,026 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 12:00:06,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:06,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:00:06,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:06,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:09,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 12:00:10,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 12:00:13,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:00:15,716 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.00 vs. limit=15.0 2023-10-04 12:00:17,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 12:00:17,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:19,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:00:20,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:00:22,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:00:22,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:23,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:00:25,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:00:26,478 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:00:26,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:00:26,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:00:28,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:29,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:00:29,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:30,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:00:31,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:00:33,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:34,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1646920.0, ans=0.2 2023-10-04 12:00:35,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:00:35,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:35,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 12:00:39,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:41,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:41,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:44,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:45,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:00:45,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:00:49,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:00:49,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 12:00:50,827 INFO [train.py:1046] (1/4) Epoch 47, batch 2700, loss[loss=0.1586, simple_loss=0.2359, pruned_loss=0.04067, over 23677.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03712, over 4711239.72 frames. ], batch size: 149, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:00:52,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:00:52,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 12:00:55,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:56,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.063e+02 2.218e+02 2.649e+02 4.383e+02, threshold=4.436e+02, percent-clipped=0.0 2023-10-04 12:00:56,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:56,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:58,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:00:58,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:00:58,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:00:58,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:00:58,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 12:00:59,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:01:01,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:01:01,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:01:02,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:01:06,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:01:08,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 12:01:08,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:01:08,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1647120.0, ans=0.2 2023-10-04 12:01:12,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 12:01:12,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:17,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:01:17,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:01:18,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:01:18,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:01:23,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:01:24,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:01:26,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:01:26,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:01:26,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1647186.6666666667, ans=0.125 2023-10-04 12:01:30,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:01:30,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:01:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:01:40,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:01:42,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:01:42,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:01:45,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:47,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:01:47,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:01:50,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:01:52,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:52,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 12:01:54,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:01:57,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:57,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:57,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1647320.0, ans=0.0 2023-10-04 12:01:58,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 12:02:00,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:02,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:02:02,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 12:02:04,336 INFO [train.py:1046] (1/4) Epoch 47, batch 2750, loss[loss=0.1408, simple_loss=0.2266, pruned_loss=0.02753, over 24650.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2349, pruned_loss=0.03683, over 4709691.87 frames. ], batch size: 65, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:02:04,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 12:02:04,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:02:05,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:05,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:02:07,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:07,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:02:08,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:10,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:02:11,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:02:11,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:02:11,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:11,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 12:02:12,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:02:12,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:02:17,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1647453.3333333333, ans=0.1 2023-10-04 12:02:19,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 12:02:20,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:02:22,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:22,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:02:24,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:02:25,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:02:25,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:02:27,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:27,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:30,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 12:02:31,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:02:32,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:02:32,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1647520.0, ans=0.125 2023-10-04 12:02:34,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:36,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:02:42,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:42,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:02:44,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:02:48,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:48,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:02:48,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:02:56,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 12:02:57,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:02:57,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 12:03:01,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:02,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 12:03:07,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 12:03:09,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:03:11,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 12:03:12,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:03:13,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:03:13,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 12:03:13,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:03:17,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-10-04 12:03:17,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:18,398 INFO [train.py:1046] (1/4) Epoch 47, batch 2800, loss[loss=0.1545, simple_loss=0.2224, pruned_loss=0.0433, over 23707.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03645, over 4716210.95 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:03:18,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:03:19,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 12:03:19,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:19,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:23,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:23,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 12:03:23,628 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. 
Duration: 0.759 2023-10-04 12:03:24,723 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.018e+02 2.201e+02 2.488e+02 4.073e+02, threshold=4.402e+02, percent-clipped=0.0 2023-10-04 12:03:25,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1647720.0, ans=0.125 2023-10-04 12:03:26,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:27,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:03:27,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:03:30,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1647720.0, ans=0.125 2023-10-04 12:03:31,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:03:32,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 12:03:35,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:03:36,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 12:03:38,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:03:38,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:03:38,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:03:43,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:03:43,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:43,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:03:44,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:03:51,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:03:53,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:56,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:57,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:03:57,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:04:02,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:04:02,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 12:04:04,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:05,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:04:05,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:04:08,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:09,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:11,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1647920.0, ans=0.125 2023-10-04 12:04:12,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:04:13,264 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.47 vs. limit=12.0 2023-10-04 12:04:13,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1647920.0, ans=0.1 2023-10-04 12:04:15,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 12:04:15,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:15,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:04:16,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:04:16,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:04:17,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:04:17,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 12:04:17,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:19,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:04:19,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:21,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 12:04:23,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:04:23,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 12:04:25,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:04:26,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 12:04:32,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:04:33,913 INFO [train.py:1046] (1/4) Epoch 47, batch 2850, loss[loss=0.1539, simple_loss=0.2245, pruned_loss=0.04168, over 23787.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.03629, over 4716740.99 frames. ], batch size: 195, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:04:33,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:04:34,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:04:35,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:04:38,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:04:38,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:04:38,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:41,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:04:41,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:43,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:04:43,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 12:04:50,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 12:04:50,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:04:52,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 12:04:52,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:56,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 12:04:56,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 12:04:57,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:04:59,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1648120.0, ans=0.0 2023-10-04 12:05:06,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1648186.6666666667, ans=0.0 2023-10-04 12:05:10,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:05:12,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:05:12,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:05:12,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:05:12,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:05:12,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:05:13,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:05:15,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 12:05:16,170 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.10 vs. limit=15.0 2023-10-04 12:05:16,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 12:05:16,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:05:18,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:05:19,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:22,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:05:23,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:05:25,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:05:28,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1648253.3333333333, ans=0.1 2023-10-04 12:05:30,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:05:30,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:31,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:05:33,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:05:37,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:05:39,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 12:05:39,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 12:05:40,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:05:42,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:05:42,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 12:05:43,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:05:43,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-10-04 12:05:44,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:05:44,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:05:44,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 12:05:44,770 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 12:05:44,804 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 12:05:44,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:05:46,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:47,449 INFO [train.py:1046] (1/4) Epoch 47, batch 2900, loss[loss=0.1538, simple_loss=0.2445, pruned_loss=0.03155, over 24523.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03655, over 4699265.34 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:05:50,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:05:50,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:05:50,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:05:51,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. 
Duration: 0.75 2023-10-04 12:05:53,561 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.030e+02 2.253e+02 2.601e+02 4.096e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-04 12:05:53,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1648386.6666666667, ans=0.125 2023-10-04 12:05:56,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:56,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 12:05:58,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 12:05:59,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:05:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:06:02,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:06:04,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:06:06,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:06:07,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:06:10,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 12:06:10,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 12:06:10,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:06:12,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:14,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 12:06:15,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 12:06:18,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:06:18,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 12:06:18,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:06:19,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:06:19,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:06:21,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1648520.0, ans=0.125 2023-10-04 12:06:22,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:06:24,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:28,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:06:31,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:06:32,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 12:06:32,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 12:06:32,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:06:39,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:06:40,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 12:06:40,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:06:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:52,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1648653.3333333333, ans=0.125 2023-10-04 12:06:53,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 12:06:53,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:06:55,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 12:06:58,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:06:58,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 12:06:59,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:07:00,878 INFO [train.py:1046] (1/4) Epoch 47, batch 2950, loss[loss=0.1563, simple_loss=0.247, pruned_loss=0.03284, over 24566.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2341, pruned_loss=0.03649, over 4711929.70 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:07:00,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:07:06,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:07:08,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 12:07:09,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:07:09,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:07:10,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:12,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:07:13,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 12:07:14,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 12:07:16,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:07:16,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:07:19,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.48 vs. limit=12.0 2023-10-04 12:07:20,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:07:21,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:07:24,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:07:24,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:07:27,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:07:27,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:07:29,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:29,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:29,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:07:32,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 12:07:35,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1648853.3333333333, ans=0.0 2023-10-04 12:07:39,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 12:07:39,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 12:07:40,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:07:42,041 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 12:07:42,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 12:07:43,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:07:43,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:07:43,493 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 12:07:43,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:07:46,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 12:07:47,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:07:47,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:07:50,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:51,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:07:51,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:07:51,660 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 12:07:51,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:51,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-10-04 12:07:59,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:08:00,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:08:01,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 12:08:01,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:08:05,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 12:08:06,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:08:08,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:08:08,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:08:10,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1648986.6666666667, ans=0.0 2023-10-04 12:08:11,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:08:11,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:08:13,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 12:08:14,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:14,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:08:14,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:08:14,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:08:15,879 INFO [train.py:1046] (1/4) Epoch 47, batch 3000, loss[loss=0.1504, simple_loss=0.2314, pruned_loss=0.03471, over 24648.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.235, pruned_loss=0.03682, over 4711275.89 frames. ], batch size: 65, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:08:15,879 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 12:08:22,848 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.4242, 3.8878, 3.5523, 3.6533], device='cuda:1') 2023-10-04 12:08:28,124 INFO [train.py:1078] (1/4) Epoch 47, validation: loss=0.3516, simple_loss=0.269, pruned_loss=0.2171, over 1125622.00 frames. 2023-10-04 12:08:28,125 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 12:08:28,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:08:29,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:29,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. 
Duration: 0.9 2023-10-04 12:08:30,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:33,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:08:34,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:08:35,324 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.022e+02 2.270e+02 2.675e+02 4.950e+02, threshold=4.541e+02, percent-clipped=1.0 2023-10-04 12:08:36,880 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 12:08:36,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 12:08:40,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:08:40,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:08:41,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 12:08:41,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:08:49,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:08:56,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 12:09:03,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 12:09:04,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:09:07,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:09:07,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:09:09,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:09:10,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:09:10,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 12:09:11,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1649253.3333333333, ans=0.2 2023-10-04 12:09:14,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 12:09:14,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:09:15,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:09:17,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 12:09:17,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:09:17,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:17,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:09:21,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:09:22,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:09:22,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:09:25,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:09:26,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 12:09:28,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:09:28,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:09:30,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:09:32,364 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.05 vs. 
limit=15.0 2023-10-04 12:09:33,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:33,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:36,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:09:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 12:09:36,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:09:36,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 12:09:37,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:09:39,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 12:09:40,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:09:42,649 INFO [train.py:1046] (1/4) Epoch 47, batch 3050, loss[loss=0.1452, simple_loss=0.2299, pruned_loss=0.03019, over 24322.00 frames. ], tot_loss[loss=0.154, simple_loss=0.235, pruned_loss=0.03651, over 4720856.37 frames. ], batch size: 61, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:09:42,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-04 12:09:42,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 12:09:44,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 12:09:44,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:09:46,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:09:46,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:46,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:09:46,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:09:47,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:09:47,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1649386.6666666667, ans=0.0 2023-10-04 12:09:48,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 12:09:51,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:09:53,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 12:09:54,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:09:57,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:00,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 12:10:06,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 12:10:06,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 12:10:08,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:09,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:10:14,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:14,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:10:16,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:17,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 12:10:17,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1649520.0, ans=0.125 2023-10-04 12:10:18,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:10:18,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:18,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:10:18,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:20,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:20,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:24,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:24,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 12:10:25,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:25,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 12:10:26,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1649586.6666666667, ans=0.1 2023-10-04 12:10:27,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:10:28,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:10:28,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:10:30,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:33,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1649586.6666666667, ans=0.125 2023-10-04 12:10:35,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:36,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:40,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:42,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:10:42,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 12:10:44,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:10:44,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:10:44,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:10:47,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 12:10:48,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:10:48,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:50,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 12:10:51,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:55,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:57,089 INFO [train.py:1046] (1/4) Epoch 47, batch 3100, loss[loss=0.1464, simple_loss=0.2245, pruned_loss=0.03417, over 23296.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2349, pruned_loss=0.03685, over 4710363.57 frames. ], batch size: 119, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:10:58,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 12:10:58,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1649720.0, ans=0.125 2023-10-04 12:10:59,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:11:01,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 12:11:03,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 12:11:04,587 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.016e+02 2.223e+02 2.559e+02 3.925e+02, threshold=4.446e+02, percent-clipped=0.0 2023-10-04 12:11:04,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 12:11:04,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1649720.0, ans=0.125 2023-10-04 12:11:06,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:11:09,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:11:09,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:13,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 12:11:17,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:11:17,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1649786.6666666667, ans=0.1 2023-10-04 12:11:21,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 12:11:25,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:11:25,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:27,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:11:27,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:11:27,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 12:11:28,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:11:28,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 12:11:28,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:11:30,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:31,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-10-04 12:11:33,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:11:38,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:11:38,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 12:11:39,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 12:11:42,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:42,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:44,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:11:44,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:44,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:11:45,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:11:45,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:11:48,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 12:11:48,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1649920.0, ans=0.125 2023-10-04 12:11:50,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:11:50,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:50,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:11:54,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:11:55,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 12:11:58,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:11:58,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 12:11:58,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1649986.6666666667, ans=0.0 2023-10-04 12:11:59,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:11:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-10-04 12:12:10,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1650053.3333333333, ans=0.0 2023-10-04 12:12:11,728 INFO [train.py:1046] (1/4) Epoch 47, batch 3150, loss[loss=0.1658, simple_loss=0.2559, pruned_loss=0.03785, over 24558.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2334, pruned_loss=0.03644, over 4720244.98 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 4.0 2023-10-04 12:12:11,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 12:12:13,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:15,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:12:17,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:12:17,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:12:18,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 12:12:19,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:19,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:12:19,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-10-04 12:12:22,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:24,019 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 12:12:26,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 12:12:26,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:12:28,246 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 12:12:28,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:12:29,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 12:12:30,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 12:12:30,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 12:12:30,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:32,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:12:32,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:12:33,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 12:12:35,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:35,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:37,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:12:38,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 12:12:42,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 12:12:43,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:12:44,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:12:46,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:12:47,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 12:12:50,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 12:12:51,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 12:12:52,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:12:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:12:53,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:12:53,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:12:54,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:12:54,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:12:56,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 12:12:56,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:12:56,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:12:57,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:12:59,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:12:59,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. 
Duration: 0.77 2023-10-04 12:12:59,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:01,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 12:13:01,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:03,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 12:13:03,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 12:13:04,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:13:04,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:06,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 12:13:07,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 12:13:08,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:13:11,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1650320.0, ans=0.0 2023-10-04 12:13:12,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 12:13:13,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:14,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:13:18,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:13:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:20,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1650320.0, ans=0.0 2023-10-04 12:13:20,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1650320.0, ans=0.2 2023-10-04 12:13:21,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 12:13:25,667 INFO [train.py:1046] (1/4) Epoch 47, batch 3200, loss[loss=0.1462, simple_loss=0.2279, pruned_loss=0.03229, over 24284.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2328, pruned_loss=0.03639, over 4711373.99 frames. ], batch size: 61, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:13:25,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:13:25,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 12:13:26,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1650386.6666666667, ans=0.1 2023-10-04 12:13:28,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:31,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:13:31,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 12:13:32,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:34,114 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.237e+02 2.558e+02 4.162e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-04 12:13:38,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:13:42,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:49,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:13:58,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 12:13:58,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:14:01,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. 
Duration: 0.59 2023-10-04 12:14:03,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:14:05,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:14:05,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:14:07,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:14:09,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1650586.6666666667, ans=0.125 2023-10-04 12:14:12,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 12:14:13,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 12:14:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 12:14:17,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 12:14:20,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:14:22,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1650586.6666666667, ans=0.0 2023-10-04 12:14:26,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:14:26,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:14:28,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:28,492 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 12:14:28,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:14:31,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:14:32,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 12:14:34,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 12:14:34,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1650653.3333333333, ans=0.2 2023-10-04 12:14:35,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 12:14:37,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 12:14:39,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:14:40,330 INFO [train.py:1046] (1/4) Epoch 47, batch 3250, loss[loss=0.163, simple_loss=0.2423, pruned_loss=0.04185, over 23990.00 frames. 
], tot_loss[loss=0.1525, simple_loss=0.2326, pruned_loss=0.03619, over 4719995.77 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:14:40,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:14:40,483 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 12:14:41,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:14:41,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:14:43,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 12:14:46,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:14:47,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:14:52,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1650720.0, ans=0.125 2023-10-04 12:14:55,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:14:55,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 12:14:56,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:14:56,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:56,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:14:58,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:14:58,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:15:01,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:02,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:15:02,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:02,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:02,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:04,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:15:05,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:06,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 12:15:08,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1650853.3333333333, ans=0.1 2023-10-04 12:15:10,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:10,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:12,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:12,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:15:12,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:15:17,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 12:15:17,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:15:17,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:15:19,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:15:20,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 12:15:28,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:15:34,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:15:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 12:15:34,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:15:34,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:15:36,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:38,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 12:15:38,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 12:15:39,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:15:39,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1650986.6666666667, ans=0.0 2023-10-04 12:15:40,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:15:41,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:15:43,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 12:15:43,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:15:44,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:15:46,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:15:47,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 12:15:47,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:15:49,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:15:50,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 12:15:52,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1650986.6666666667, ans=0.125 2023-10-04 12:15:54,448 INFO [train.py:1046] (1/4) Epoch 47, batch 3300, loss[loss=0.1518, simple_loss=0.228, pruned_loss=0.0378, over 23330.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2331, pruned_loss=0.03669, over 4709633.92 frames. 
], batch size: 285, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:15:54,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:15:54,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 12:15:55,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 12:15:57,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 12:15:57,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:01,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:16:02,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:16:02,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:03,620 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.067e+02 2.305e+02 2.787e+02 4.644e+02, threshold=4.609e+02, percent-clipped=2.0 2023-10-04 12:16:05,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:16:05,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 12:16:07,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:08,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:16:08,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1651120.0, ans=0.2 2023-10-04 12:16:12,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 12:16:15,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:16:15,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:16,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:17,545 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 12:16:17,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:16:19,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:16:19,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:16:19,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:16:20,477 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 12:16:24,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:24,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:16:27,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:27,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 12:16:27,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1651186.6666666667, ans=0.125 2023-10-04 12:16:28,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 12:16:28,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:29,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:16:31,417 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 12:16:33,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 12:16:33,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 12:16:36,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 12:16:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:16:40,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:16:42,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:16:42,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1651253.3333333333, ans=0.04949747468305833 2023-10-04 12:16:44,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:16:44,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:44,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:44,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:16:46,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:16:46,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:48,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 12:16:49,585 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 12:16:49,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 12:16:53,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:16:55,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:16:55,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:16:56,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:56,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:16:57,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:16:57,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:16:58,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:16:58,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:59,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 12:17:00,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 12:17:01,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:03,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:06,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:17:06,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:17:07,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:08,861 INFO [train.py:1046] (1/4) Epoch 47, batch 3350, loss[loss=0.149, simple_loss=0.2339, pruned_loss=0.03202, over 24543.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2337, pruned_loss=0.03656, over 4720730.32 frames. ], batch size: 66, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:17:10,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:17:10,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:13,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:17:15,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:17:15,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:17:19,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:20,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:17:23,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:23,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:17:24,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 12:17:26,031 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 12:17:26,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 12:17:28,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 12:17:30,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:17:30,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 12:17:31,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:31,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 12:17:31,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:33,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:17:34,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:34,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:34,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:36,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:17:36,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1651520.0, ans=0.035 2023-10-04 12:17:37,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.68 vs. 
limit=15.0 2023-10-04 12:17:39,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1651520.0, ans=0.0 2023-10-04 12:17:40,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:41,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1651520.0, ans=0.04949747468305833 2023-10-04 12:17:42,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:42,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:46,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:17:48,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:50,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:50,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:53,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:54,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-10-04 12:17:54,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:17:54,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 12:17:54,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:17:57,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 12:17:57,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:58,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:18:02,250 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-10-04 12:18:07,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:18:07,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 12:18:08,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:18:08,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1651653.3333333333, ans=0.2 2023-10-04 12:18:10,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 12:18:12,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:18:16,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:18:18,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 12:18:18,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:18:18,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:18:18,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1651653.3333333333, ans=0.125 2023-10-04 12:18:20,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:18:20,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 12:18:21,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:18:22,925 INFO [train.py:1046] (1/4) Epoch 47, batch 3400, loss[loss=0.1633, simple_loss=0.2399, pruned_loss=0.04333, over 22698.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2353, pruned_loss=0.03719, over 4706966.95 frames. ], batch size: 322, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:18:22,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-10-04 12:18:24,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:18:24,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:18:25,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:18:25,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1651720.0, ans=0.125 2023-10-04 12:18:26,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:18:26,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 12:18:31,050 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.016e+02 2.273e+02 2.707e+02 4.180e+02, threshold=4.545e+02, percent-clipped=0.0 2023-10-04 12:18:31,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 12:18:31,173 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 12:18:31,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:18:31,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1651720.0, ans=0.0 2023-10-04 12:18:35,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1651786.6666666667, ans=0.125 2023-10-04 12:18:37,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:18:37,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:18:37,134 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:18:38,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:18:39,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1651786.6666666667, ans=0.0 2023-10-04 12:18:44,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:18:44,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 12:18:50,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:18:51,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:18:51,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 12:18:53,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 12:18:56,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:18:59,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1651853.3333333333, ans=0.0 2023-10-04 12:19:02,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 12:19:08,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:19:08,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:19:08,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1651920.0, ans=0.0 2023-10-04 12:19:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 12:19:09,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:19:09,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1651920.0, ans=0.07 2023-10-04 12:19:10,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:10,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:19:10,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:19:13,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1651920.0, ans=0.125 2023-10-04 12:19:15,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:19:18,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:19:18,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:19:23,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:19:25,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 12:19:29,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:19:33,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 12:19:36,419 INFO [train.py:1046] (1/4) Epoch 47, batch 3450, loss[loss=0.1426, simple_loss=0.2223, pruned_loss=0.03148, over 23322.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2345, pruned_loss=0.03688, over 4715580.90 frames. ], batch size: 119, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:19:36,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. 
Duration: 0.73 2023-10-04 12:19:36,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1652053.3333333333, ans=0.2 2023-10-04 12:19:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:19:40,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:19:40,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 12:19:41,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:19:44,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:19:48,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1652053.3333333333, ans=0.0 2023-10-04 12:19:48,513 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=15.0 2023-10-04 12:19:50,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:19:50,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:19:52,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 12:19:52,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:55,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:58,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 12:20:03,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 12:20:03,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:20:05,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:20:07,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:13,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 12:20:13,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:20:16,201 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=10.34 vs. limit=12.0 2023-10-04 12:20:20,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:20:20,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:20:21,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:20:21,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:20:23,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 12:20:23,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:20:25,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:20:29,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:20:31,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 12:20:34,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:20:40,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:20:41,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:43,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:20:46,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:20:46,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:20:48,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:20:49,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:20:51,001 INFO [train.py:1046] (1/4) Epoch 47, batch 3500, loss[loss=0.1368, simple_loss=0.1936, pruned_loss=0.04, over 19401.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2333, pruned_loss=0.03652, over 4708534.73 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:20:52,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:20:52,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1652386.6666666667, ans=0.2 2023-10-04 12:20:52,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1652386.6666666667, ans=0.04949747468305833 2023-10-04 12:20:57,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:20:57,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 12:20:59,815 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.045e+02 2.266e+02 2.721e+02 4.311e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 12:20:59,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 12:21:02,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:21:04,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:21:04,318 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:21:05,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 12:21:09,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:21:09,862 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:21:11,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:21:11,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:21:11,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:21:13,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:13,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:21:13,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. 
Duration: 0.86 2023-10-04 12:21:16,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:16,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:21:19,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:21:22,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:23,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 12:21:23,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:21:26,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:21:26,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:21:28,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:30,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:21:30,136 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:21:31,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. 
Duration: 0.75 2023-10-04 12:21:32,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 12:21:34,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 12:21:34,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:21:35,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:37,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:21:37,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:21:41,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:21:42,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:21:50,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:21:50,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 12:21:50,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 12:21:50,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 12:21:50,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1652653.3333333333, ans=0.07 2023-10-04 12:21:51,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:21:53,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:21:54,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:57,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 12:21:57,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:21:57,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1652653.3333333333, ans=0.125 2023-10-04 12:22:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:22:00,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 12:22:01,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 12:22:03,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:03,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:22:04,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.50 vs. limit=15.0 2023-10-04 12:22:04,906 INFO [train.py:1046] (1/4) Epoch 47, batch 3550, loss[loss=0.1568, simple_loss=0.2291, pruned_loss=0.04224, over 23437.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2317, pruned_loss=0.03634, over 4696471.09 frames. ], batch size: 285, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:22:04,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:04,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:07,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:22:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:17,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 12:22:17,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1652720.0, ans=0.0 2023-10-04 12:22:19,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:22:20,507 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.29 vs. 
limit=15.0 2023-10-04 12:22:21,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:22:21,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:22,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:22:22,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:22:26,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:22:26,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:22:28,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:28,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:22:28,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:22:33,688 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.25 vs. limit=15.0 2023-10-04 12:22:34,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:22:35,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 12:22:38,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:22:38,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:38,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:22:39,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 12:22:39,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:40,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:42,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:22:47,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:48,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:22:48,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:50,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 12:22:51,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 12:22:53,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 12:22:53,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:22:57,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:22:57,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:22:57,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1652920.0, ans=0.5 2023-10-04 12:23:00,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 12:23:01,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:03,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1652986.6666666667, ans=0.125 2023-10-04 12:23:06,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:07,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 12:23:09,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:11,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:23:13,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 12:23:18,586 INFO [train.py:1046] (1/4) Epoch 47, batch 3600, loss[loss=0.1585, simple_loss=0.2424, pruned_loss=0.03731, over 23969.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2318, pruned_loss=0.03645, over 4700450.77 frames. ], batch size: 86, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:23:20,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 12:23:20,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:23:20,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:23:23,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:23,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:24,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:23:27,221 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.182e+02 2.453e+02 2.820e+02 4.666e+02, threshold=4.905e+02, percent-clipped=3.0 2023-10-04 12:23:27,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:23:27,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:23:30,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:23:31,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:23:31,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:31,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 12:23:36,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:23:37,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:40,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:23:41,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:23:43,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:23:43,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:23:43,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 12:23:45,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 12:23:46,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:46,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:23:50,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:53,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:23:54,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:23:55,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 12:24:00,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:00,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1653186.6666666667, ans=0.0 2023-10-04 12:24:01,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:24:01,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 12:24:06,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 12:24:06,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1653253.3333333333, ans=0.0 2023-10-04 12:24:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:14,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:23,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:24:23,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:24:23,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 12:24:24,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 12:24:26,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 12:24:29,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:24:29,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:24:31,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 12:24:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:24:31,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:24:31,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:34,583 INFO [train.py:1046] (1/4) Epoch 47, batch 3650, loss[loss=0.1537, simple_loss=0.23, pruned_loss=0.03868, over 23345.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.233, pruned_loss=0.03683, over 4696270.75 frames. ], batch size: 119, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:24:34,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 12:24:35,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 12:24:38,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:38,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 12:24:39,090 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:24:42,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 12:24:43,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:24:45,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. 
Duration: 0.61 2023-10-04 12:24:46,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 12:24:51,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:24:51,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:24:51,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:24:53,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1653453.3333333333, ans=0.125 2023-10-04 12:24:55,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:24:55,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:57,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 12:24:58,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:24:58,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:24:58,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 12:25:00,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 12:25:00,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:25:00,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:04,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:25:04,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1653520.0, ans=0.0 2023-10-04 12:25:05,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 12:25:07,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 12:25:07,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:25:09,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 12:25:10,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1653520.0, ans=10.0 2023-10-04 12:25:11,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:25:11,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:25:17,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 12:25:18,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:18,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:25:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:25:20,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:25:25,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:25:28,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:25:28,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:25:31,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:25:32,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:32,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:25:38,417 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. 
Duration: 0.896 2023-10-04 12:25:39,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1653653.3333333333, ans=0.1 2023-10-04 12:25:41,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:25:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:25:43,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:25:43,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:44,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:25:44,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:47,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 12:25:47,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:48,577 INFO [train.py:1046] (1/4) Epoch 47, batch 3700, loss[loss=0.1493, simple_loss=0.2306, pruned_loss=0.034, over 23467.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2338, pruned_loss=0.03669, over 4699200.08 frames. ], batch size: 106, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:25:48,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-04 12:25:50,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:25:52,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:25:56,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:56,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 12:25:56,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:56,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:25:58,223 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 1.988e+02 2.215e+02 2.636e+02 3.694e+02, threshold=4.430e+02, percent-clipped=0.0 2023-10-04 12:25:58,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:26:01,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:26:01,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1653720.0, ans=0.125 2023-10-04 12:26:02,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:26:02,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:03,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:26:05,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:26:05,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:26:08,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:09,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 12:26:16,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:26:16,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:26:19,145 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.06 vs. limit=15.0 2023-10-04 12:26:19,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:26:19,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 12:26:19,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 12:26:24,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:24,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 12:26:25,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:26,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1653853.3333333333, ans=0.125 2023-10-04 12:26:27,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:26:27,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:29,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:26:31,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:26:36,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:26:37,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 12:26:37,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:37,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. 
Duration: 0.98 2023-10-04 12:26:42,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:26:42,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:26:45,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:45,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 12:26:48,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:26:48,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:26:48,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:26:48,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:48,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1653986.6666666667, ans=0.125 2023-10-04 12:26:52,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:26:52,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 12:26:54,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. 
Duration: 0.83 2023-10-04 12:26:55,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:26:55,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:26:55,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:26:57,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:26:59,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:27:01,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:27:01,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1653986.6666666667, ans=0.0 2023-10-04 12:27:02,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:03,834 INFO [train.py:1046] (1/4) Epoch 47, batch 3750, loss[loss=0.1609, simple_loss=0.247, pruned_loss=0.03737, over 23443.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2349, pruned_loss=0.03698, over 4716505.30 frames. ], batch size: 105, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:27:05,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 12:27:05,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. 
Duration: 0.4181875 2023-10-04 12:27:08,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:27:08,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 12:27:09,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:27:10,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:27:12,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:27:14,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:27:14,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1654053.3333333333, ans=0.125 2023-10-04 12:27:18,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:27:21,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:27:22,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:27:24,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:27:27,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:27:30,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 12:27:30,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:27:34,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:27:34,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:27:35,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 12:27:39,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 12:27:39,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:27:41,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:27:42,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:27:47,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:48,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 12:27:51,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. 
Duration: 0.98 2023-10-04 12:27:54,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:57,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:27:58,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:27:58,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1654253.3333333333, ans=0.0 2023-10-04 12:28:00,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:28:05,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:28:06,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:28:08,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:28:09,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:28:11,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:28:16,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1654320.0, ans=0.125 2023-10-04 12:28:18,583 INFO [train.py:1046] (1/4) Epoch 47, batch 3800, loss[loss=0.1437, simple_loss=0.236, pruned_loss=0.02569, over 24647.00 frames. 
], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.03704, over 4730692.85 frames. ], batch size: 68, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:28:18,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:28:22,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:24,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:28:24,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 12:28:27,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.785e+02 2.014e+02 2.169e+02 2.643e+02 4.021e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-04 12:28:27,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:28:28,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:28:30,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:28:32,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 12:28:32,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:32,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 12:28:32,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1654453.3333333333, ans=0.0 2023-10-04 12:28:33,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:28:33,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:28:34,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1654453.3333333333, ans=0.1 2023-10-04 12:28:35,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:35,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 12:28:38,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 12:28:38,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:28:41,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:28:44,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:28:44,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:28:45,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-10-04 12:28:45,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:47,267 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:28:48,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1654520.0, ans=0.07 2023-10-04 12:28:49,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:50,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:52,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1654520.0, ans=0.125 2023-10-04 12:28:55,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:28:55,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 12:28:57,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:29:03,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:29:07,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 12:29:07,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1654586.6666666667, ans=0.0 2023-10-04 12:29:10,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 12:29:10,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 12:29:12,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:29:14,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:29:14,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:15,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 12:29:21,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 12:29:21,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 12:29:21,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:22,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:29:26,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 12:29:26,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:29:27,788 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:29:33,661 INFO [train.py:1046] (1/4) Epoch 47, batch 3850, loss[loss=0.1172, simple_loss=0.1731, pruned_loss=0.03069, over 19292.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2332, pruned_loss=0.03658, over 4715416.25 frames. ], batch size: 388, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:29:33,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:29:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 12:29:33,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1654720.0, ans=0.125 2023-10-04 12:29:36,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:29:36,527 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:40,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:29:43,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:29:46,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 12:29:46,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 12:29:52,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:29:54,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:55,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:29:56,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:29:58,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1654786.6666666667, ans=0.0 2023-10-04 12:29:58,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1654786.6666666667, ans=0.125 2023-10-04 12:30:00,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:01,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:30:01,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:01,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:30:03,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 12:30:04,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:04,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:04,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:30:04,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 12:30:06,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 12:30:08,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:30:08,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:09,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1654853.3333333333, ans=0.0 2023-10-04 12:30:11,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:11,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:11,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1654853.3333333333, ans=0.2 2023-10-04 12:30:12,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-10-04 12:30:14,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 12:30:14,335 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:30:16,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:18,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 12:30:20,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:30:25,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1654920.0, ans=0.1 2023-10-04 12:30:26,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:27,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=1654920.0, ans=0.1 2023-10-04 12:30:28,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:30,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:31,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 12:30:32,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-10-04 12:30:33,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1654986.6666666667, ans=0.125 2023-10-04 12:30:36,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:37,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:40,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:30:40,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:30:40,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:41,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:41,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:30:41,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 12:30:41,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1654986.6666666667, ans=0.125 2023-10-04 12:30:43,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:30:44,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. 
Duration: 0.89 2023-10-04 12:30:44,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:44,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:47,537 INFO [train.py:1046] (1/4) Epoch 47, batch 3900, loss[loss=0.1558, simple_loss=0.2313, pruned_loss=0.04021, over 23889.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2318, pruned_loss=0.03615, over 4712798.18 frames. ], batch size: 212, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:30:47,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:30:47,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:50,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:30:50,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:50,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:50,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:30:50,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 12:30:50,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 12:30:51,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-10-04 12:30:54,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:30:56,161 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.990e+02 2.251e+02 2.622e+02 3.660e+02, threshold=4.502e+02, percent-clipped=0.0 2023-10-04 12:30:56,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:30:56,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:30:58,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:30:59,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1655053.3333333333, ans=0.1 2023-10-04 12:31:00,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:31:00,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:31:02,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:31:02,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. 
Duration: 0.81 2023-10-04 12:31:02,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:31:03,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 12:31:05,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:31:05,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 12:31:08,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 12:31:11,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:31:13,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:31:13,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:31:14,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:19,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:31:21,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:31:23,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 12:31:23,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:31:24,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:31:30,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:31:30,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:31:37,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:31:39,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:31:44,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1655253.3333333333, ans=0.015 2023-10-04 12:31:47,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:31:51,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:53,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 12:31:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 12:31:53,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 12:31:54,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 12:31:55,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:31:55,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 12:32:01,861 INFO [train.py:1046] (1/4) Epoch 47, batch 3950, loss[loss=0.1524, simple_loss=0.2276, pruned_loss=0.03865, over 23733.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2316, pruned_loss=0.03589, over 4706565.08 frames. ], batch size: 232, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:32:03,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:32:04,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 12:32:04,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:32:07,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=12.0 2023-10-04 12:32:08,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:32:10,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:32:15,627 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. 
Duration: 0.896 2023-10-04 12:32:15,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:32:17,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 12:32:17,069 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 12:32:17,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:32:20,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:32:20,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:32:20,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:32:24,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 12:32:25,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:32:25,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:32:26,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:32:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 12:32:27,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:32:35,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.00 vs. limit=12.0 2023-10-04 12:32:39,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:32:39,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:32:44,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 12:32:45,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1655586.6666666667, ans=0.0 2023-10-04 12:32:49,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 12:32:49,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 12:32:50,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:32:51,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:32:56,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:32:58,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 12:32:58,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:32:58,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:32:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 12:33:01,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1655653.3333333333, ans=0.0 2023-10-04 12:33:02,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:33:02,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1655653.3333333333, ans=0.0 2023-10-04 12:33:04,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:33:07,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 12:33:10,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1655653.3333333333, ans=0.07 2023-10-04 12:33:16,701 INFO [train.py:1046] (1/4) Epoch 47, batch 4000, loss[loss=0.161, simple_loss=0.2485, pruned_loss=0.0368, over 24438.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2324, pruned_loss=0.03603, over 4703687.36 frames. ], batch size: 77, lr: 2.15e-03, grad_scale: 32.0 2023-10-04 12:33:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:33:24,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:25,573 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.039e+02 2.177e+02 2.497e+02 4.705e+02, threshold=4.355e+02, percent-clipped=2.0 2023-10-04 12:33:28,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:33:28,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1655720.0, ans=0.125 2023-10-04 12:33:28,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1655720.0, ans=0.125 2023-10-04 12:33:29,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:33:29,967 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:31,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 12:33:31,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:33:32,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 12:33:32,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 12:33:32,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 12:33:34,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:33:35,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1655786.6666666667, ans=0.1 2023-10-04 12:33:37,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:33:37,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:33:37,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:33:37,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:33:37,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:33:39,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.28 vs. limit=12.0 2023-10-04 12:33:40,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:33:42,212 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 12:33:42,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 12:33:43,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:33:46,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 12:33:47,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:33:48,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:33:51,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1655853.3333333333, ans=0.125 2023-10-04 12:33:54,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 12:33:55,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:33:56,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:33:58,318 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 12:33:59,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:34:01,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 12:34:01,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:34:02,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:34:03,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:34:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:34:06,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:34:07,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:34:09,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1655920.0, ans=0.1 2023-10-04 12:34:10,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 12:34:10,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:34:10,868 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 12:34:11,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.88 vs. limit=15.0 2023-10-04 12:34:15,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:34:18,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-10-04 12:34:19,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:34:20,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1655986.6666666667, ans=0.0 2023-10-04 12:34:21,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:34:21,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:34:23,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:34:27,987 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:34:30,572 INFO [train.py:1046] (1/4) Epoch 47, batch 4050, loss[loss=0.1593, simple_loss=0.2331, pruned_loss=0.04271, over 23733.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.0362, over 4699116.00 frames. ], batch size: 212, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:34:30,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:34:31,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 12:34:33,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:34:34,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 12:34:36,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:34:36,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:34:37,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:34:40,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:34:43,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:34:44,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:34:46,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:34:46,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:34:50,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:34:51,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:34:54,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 12:34:56,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. 
Duration: 0.73 2023-10-04 12:34:56,150 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 12:34:58,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:35:05,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 12:35:06,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:35:10,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:35:13,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1656253.3333333333, ans=0.1 2023-10-04 12:35:14,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:35:15,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:35:15,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:35:19,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:35:21,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 12:35:21,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 12:35:24,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:35:26,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 12:35:28,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1656320.0, ans=0.125 2023-10-04 12:35:30,183 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=22.5 2023-10-04 12:35:30,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:35:31,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-10-04 12:35:36,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 12:35:38,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:35:38,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:35:40,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 12:35:40,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. 
Duration: 0.97 2023-10-04 12:35:40,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:41,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:35:43,411 INFO [train.py:1046] (1/4) Epoch 47, batch 4100, loss[loss=0.1732, simple_loss=0.2423, pruned_loss=0.052, over 23547.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03623, over 4718773.11 frames. ], batch size: 256, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:35:43,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:43,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:35:47,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 12:35:49,643 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 12:35:50,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 12:35:52,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 12:35:52,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:52,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:35:54,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:54,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:35:55,641 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 12:35:56,923 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.115e+02 2.367e+02 2.901e+02 4.348e+02, threshold=4.733e+02, percent-clipped=0.0 2023-10-04 12:35:57,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:35:58,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:35:58,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:58,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:35:59,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1656453.3333333333, ans=6.0 2023-10-04 12:36:03,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 12:36:04,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1656453.3333333333, ans=0.125 2023-10-04 12:36:05,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:36:06,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:36:06,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 12:36:07,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:36:07,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:36:07,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:36:07,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:36:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 12:36:10,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:14,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 12:36:14,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:36:16,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:36:16,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 12:36:18,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:36:19,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:36:19,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:36:21,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 12:36:22,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:36:24,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:36:26,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 12:36:26,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:36:27,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:36:30,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 12:36:35,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:36:39,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:36:39,912 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:36:47,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:36:47,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:50,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=1656653.3333333333, ans=0.95 2023-10-04 12:36:51,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:36:54,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:36:56,206 INFO [train.py:1046] (1/4) Epoch 47, batch 4150, loss[loss=0.1636, simple_loss=0.2398, pruned_loss=0.04368, over 23500.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2339, pruned_loss=0.03647, over 4721459.35 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 4.0 2023-10-04 12:36:58,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 12:36:58,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1656720.0, ans=0.2 2023-10-04 12:36:59,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:37:01,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:37:01,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:37:05,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 12:37:05,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:37:05,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 12:37:06,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 12:37:06,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 12:37:06,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:37:09,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1656786.6666666667, ans=0.0 2023-10-04 12:37:11,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 12:37:11,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:37:11,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.63 vs. limit=10.0 2023-10-04 12:37:15,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:37:15,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:37:16,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:37:18,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:37:18,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:37:20,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 12:37:24,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:37:29,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:37:29,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. 
Duration: 0.99 2023-10-04 12:37:30,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 12:37:31,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:37:31,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 12:37:31,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:37:32,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:37:36,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:36,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.63 vs. limit=15.0 2023-10-04 12:37:37,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:37:40,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 12:37:42,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:37:44,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:37:46,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. 
Duration: 0.76 2023-10-04 12:37:46,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:37:47,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 12:37:48,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.23 vs. limit=12.0 2023-10-04 12:37:49,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:37:51,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1656920.0, ans=0.2 2023-10-04 12:37:52,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:37:53,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:54,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 12:37:54,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:37:54,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:37:55,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.02 vs. limit=15.0 2023-10-04 12:37:56,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 12:37:59,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 12:37:59,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:59,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:37:59,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:38:01,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 12:38:01,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:38:02,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:38:02,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:38:04,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:38:04,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 12:38:04,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:38:08,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 12:38:09,638 INFO [train.py:1046] (1/4) Epoch 47, batch 4200, loss[loss=0.1485, simple_loss=0.2216, pruned_loss=0.03766, over 23843.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2333, pruned_loss=0.03632, over 4722898.72 frames. ], batch size: 195, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:38:09,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 12:38:11,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:38:12,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:38:13,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:38:14,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:38:14,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:38:18,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 12:38:18,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1657053.3333333333, ans=0.0 2023-10-04 12:38:20,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 12:38:20,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:38:23,200 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.120e+02 2.356e+02 2.706e+02 4.391e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 12:38:23,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:38:27,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:38:27,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1657120.0, ans=0.1 2023-10-04 12:38:28,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:38:29,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:38:31,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:31,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 12:38:31,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:38:34,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:34,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:38:34,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:38:36,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:38:38,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 12:38:38,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:38,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1657186.6666666667, ans=0.0 2023-10-04 12:38:38,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.89 vs. limit=10.0 2023-10-04 12:38:42,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:38:43,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:38:45,764 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.80 vs. limit=15.0 2023-10-04 12:38:46,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:38:47,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:38:47,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1657186.6666666667, ans=0.0 2023-10-04 12:38:50,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:38:50,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 12:38:50,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:38:50,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1657186.6666666667, ans=0.0 2023-10-04 12:38:51,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:38:55,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:38:55,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.04 vs. limit=6.0 2023-10-04 12:38:56,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:39:02,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:39:05,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-04 12:39:06,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:39:10,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:39:10,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:12,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 12:39:18,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:39:23,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:39:24,715 INFO [train.py:1046] (1/4) Epoch 47, batch 4250, loss[loss=0.1498, simple_loss=0.234, pruned_loss=0.03281, over 24651.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2323, pruned_loss=0.0362, over 4709096.75 frames. ], batch size: 65, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:39:24,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:39:26,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:32,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:39:32,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. 
Duration: 0.96 2023-10-04 12:39:32,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:39:35,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:35,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.05 vs. limit=15.0 2023-10-04 12:39:38,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:39:42,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:42,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:45,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:39:45,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:39:47,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:48,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:48,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 12:39:50,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:39:52,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:39:53,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 12:39:56,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 12:39:56,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:57,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:39:57,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:59,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:39:59,778 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:59,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:40:03,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:40:03,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 12:40:08,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:40:09,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:09,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 12:40:09,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:40:10,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 12:40:11,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:40:14,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:40:14,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:40:14,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:40:17,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 12:40:18,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:40:20,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 12:40:24,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:40:27,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:29,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:40:31,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:40:31,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:40:31,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1657653.3333333333, ans=0.125 2023-10-04 12:40:32,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1657653.3333333333, ans=0.1 2023-10-04 12:40:33,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:40:34,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:40:34,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 12:40:35,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:40:39,652 INFO [train.py:1046] (1/4) Epoch 47, batch 4300, loss[loss=0.1562, simple_loss=0.2296, pruned_loss=0.04139, over 23755.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.232, pruned_loss=0.03624, over 4686752.98 frames. ], batch size: 179, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:40:39,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:40:39,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:40:45,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:40:45,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1657720.0, ans=0.125 2023-10-04 12:40:52,816 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.021e+02 2.286e+02 2.632e+02 3.512e+02, threshold=4.572e+02, percent-clipped=0.0 2023-10-04 12:40:54,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:54,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 12:40:54,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:40:57,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:40:57,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 12:40:57,698 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 12:41:02,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:41:03,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:41:06,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 12:41:06,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:41:07,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 12:41:10,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:41:11,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:41:14,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:41:14,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:41:14,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:41:16,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 12:41:16,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:41:16,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 12:41:17,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 12:41:19,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1657853.3333333333, ans=0.125 2023-10-04 12:41:20,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:41:23,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:23,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:41:25,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:25,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:41:25,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 12:41:25,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 12:41:25,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-10-04 12:41:25,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:41:27,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 12:41:27,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 12:41:30,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:41:31,566 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 12:41:33,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:41:35,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:41:35,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:41:37,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 12:41:37,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:41:37,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:38,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:41:38,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:41:39,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1657986.6666666667, ans=0.125 2023-10-04 12:41:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:41:42,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:41:44,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:41:45,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:45,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:41:50,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1657986.6666666667, ans=0.0 2023-10-04 12:41:50,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1657986.6666666667, ans=0.1 2023-10-04 12:41:51,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 12:41:53,008 INFO [train.py:1046] (1/4) Epoch 47, batch 4350, loss[loss=0.1396, simple_loss=0.2381, pruned_loss=0.02058, over 24690.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2331, pruned_loss=0.03608, over 4701060.07 frames. 
], batch size: 73, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:41:53,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:41:57,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:41:58,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:42:01,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:42:01,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:42:01,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1658053.3333333333, ans=0.0 2023-10-04 12:42:01,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1658053.3333333333, ans=0.2 2023-10-04 12:42:07,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:42:10,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:42:11,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:42:11,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:42:14,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1658120.0, ans=0.125 2023-10-04 12:42:15,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:42:15,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1658120.0, ans=0.0 2023-10-04 12:42:18,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:42:20,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:42:20,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1658120.0, ans=0.0 2023-10-04 12:42:26,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 12:42:26,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:42:28,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:31,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:34,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 12:42:38,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 12:42:40,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:42:40,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1658253.3333333333, ans=0.125 2023-10-04 12:42:40,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1658253.3333333333, ans=0.125 2023-10-04 12:42:44,383 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 12:42:45,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:42:46,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1658253.3333333333, ans=0.0 2023-10-04 12:42:47,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:42:47,217 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 12:42:48,615 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 12:42:48,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:42:48,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:42:49,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 12:42:50,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:42:51,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:42:51,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:42:54,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 12:42:54,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:54,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:42:54,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:54,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 12:42:56,111 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 12:42:56,115 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 12:42:56,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. 
Duration: 0.42 2023-10-04 12:42:58,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1658320.0, ans=0.2 2023-10-04 12:42:59,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1658320.0, ans=0.125 2023-10-04 12:43:00,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:43:01,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:43:01,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:43:04,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1658320.0, ans=0.125 2023-10-04 12:43:05,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 12:43:06,647 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 12:43:06,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:07,949 INFO [train.py:1046] (1/4) Epoch 47, batch 4400, loss[loss=0.1676, simple_loss=0.2489, pruned_loss=0.04315, over 24684.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.03617, over 4700781.39 frames. 
], batch size: 65, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:43:11,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:43:11,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:12,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:43:15,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 12:43:15,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 12:43:16,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 12:43:16,899 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 12:43:18,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:43:18,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:43:20,886 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.809e+02 2.049e+02 2.248e+02 2.577e+02 4.164e+02, threshold=4.496e+02, percent-clipped=0.0 2023-10-04 12:43:20,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 12:43:22,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:43:22,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:22,455 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 12:43:25,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:25,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 12:43:25,796 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 12:43:29,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 12:43:30,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 12:43:30,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 12:43:31,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:33,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:43:33,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 12:43:33,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1658453.3333333333, ans=0.125 2023-10-04 12:43:34,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:43:36,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 12:43:37,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 12:43:38,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:41,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:43:41,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:43,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:43,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:43,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 12:43:45,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 12:43:48,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 12:43:48,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1658520.0, ans=0.0 2023-10-04 12:43:53,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:43:55,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 12:43:58,253 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:44:01,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:44:04,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:44:04,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1658586.6666666667, ans=0.125 2023-10-04 12:44:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 12:44:06,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:44:06,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:44:06,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:44:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 12:44:10,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 12:44:12,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1658653.3333333333, ans=0.0 2023-10-04 12:44:13,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 12:44:15,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 12:44:15,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:15,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 12:44:16,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:44:19,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:44:21,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1658720.0, ans=0.2 2023-10-04 12:44:22,203 INFO [train.py:1046] (1/4) Epoch 47, batch 4450, loss[loss=0.1627, simple_loss=0.236, pruned_loss=0.04468, over 23534.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2339, pruned_loss=0.03622, over 4701286.28 frames. ], batch size: 256, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:44:23,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. 
Duration: 0.9 2023-10-04 12:44:23,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1658720.0, ans=0.1 2023-10-04 12:44:26,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:44:29,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:29,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:44:34,758 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:44:35,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:44:35,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:44:39,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:40,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:44:43,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:44:43,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:43,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-10-04 12:44:43,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:44:44,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:44,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:44:44,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:44:48,044 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:44:53,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:44:53,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:44:55,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:44:55,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:56,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:45:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:45:01,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. 
Duration: 0.85 2023-10-04 12:45:01,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 12:45:01,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:45:04,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:45:04,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1658853.3333333333, ans=0.2 2023-10-04 12:45:05,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 12:45:07,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:45:11,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:45:13,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 12:45:13,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:13,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:45:13,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:45:14,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:45:16,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:45:19,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:45:19,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 12:45:20,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:45:23,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:45:24,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:45:26,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:26,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:45:29,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:45:30,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 12:45:32,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:45:36,908 INFO [train.py:1046] (1/4) Epoch 47, batch 4500, loss[loss=0.1425, simple_loss=0.2236, pruned_loss=0.03069, over 24285.00 frames. 
], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.03611, over 4704955.84 frames. ], batch size: 61, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:45:37,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:45:39,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 12:45:39,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 12:45:41,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:45:45,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:45,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:45:47,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:45:47,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:45:48,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:45:48,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:45:49,652 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.076e+02 2.315e+02 2.981e+02 4.706e+02, threshold=4.629e+02, percent-clipped=1.0 2023-10-04 12:45:51,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1659120.0, ans=0.125 2023-10-04 12:45:59,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:45:59,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:46:01,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:46:02,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:46:05,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:46:11,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:46:12,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-10-04 12:46:13,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1659186.6666666667, ans=0.125 2023-10-04 12:46:14,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 12:46:15,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1659186.6666666667, ans=0.07 2023-10-04 12:46:18,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:46:21,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:46:21,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 12:46:22,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:24,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:46:27,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:46:27,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:46:29,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:46:29,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 12:46:29,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:46:29,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:46:34,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:46:34,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:46:36,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1659320.0, ans=0.125 2023-10-04 12:46:37,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:39,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:46:39,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:46:41,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 12:46:43,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 12:46:43,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 12:46:44,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1659320.0, ans=0.0 2023-10-04 12:46:48,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. 
Duration: 0.95 2023-10-04 12:46:48,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1659386.6666666667, ans=0.125 2023-10-04 12:46:49,742 INFO [train.py:1046] (1/4) Epoch 47, batch 4550, loss[loss=0.1477, simple_loss=0.2256, pruned_loss=0.03487, over 23453.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.03599, over 4715662.44 frames. ], batch size: 119, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:46:49,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 12:46:51,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:46:54,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:55,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:57,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:01,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1659386.6666666667, ans=0.0 2023-10-04 12:47:03,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:47:05,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:47:05,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 12:47:05,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:47:05,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:08,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:08,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:47:12,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:47:15,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 12:47:16,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 12:47:17,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:47:18,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 12:47:21,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 12:47:21,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:47:25,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. 
Duration: 0.74 2023-10-04 12:47:25,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:47:30,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:30,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:30,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:47:31,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 12:47:33,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:47:36,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:36,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:47:36,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:37,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 12:47:38,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1659586.6666666667, ans=0.2 2023-10-04 12:47:39,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-10-04 12:47:39,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:47:40,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 12:47:42,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 12:47:42,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:45,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:45,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:47:46,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:46,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:47:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:47:49,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 12:47:51,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:47:51,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-04 12:47:51,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 12:47:51,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:47:51,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 12:47:54,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:47:54,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:47:56,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:47:56,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:58,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:47:59,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:47:59,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:48:03,443 INFO [train.py:1046] (1/4) Epoch 47, batch 4600, loss[loss=0.1557, simple_loss=0.2292, pruned_loss=0.04106, over 23766.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2318, pruned_loss=0.03598, over 4699166.93 frames. 
], batch size: 179, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:48:03,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:03,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:48:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:48:06,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:48:06,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:07,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 12:48:09,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:48:12,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:48:13,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:14,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:48:17,264 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.101e+02 2.353e+02 2.748e+02 3.773e+02, threshold=4.707e+02, percent-clipped=0.0 2023-10-04 12:48:21,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 12:48:21,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:25,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:27,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:48:28,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:33,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 12:48:33,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:48:35,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:48:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:41,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 12:48:43,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.51 vs. limit=12.0 2023-10-04 12:48:43,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:48:46,695 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 12:48:48,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:48:52,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:55,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:48:58,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:58,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 12:48:58,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:59,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 12:48:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:59,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 12:49:01,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:02,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:49:03,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:04,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 12:49:04,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 12:49:04,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 12:49:04,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:06,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:49:08,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:08,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:14,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1659986.6666666667, ans=0.2 2023-10-04 12:49:18,387 INFO [train.py:1046] (1/4) Epoch 47, batch 4650, loss[loss=0.1467, simple_loss=0.2228, pruned_loss=0.03528, over 23687.00 frames. 
], tot_loss[loss=0.1518, simple_loss=0.2318, pruned_loss=0.03596, over 4712309.25 frames. ], batch size: 232, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:49:18,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:49:20,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:49:21,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:49:21,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:49:21,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:22,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:49:23,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:49:26,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 12:49:30,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 12:49:30,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1660053.3333333333, ans=0.1 2023-10-04 12:49:31,934 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:49:33,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 12:49:33,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:49:34,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 12:49:34,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:49:34,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 12:49:34,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 12:49:34,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:36,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:49:37,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1660120.0, ans=0.5 2023-10-04 12:49:40,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 12:49:43,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:49:43,699 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 12:49:46,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:49:48,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 12:49:49,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:49,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:49:50,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 12:49:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:49:53,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:49:54,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1660186.6666666667, ans=0.09899494936611666 2023-10-04 12:49:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:02,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 12:50:06,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:50:07,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:07,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:50:08,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 12:50:08,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 12:50:10,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 12:50:10,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 12:50:12,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:19,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:50:19,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:50:19,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 12:50:19,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 12:50:19,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:50:19,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:50:22,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:50:23,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:50:23,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:50:25,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:50:28,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:28,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:50:28,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:50:29,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 12:50:31,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 12:50:32,624 INFO [train.py:1046] (1/4) Epoch 47, batch 4700, loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.03666, over 23819.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2323, pruned_loss=0.03615, over 4717129.93 frames. ], batch size: 212, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:50:33,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 12:50:40,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:41,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1660386.6666666667, ans=0.1 2023-10-04 12:50:42,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:42,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:50:42,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1660386.6666666667, ans=0.125 2023-10-04 12:50:44,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:50:44,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:50:46,210 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.148e+02 2.347e+02 2.759e+02 4.268e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-04 12:50:49,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. 
Duration: 0.56 2023-10-04 12:50:49,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 12:50:51,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:52,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:50:53,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:54,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:51:00,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1660520.0, ans=0.0 2023-10-04 12:51:02,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:51:04,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:51:05,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:51:11,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1660520.0, ans=0.125 2023-10-04 12:51:13,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 12:51:14,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 12:51:15,175 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:51:17,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:21,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 12:51:21,872 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:51:23,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.46 vs. limit=22.5 2023-10-04 12:51:26,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:51:27,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 12:51:28,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:28,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:31,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:51:32,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:51:32,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. 
Duration: 0.67 2023-10-04 12:51:33,032 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 12:51:34,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:38,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:38,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:38,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 12:51:39,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:42,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 12:51:46,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:51:47,647 INFO [train.py:1046] (1/4) Epoch 47, batch 4750, loss[loss=0.1642, simple_loss=0.2425, pruned_loss=0.04297, over 23234.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2333, pruned_loss=0.03635, over 4714782.40 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:51:47,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:51:52,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:51:52,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:51:54,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 12:51:54,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:51:57,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 12:51:58,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:51:58,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:58,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:51:59,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1660720.0, ans=0.125 2023-10-04 12:52:01,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1660786.6666666667, ans=0.125 2023-10-04 12:52:05,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 12:52:08,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:52:11,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. 
Duration: 0.96 2023-10-04 12:52:12,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:15,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:52:15,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:52:15,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:52:17,096 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 12:52:17,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 12:52:21,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 12:52:23,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:52:24,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1660853.3333333333, ans=0.2 2023-10-04 12:52:25,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:52:26,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:52:26,711 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. 
Duration: 0.512 2023-10-04 12:52:26,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:52:28,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:52:31,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:52:33,462 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.91 vs. limit=15.0 2023-10-04 12:52:34,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 12:52:34,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 12:52:35,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:52:36,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:52:36,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:52:38,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:52:38,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. 
Duration: 0.71 2023-10-04 12:52:40,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1660920.0, ans=0.0 2023-10-04 12:52:42,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 12:52:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:52:47,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:52:47,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 12:52:47,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:49,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:52:51,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:52:53,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:52:53,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:52:58,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:52:58,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. 
Duration: 0.88 2023-10-04 12:52:58,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 12:53:00,035 INFO [train.py:1046] (1/4) Epoch 47, batch 4800, loss[loss=0.1757, simple_loss=0.249, pruned_loss=0.05116, over 23503.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03652, over 4725686.45 frames. ], batch size: 285, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:53:00,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 12:53:01,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:53:01,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:53:04,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 12:53:07,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:09,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:14,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:53:15,289 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.075e+02 2.450e+02 3.076e+02 6.025e+02, threshold=4.900e+02, percent-clipped=3.0 2023-10-04 12:53:15,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 12:53:15,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:15,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1661120.0, ans=0.125 2023-10-04 12:53:17,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 12:53:17,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:53:19,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:53:20,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:53:22,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1661120.0, ans=0.125 2023-10-04 12:53:24,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:26,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:26,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:53:27,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:27,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. 
Duration: 0.4090625 2023-10-04 12:53:27,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:28,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:31,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:32,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.40 vs. limit=10.0 2023-10-04 12:53:33,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:35,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:35,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:53:37,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:53:37,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:38,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 12:53:38,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. 
Duration: 0.44 2023-10-04 12:53:41,983 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:42,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:53:43,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:53:43,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:53:43,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:53:44,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:53:44,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:53:44,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1661253.3333333333, ans=0.2 2023-10-04 12:53:44,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1661253.3333333333, ans=0.2 2023-10-04 12:53:48,144 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:53:51,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:53:51,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1661253.3333333333, ans=0.0 2023-10-04 12:53:54,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:53:58,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 12:53:58,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:59,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:59,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:54:01,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:54:01,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1661320.0, ans=0.125 2023-10-04 12:54:04,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:54:05,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:54:05,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:05,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 12:54:05,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:54:06,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:54:09,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:11,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:11,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:54:13,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 12:54:13,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 12:54:14,694 INFO [train.py:1046] (1/4) Epoch 47, batch 4850, loss[loss=0.1728, simple_loss=0.2548, pruned_loss=0.04542, over 23938.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.03678, over 4715445.60 frames. ], batch size: 86, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:54:14,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:54:14,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:54:14,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:54:14,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:17,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:54:19,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1661386.6666666667, ans=0.125 2023-10-04 12:54:26,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 12:54:26,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:27,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1661386.6666666667, ans=0.125 2023-10-04 12:54:31,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:54:32,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:54:32,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:33,057 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.34 vs. limit=15.0 2023-10-04 12:54:36,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 12:54:37,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:54:39,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:54:39,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 12:54:42,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:54:44,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:54:44,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:54:45,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:54:45,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 12:54:49,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:54:49,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:54:55,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:54:55,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. 
Duration: 0.8 2023-10-04 12:54:56,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 12:54:57,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:55:00,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1661586.6666666667, ans=0.125 2023-10-04 12:55:03,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:55:04,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 12:55:04,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:55:04,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:55:06,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:55:06,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 12:55:06,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:55:07,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 12:55:07,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 12:55:07,766 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:55:09,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 12:55:17,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:55:23,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:55:23,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:55:28,919 INFO [train.py:1046] (1/4) Epoch 47, batch 4900, loss[loss=0.1468, simple_loss=0.2282, pruned_loss=0.0327, over 23158.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.234, pruned_loss=0.03675, over 4712705.89 frames. ], batch size: 51, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:55:28,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 12:55:28,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:55:33,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:55:34,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:34,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 12:55:34,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1661720.0, ans=0.1 2023-10-04 12:55:38,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 12:55:43,497 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.085e+02 2.441e+02 2.898e+02 5.040e+02, threshold=4.881e+02, percent-clipped=1.0 2023-10-04 12:55:44,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 12:55:47,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 12:55:49,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 12:55:49,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:55:49,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:49,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:55:49,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:55:49,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:55:50,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. 
Duration: 0.93 2023-10-04 12:55:55,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 12:55:55,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:55:57,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:55:57,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:55:59,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:56:00,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:01,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:01,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 12:56:04,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:56:05,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:56:05,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 12:56:05,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-10-04 12:56:07,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1661853.3333333333, ans=0.0 2023-10-04 12:56:09,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 12:56:11,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:56:11,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:56:11,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:56:13,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:14,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 12:56:14,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:56:14,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 12:56:17,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:18,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1661920.0, ans=0.1 2023-10-04 12:56:19,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 12:56:20,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:56:25,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 12:56:25,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:56:28,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:56:28,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 12:56:32,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:56:34,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:56:35,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 12:56:35,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:56:35,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:56:36,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.79 vs. limit=22.5 2023-10-04 12:56:38,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:56:41,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:56:41,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:56:41,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:56:42,880 INFO [train.py:1046] (1/4) Epoch 47, batch 4950, loss[loss=0.1273, simple_loss=0.2029, pruned_loss=0.02586, over 24285.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2325, pruned_loss=0.03638, over 4712434.56 frames. ], batch size: 56, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:56:42,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 12:56:44,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:56:45,807 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:56:47,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:56:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:56:50,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 12:56:50,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. 
Duration: 0.8 2023-10-04 12:56:50,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:56:51,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 12:56:51,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:51,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:56:53,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:56:53,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:56:56,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:56,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:56:57,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:56:59,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:57:00,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:01,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:57:04,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:57:08,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:10,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1662186.6666666667, ans=0.0 2023-10-04 12:57:11,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:57:13,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:13,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:16,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:57:16,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 12:57:17,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 12:57:19,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:21,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:57:21,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 12:57:21,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:57:21,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:57:22,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:57:25,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:57:27,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:57:29,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:57:31,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:31,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:32,421 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.21 vs. limit=12.0 2023-10-04 12:57:33,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 12:57:33,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:57:34,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 12:57:37,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1662253.3333333333, ans=0.2 2023-10-04 12:57:38,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:57:40,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:57:40,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:57:40,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:40,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:57:41,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:57:43,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:57:43,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1662320.0, ans=0.125 2023-10-04 12:57:44,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:57:44,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 12:57:45,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 12:57:51,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:57:56,421 INFO [train.py:1046] (1/4) Epoch 47, batch 5000, loss[loss=0.1608, simple_loss=0.2484, pruned_loss=0.03661, over 23736.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2318, pruned_loss=0.03627, over 4709416.33 frames. ], batch size: 85, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:57:56,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 12:57:56,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:57:56,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1662386.6666666667, ans=0.2 2023-10-04 12:58:04,030 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:58:04,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:58:05,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 12:58:06,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 12:58:09,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 12:58:10,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 12:58:10,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:58:10,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:58:12,123 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.008e+02 2.119e+02 2.497e+02 3.372e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-04 12:58:13,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 12:58:13,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:14,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:58:15,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 12:58:15,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:58:15,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:58:17,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 12:58:17,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. 
Duration: 0.87 2023-10-04 12:58:18,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:58:18,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 12:58:18,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:58:19,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:19,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:58:19,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 12:58:21,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 12:58:21,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1662453.3333333333, ans=0.0 2023-10-04 12:58:22,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 12:58:22,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:23,148 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.55 vs. limit=15.0 2023-10-04 12:58:24,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:58:24,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1662520.0, ans=0.1 2023-10-04 12:58:26,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 12:58:26,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:58:27,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:29,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:58:29,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:58:32,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 12:58:33,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:58:34,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:58:37,685 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 12:58:41,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:58:43,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 12:58:43,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:58:45,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 12:58:45,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:45,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:58:45,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:58:47,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 12:58:49,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:58:51,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:58:53,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:58:56,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1662653.3333333333, ans=0.04949747468305833 2023-10-04 12:58:58,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. 
Duration: 0.89 2023-10-04 12:59:01,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:09,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.11 vs. limit=15.0 2023-10-04 12:59:09,861 INFO [train.py:1046] (1/4) Epoch 47, batch 5050, loss[loss=0.1492, simple_loss=0.2275, pruned_loss=0.03546, over 23651.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2326, pruned_loss=0.03634, over 4708511.04 frames. ], batch size: 232, lr: 2.15e-03, grad_scale: 4.0 2023-10-04 12:59:11,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:59:12,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:12,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:59:12,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:59:12,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:59:14,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:59:14,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:18,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 12:59:18,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 12:59:20,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:59:21,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:59:22,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:59:23,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 12:59:26,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:59:26,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:59:29,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:59:29,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:59:30,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:59:39,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 12:59:40,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 12:59:40,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:59:41,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 12:59:42,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:59:43,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:43,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:59:45,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:59:45,159 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 12:59:46,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 12:59:46,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:49,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:59:52,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:53,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. 
Duration: 0.87 2023-10-04 12:59:55,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:59:56,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 12:59:57,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1662920.0, ans=0.125 2023-10-04 12:59:58,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:59:58,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:00:00,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:00,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:00:00,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1662920.0, ans=0.125 2023-10-04 13:00:00,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1662920.0, ans=0.0 2023-10-04 13:00:03,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:00:06,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:00:06,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:00:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:00:06,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:00:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 13:00:07,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:00:09,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:00:10,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-10-04 13:00:13,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:00:13,278 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 13:00:13,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:00:13,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1662986.6666666667, ans=0.0 2023-10-04 13:00:14,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:00:14,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:00:15,842 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 13:00:16,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-10-04 13:00:17,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:00:17,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 13:00:17,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:22,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:22,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:22,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 13:00:23,425 INFO [train.py:1046] (1/4) Epoch 47, batch 5100, loss[loss=0.1946, simple_loss=0.2715, pruned_loss=0.05881, over 18931.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2338, pruned_loss=0.03663, over 4711093.04 frames. ], batch size: 388, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:00:23,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 13:00:24,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:00:24,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:00:24,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:00:28,055 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 13:00:29,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:00:33,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 13:00:33,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 13:00:34,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:35,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:00:38,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:00:38,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 13:00:38,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1663120.0, ans=0.125 2023-10-04 13:00:40,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. 
Duration: 0.76 2023-10-04 13:00:41,329 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.103e+02 2.372e+02 2.934e+02 4.830e+02, threshold=4.743e+02, percent-clipped=2.0 2023-10-04 13:00:44,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:44,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:00:45,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1663120.0, ans=0.125 2023-10-04 13:00:48,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:53,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 13:00:53,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:00:54,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:54,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 13:00:57,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:00:59,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1663186.6666666667, ans=0.0 2023-10-04 13:01:00,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:00,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 13:01:03,228 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 13:01:03,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:03,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 13:01:03,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 13:01:07,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:01:15,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:18,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 13:01:18,911 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 13:01:18,926 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 13:01:20,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. 
Duration: 0.81 2023-10-04 13:01:20,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:22,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 13:01:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 13:01:30,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:01:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:01:31,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 13:01:34,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:01:34,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 13:01:37,697 INFO [train.py:1046] (1/4) Epoch 47, batch 5150, loss[loss=0.1707, simple_loss=0.2451, pruned_loss=0.0481, over 22867.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03726, over 4706621.65 frames. ], batch size: 323, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:01:39,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:01:39,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:01:39,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:01:41,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:01:41,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:01:42,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:01:43,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 13:01:43,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 13:01:43,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 13:01:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:01:45,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 13:01:46,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:46,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 13:01:48,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:01:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:01:54,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:01:55,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 13:01:55,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:55,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:01:58,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:01:58,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:01:58,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:01:59,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:01:59,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:02:01,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 13:02:03,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 13:02:03,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:02:04,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 13:02:06,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 13:02:08,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:02:12,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:02:14,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 13:02:15,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:02:23,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:02:23,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:02:27,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:02:27,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:02:29,603 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.86 vs. limit=6.0 2023-10-04 13:02:31,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 13:02:34,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:02:34,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:02:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:02:39,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:02:39,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:02:41,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 13:02:45,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:02:46,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 13:02:48,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1663653.3333333333, ans=0.2 2023-10-04 13:02:50,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-10-04 13:02:51,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:02:51,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:02:51,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:02:51,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:02:51,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:02:51,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:02:52,696 INFO [train.py:1046] (1/4) Epoch 47, batch 5200, loss[loss=0.1472, simple_loss=0.2281, pruned_loss=0.03317, over 23878.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03744, over 4692528.67 frames. ], batch size: 86, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 13:02:55,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:02:56,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 13:02:59,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:04,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 13:03:04,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:03:05,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:07,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:09,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:03:09,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:11,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.051e+02 2.238e+02 2.543e+02 5.592e+02, threshold=4.477e+02, percent-clipped=1.0 2023-10-04 13:03:12,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 13:03:15,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:03:16,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:18,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-10-04 13:03:20,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:03:20,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:03:21,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 13:03:21,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 13:03:23,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 13:03:24,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:24,312 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 13:03:24,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:25,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:25,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:03:25,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1663853.3333333333, ans=0.0 2023-10-04 13:03:26,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. 
Duration: 0.93 2023-10-04 13:03:26,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:03:28,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:30,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 13:03:31,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 13:03:31,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 13:03:36,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 13:03:37,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:03:39,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1663920.0, ans=0.0 2023-10-04 13:03:43,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:03:44,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:03:44,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 13:03:46,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 13:03:46,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:03:46,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:46,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:03:51,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:03:52,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:03:53,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1663986.6666666667, ans=0.0 2023-10-04 13:03:55,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:03:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:01,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:04:02,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 13:04:02,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 13:04:04,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:04:05,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:06,797 INFO [train.py:1046] (1/4) Epoch 47, batch 5250, loss[loss=0.1435, simple_loss=0.2122, pruned_loss=0.03735, over 23606.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2346, pruned_loss=0.03764, over 4671746.40 frames. ], batch size: 256, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 13:04:06,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:04:06,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:04:10,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:04:12,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:04:13,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-10-04 13:04:13,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:04:16,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:04:20,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:04:22,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:04:22,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1664120.0, ans=0.0 2023-10-04 13:04:25,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:04:25,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:04:27,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 13:04:27,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:04:29,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:35,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-10-04 13:05:01,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.63 vs. 
limit=15.0 2023-10-04 13:05:05,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1664320.0, ans=0.125 2023-10-04 13:05:05,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1664320.0, ans=0.07 2023-10-04 13:05:08,895 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.02 vs. limit=15.0 2023-10-04 13:05:15,766 INFO [train.py:1046] (1/4) Epoch 47, batch 5300, loss[loss=0.1417, simple_loss=0.2285, pruned_loss=0.02745, over 24628.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2335, pruned_loss=0.03737, over 4687553.47 frames. ], batch size: 65, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:05:29,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:05:29,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 13:05:29,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 13:05:29,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:30,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:30,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:30,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:05:30,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:30,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:05:30,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:30,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:05:30,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:05:30,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 13:05:30,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 13:05:30,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 13:05:30,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:05:30,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 13:05:30,934 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 13:05:31,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:05:31,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:31,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:05:31,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:05:31,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:05:32,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:05:32,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:32,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:32,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:05:32,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:32,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:05:32,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:05:32,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:05:33,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 13:05:33,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:05:33,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:33,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 13:05:33,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 13:05:33,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:05:33,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:05:33,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 13:05:34,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 13:05:34,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:05:34,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 13:05:34,723 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:05:34,809 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 13:05:34,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 13:05:34,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:05:34,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:35,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 13:05:35,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 13:05:35,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 13:05:35,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:05:37,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1664466.6666666667, ans=0.125 2023-10-04 13:05:41,673 INFO [train.py:1046] (1/4) Epoch 48, batch 0, loss[loss=0.1588, simple_loss=0.2343, pruned_loss=0.04167, over 23396.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2343, pruned_loss=0.04167, over 23396.00 frames. 
], batch size: 285, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:05:41,674 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 13:05:54,828 INFO [train.py:1078] (1/4) Epoch 48, validation: loss=0.3604, simple_loss=0.2801, pruned_loss=0.2204, over 1125622.00 frames. 2023-10-04 13:05:54,829 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 13:05:56,147 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.072e+02 2.267e+02 2.671e+02 6.295e+02, threshold=4.535e+02, percent-clipped=1.0 2023-10-04 13:05:56,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 13:05:56,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:05:59,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:06:05,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:05,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:06:05,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:06,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 13:06:07,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-10-04 13:06:09,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:09,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1664533.3333333333, ans=0.0 2023-10-04 13:06:11,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:15,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:17,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:17,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:06:17,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:06:18,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 13:06:20,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=1664533.3333333333, ans=0.02 2023-10-04 13:06:21,380 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:06:27,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 13:06:28,066 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.57 vs. limit=6.0 2023-10-04 13:06:28,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:30,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 13:06:33,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1664600.0, ans=0.0 2023-10-04 13:06:34,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:06:34,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:06:35,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:06:36,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1664600.0, ans=0.125 2023-10-04 13:06:40,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:06:43,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:06:48,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-04 13:06:49,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.35 vs. limit=10.0 2023-10-04 13:06:52,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 13:06:52,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:06:52,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:54,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:06:54,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:56,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 13:06:57,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:57,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:59,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1664733.3333333333, ans=0.0 2023-10-04 13:07:02,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:07:04,844 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. 
Duration: 0.768 2023-10-04 13:07:06,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:07:09,446 INFO [train.py:1046] (1/4) Epoch 48, batch 50, loss[loss=0.1427, simple_loss=0.2239, pruned_loss=0.03072, over 24467.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2354, pruned_loss=0.03571, over 1074733.07 frames. ], batch size: 58, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:07:11,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:07:12,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:07:12,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 13:07:14,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:07:14,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:07:16,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:07:18,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:07:20,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:07:22,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. 
Duration: 0.68 2023-10-04 13:07:22,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:27,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.41 vs. limit=15.0 2023-10-04 13:07:30,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:07:32,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 13:07:33,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 13:07:35,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:07:37,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:07:37,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:37,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:07:38,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:07:38,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:07:38,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:07:40,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1664933.3333333333, ans=0.0 2023-10-04 13:07:46,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:07:48,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:07:48,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:07:50,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 13:07:51,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:07:52,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:07:52,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 13:07:54,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:07:55,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 13:08:04,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:04,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 13:08:04,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:05,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:08:05,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:08:08,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 13:08:08,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 13:08:10,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:11,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:08:12,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:08:13,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:08:13,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 13:08:14,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 13:08:16,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. 
Duration: 0.47775 2023-10-04 13:08:18,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:18,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:08:19,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 13:08:19,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 13:08:21,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:21,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:08:24,076 INFO [train.py:1046] (1/4) Epoch 48, batch 100, loss[loss=0.1615, simple_loss=0.2529, pruned_loss=0.03501, over 24597.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2347, pruned_loss=0.03507, over 1891980.59 frames. ], batch size: 71, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:08:24,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:08:24,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:08:25,442 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.052e+02 2.272e+02 2.677e+02 5.287e+02, threshold=4.544e+02, percent-clipped=2.0 2023-10-04 13:08:25,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:08:29,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:08:32,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:08:34,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 13:08:34,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:38,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:08:38,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:08:38,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:08:38,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:08:38,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:08:40,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 13:08:41,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:08:41,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:08:42,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:08:44,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1665200.0, ans=0.125 2023-10-04 13:08:45,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 13:08:47,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:48,054 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.96 vs. limit=10.0 2023-10-04 13:08:48,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:50,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:08:50,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1665200.0, ans=0.1 2023-10-04 13:08:52,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:08:57,932 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 13:08:57,954 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. 
Duration: 0.896 2023-10-04 13:09:00,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:00,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:09:03,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:09:04,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:09:06,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:11,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:12,430 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 13:09:13,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.30 vs. limit=15.0 2023-10-04 13:09:13,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 13:09:16,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:09:16,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1665333.3333333333, ans=0.0 2023-10-04 13:09:18,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 13:09:19,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:24,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:24,886 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-10-04 13:09:27,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:09:28,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:09:30,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:31,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:09:33,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:33,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:09:33,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:34,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 13:09:34,462 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. 
Duration: 0.896 2023-10-04 13:09:34,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:35,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:09:35,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:35,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:37,760 INFO [train.py:1046] (1/4) Epoch 48, batch 150, loss[loss=0.1602, simple_loss=0.2411, pruned_loss=0.03963, over 23753.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2357, pruned_loss=0.03559, over 2518990.19 frames. ], batch size: 85, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:09:37,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 13:09:37,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:09:37,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:09:37,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:39,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:09:39,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:09:39,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:09:40,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.89 vs. limit=15.0 2023-10-04 13:09:40,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:09:43,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:46,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:09:46,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:09:46,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:49,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:49,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:51,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:09:52,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 13:09:55,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 13:09:55,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 13:09:55,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 13:10:00,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:10:00,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:10:01,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:10:02,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1665533.3333333333, ans=0.1 2023-10-04 13:10:03,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:10:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:03,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:03,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:04,614 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-10-04 13:10:06,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:12,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:10:12,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1665600.0, ans=0.1 2023-10-04 13:10:16,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:10:16,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 13:10:16,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1665600.0, ans=0.0 2023-10-04 13:10:19,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:10:19,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:10:19,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:10:21,104 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:10:23,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:10:23,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:10:25,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:10:26,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:28,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 13:10:34,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:35,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:10:35,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:10:35,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:10:38,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:39,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 13:10:41,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:10:43,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:10:44,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 13:10:46,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:10:47,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 13:10:47,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:10:47,394 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 13:10:48,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1665733.3333333333, ans=0.125 2023-10-04 13:10:51,288 INFO [train.py:1046] (1/4) Epoch 48, batch 200, loss[loss=0.1578, simple_loss=0.2437, pruned_loss=0.03595, over 23435.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2364, pruned_loss=0.03637, over 3007294.83 frames. ], batch size: 106, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:10:51,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:54,669 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.085e+02 2.349e+02 2.813e+02 4.148e+02, threshold=4.699e+02, percent-clipped=0.0 2023-10-04 13:10:54,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:54,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 13:10:56,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1665800.0, ans=0.1 2023-10-04 13:10:57,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 13:10:59,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:10:59,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:02,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 13:11:03,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:11:05,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:06,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:09,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:11:10,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:11:10,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:11:11,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1665866.6666666667, ans=0.125 2023-10-04 13:11:30,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:11:30,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:11:30,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1665933.3333333333, ans=0.0 2023-10-04 13:11:31,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:11:31,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:11:31,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:11:31,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:11:33,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1665933.3333333333, ans=0.0 2023-10-04 13:11:35,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:36,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 13:11:36,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:11:37,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:11:39,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 13:11:39,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:11:40,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:43,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:11:49,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:11:56,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:56,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:12:01,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:04,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 13:12:05,742 INFO [train.py:1046] (1/4) Epoch 48, batch 250, loss[loss=0.1537, simple_loss=0.236, pruned_loss=0.03575, over 24024.00 frames. 
], tot_loss[loss=0.1549, simple_loss=0.2364, pruned_loss=0.03672, over 3390548.90 frames. ], batch size: 80, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:12:05,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:12:05,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:12:05,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:12:07,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:12:07,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 13:12:09,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:12:09,248 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 13:12:10,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:13,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:12:13,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:15,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:12:17,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:12:17,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:19,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:12:20,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:12:31,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1666200.0, ans=0.125 2023-10-04 13:12:31,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1666200.0, ans=0.05 2023-10-04 13:12:32,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:12:34,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:12:35,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:12:42,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:12:42,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 13:12:43,084 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:12:44,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:12:44,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:12:46,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:12:46,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:12:46,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:12:46,882 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.07 vs. limit=15.0 2023-10-04 13:12:49,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:12:51,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 13:12:51,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:12:53,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:12:53,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 13:12:53,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:12:54,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:12:55,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:12:55,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:12:57,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:12:57,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1666333.3333333333, ans=0.125 2023-10-04 13:12:59,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:12:59,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:02,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:13:06,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:13:11,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 13:13:11,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1666400.0, ans=0.0 2023-10-04 13:13:14,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:16,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:13:18,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 13:13:20,238 INFO [train.py:1046] (1/4) Epoch 48, batch 300, loss[loss=0.123, simple_loss=0.1816, pruned_loss=0.03222, over 22673.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.03588, over 3668963.14 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:13:20,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:13:20,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:13:22,912 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.014e+02 2.190e+02 2.558e+02 4.207e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-04 13:13:23,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 13:13:23,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:13:24,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 13:13:24,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 13:13:24,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1666466.6666666667, ans=0.95 2023-10-04 13:13:29,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:13:29,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:13:34,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:13:34,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 13:13:36,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:36,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1666533.3333333333, ans=0.2 2023-10-04 13:13:37,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:13:37,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 13:13:37,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:13:42,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 13:13:46,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:13:46,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 13:13:49,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 13:13:49,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:13:52,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:13:55,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:13:55,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 13:13:55,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:13:56,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:13:58,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:13:58,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:02,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 13:14:02,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 13:14:02,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:14:07,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:09,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 13:14:09,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:13,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1666666.6666666667, ans=0.0 2023-10-04 13:14:14,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:14:17,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:14:17,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 13:14:20,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:20,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:14:22,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:14:22,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:14:24,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 13:14:24,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:14:24,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:25,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 13:14:28,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:28,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:30,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:14:30,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:31,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:34,147 INFO [train.py:1046] (1/4) Epoch 48, batch 350, loss[loss=0.1506, simple_loss=0.2367, pruned_loss=0.03224, over 24474.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2323, pruned_loss=0.03604, over 3881964.24 frames. 
], batch size: 66, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:14:35,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:14:36,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 13:14:38,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:44,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:14:47,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:47,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:49,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=22.5 2023-10-04 13:14:50,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 13:14:51,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:14:51,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 13:14:55,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:14:55,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 13:14:56,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:57,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 13:14:59,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:15:00,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:15:02,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:15:03,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:03,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:03,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:15:03,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:05,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:15:06,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 13:15:06,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:15:13,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:15:14,518 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:15:14,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:15:15,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:20,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 13:15:20,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:15:24,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:24,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:24,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:15:25,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 13:15:28,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:15:28,607 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 13:15:30,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 13:15:30,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:33,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:15:33,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 13:15:34,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:39,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:15:39,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:41,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:41,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:43,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:46,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 13:15:47,970 INFO [train.py:1046] (1/4) Epoch 48, batch 400, loss[loss=0.1558, simple_loss=0.2342, pruned_loss=0.03874, over 23727.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2326, pruned_loss=0.03585, over 4063172.70 frames. ], batch size: 179, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:15:48,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1667133.3333333333, ans=0.125 2023-10-04 13:15:49,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:15:49,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 13:15:49,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:50,745 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.047e+02 2.274e+02 2.611e+02 3.617e+02, threshold=4.549e+02, percent-clipped=0.0 2023-10-04 13:15:50,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:15:52,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:15:53,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:15:56,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:57,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:15:59,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 13:16:01,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 13:16:01,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:16:02,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 13:16:02,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1667200.0, ans=0.1 2023-10-04 13:16:02,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1667200.0, ans=0.2 2023-10-04 13:16:03,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:16:04,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1667200.0, ans=0.0 2023-10-04 13:16:06,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:16:06,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:06,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 13:16:08,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:16:08,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:16:08,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:08,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:16:13,136 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 13:16:13,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 13:16:17,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:16:18,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:16:19,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 13:16:19,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1667266.6666666667, ans=0.125 2023-10-04 13:16:20,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 13:16:24,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:16:27,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:16:33,455 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 13:16:36,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:16:37,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 13:16:39,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:40,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:16:41,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 13:16:45,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:16:47,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:16:48,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:16:51,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:16:52,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 13:16:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 13:16:55,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 13:16:56,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:16:56,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:16:58,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 13:16:59,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:16:59,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:17:00,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:17:01,175 INFO [train.py:1046] (1/4) Epoch 48, batch 450, loss[loss=0.1776, simple_loss=0.2495, pruned_loss=0.05285, over 19459.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2327, pruned_loss=0.03594, over 4184142.77 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:17:01,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 13:17:01,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:17:01,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:17:02,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:17:02,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 13:17:04,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:17:06,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:17:07,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:17:18,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:18,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:17:18,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1667533.3333333333, ans=0.125 2023-10-04 13:17:20,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 13:17:21,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 13:17:23,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:17:25,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:17:27,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:17:29,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:17:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:17:33,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1667600.0, ans=0.0 2023-10-04 13:17:34,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 13:17:35,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 13:17:38,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 13:17:38,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:17:39,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:17:41,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:17:43,575 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 13:17:43,583 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-10-04 13:17:44,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:46,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:17:47,801 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 13:17:51,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:17:51,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:17:52,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:17:53,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 13:17:53,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1667666.6666666667, ans=0.125 2023-10-04 13:17:55,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:17:56,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:17:56,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:17:59,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. 
Duration: 0.83 2023-10-04 13:18:03,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:18:03,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 13:18:05,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 13:18:06,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:18:08,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1667733.3333333333, ans=0.1 2023-10-04 13:18:11,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:18:14,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:18:15,894 INFO [train.py:1046] (1/4) Epoch 48, batch 500, loss[loss=0.1417, simple_loss=0.2285, pruned_loss=0.02741, over 23465.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03606, over 4306065.81 frames. ], batch size: 134, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:18:15,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:18:15,985 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-10-04 13:18:18,920 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 1.967e+02 2.163e+02 2.442e+02 3.421e+02, threshold=4.326e+02, percent-clipped=0.0 2023-10-04 13:18:19,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:18:20,009 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=15.0 2023-10-04 13:18:20,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:18:20,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:18:20,494 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 13:18:21,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 13:18:21,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:18:25,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:18:27,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 13:18:27,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1667800.0, ans=0.125 2023-10-04 13:18:29,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 13:18:31,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:18:31,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:18:33,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:18:33,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1667866.6666666667, ans=0.125 2023-10-04 13:18:39,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1667866.6666666667, ans=0.025 2023-10-04 13:18:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:45,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:18:45,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:18:46,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:46,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 13:18:46,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 13:18:48,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1667933.3333333333, ans=0.07 2023-10-04 13:18:49,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:18:51,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:18:51,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:18:51,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:52,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 13:18:56,677 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 13:18:58,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:00,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. 
Duration: 0.636375 2023-10-04 13:19:04,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 13:19:07,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:19:08,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:13,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:14,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:20,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:22,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 13:19:22,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:22,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:25,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 13:19:25,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1668066.6666666667, ans=22.5 2023-10-04 13:19:26,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 13:19:28,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:29,991 INFO [train.py:1046] (1/4) Epoch 48, batch 550, loss[loss=0.146, simple_loss=0.2347, pruned_loss=0.02859, over 24642.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.235, pruned_loss=0.03681, over 4392359.20 frames. ], batch size: 65, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:19:31,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1668133.3333333333, ans=0.125 2023-10-04 13:19:34,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 13:19:35,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1668133.3333333333, ans=0.125 2023-10-04 13:19:36,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 13:19:36,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:36,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 13:19:36,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:19:37,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:38,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:19:39,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:39,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:19:39,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1668133.3333333333, ans=0.09899494936611666 2023-10-04 13:19:42,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:19:44,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:44,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 13:19:44,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:19:49,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:19:50,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:52,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:19:53,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:55,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. 
Duration: 0.95 2023-10-04 13:19:56,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 13:19:58,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:20:02,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:20:02,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:20:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:20:06,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:07,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1668266.6666666667, ans=0.09899494936611666 2023-10-04 13:20:08,203 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 13:20:08,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:20:09,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:20:12,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:20:14,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 13:20:14,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:20:16,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:17,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 13:20:20,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 13:20:20,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:20,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:20:21,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:20:21,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:20:23,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:20:24,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:20:26,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:20:27,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:20:28,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 13:20:29,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:20:30,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:31,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.12 vs. limit=12.0 2023-10-04 13:20:32,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:20:32,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:33,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:20:33,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 13:20:37,162 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=12.0 2023-10-04 13:20:40,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 13:20:41,635 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.49 vs. 
limit=15.0 2023-10-04 13:20:43,295 INFO [train.py:1046] (1/4) Epoch 48, batch 600, loss[loss=0.1447, simple_loss=0.2138, pruned_loss=0.03786, over 23651.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.235, pruned_loss=0.03698, over 4460018.60 frames. ], batch size: 256, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:20:44,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 13:20:44,291 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:20:44,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.86 vs. limit=10.0 2023-10-04 13:20:45,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:20:45,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:20:45,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:46,919 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.073e+02 2.337e+02 2.691e+02 3.660e+02, threshold=4.674e+02, percent-clipped=0.0 2023-10-04 13:20:53,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:20:55,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:20:57,263 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-10-04 13:20:59,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:21:01,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:21:03,215 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 13:21:04,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:21:10,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 13:21:11,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.62 vs. limit=22.5 2023-10-04 13:21:12,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:21:12,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:14,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:21:19,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:21:19,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:21:20,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:21:23,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1668600.0, ans=0.0 2023-10-04 13:21:26,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1668666.6666666667, ans=0.0 2023-10-04 13:21:29,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:21:32,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:21:32,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:21:32,097 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:38,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 13:21:43,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:21:43,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:21:45,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 13:21:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 13:21:48,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 13:21:49,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:21:49,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:21:54,423 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=22.5 2023-10-04 13:21:56,553 INFO [train.py:1046] (1/4) Epoch 48, batch 650, loss[loss=0.1421, simple_loss=0.219, pruned_loss=0.03259, over 24280.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2344, pruned_loss=0.03677, over 4514797.07 frames. ], batch size: 56, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:21:56,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 13:21:56,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1668800.0, ans=0.09899494936611666 2023-10-04 13:21:58,031 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:21:58,321 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:21:59,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:22:00,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 13:22:02,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:06,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 13:22:06,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:22:11,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:22:11,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:16,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:19,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 13:22:19,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1668866.6666666667, ans=0.0 2023-10-04 13:22:21,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:22:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:23,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:22:23,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. 
Duration: 0.5333125 2023-10-04 13:22:26,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:28,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:28,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:22:29,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:30,952 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:22:34,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:22:34,270 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 13:22:34,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:34,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:22:38,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:38,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:22:39,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:22:39,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:22:41,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 13:22:43,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:22:43,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:22:43,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:22:43,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:22:45,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:22:46,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 13:22:49,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 13:22:49,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:49,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:22:50,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 13:22:50,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:22:52,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:52,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=15.0 2023-10-04 13:22:56,729 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:56,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:22:56,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1669066.6666666667, ans=0.1 2023-10-04 13:22:58,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:23:00,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:23:00,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:23:01,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:23:09,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 13:23:09,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:09,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:23:10,817 INFO [train.py:1046] (1/4) Epoch 48, batch 700, loss[loss=0.1618, simple_loss=0.2554, pruned_loss=0.03408, over 24362.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2329, pruned_loss=0.03625, over 4546189.82 frames. ], batch size: 74, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:23:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:16,004 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.014e+02 2.298e+02 2.689e+02 4.568e+02, threshold=4.597e+02, percent-clipped=0.0 2023-10-04 13:23:16,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 13:23:16,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 13:23:18,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 13:23:19,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:19,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1669133.3333333333, ans=0.125 2023-10-04 13:23:21,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 13:23:22,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 13:23:27,860 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:23:29,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:23:31,523 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.18 vs. limit=15.0 2023-10-04 13:23:32,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:32,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:23:32,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:23:33,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1669200.0, ans=0.125 2023-10-04 13:23:34,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:39,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 13:23:39,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 13:23:40,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 13:23:44,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 13:23:47,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:23:48,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:23:49,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:23:54,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:23:54,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 13:23:58,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:58,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:23:59,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 13:24:00,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1669333.3333333333, ans=0.125 2023-10-04 13:24:02,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 13:24:04,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:08,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:08,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1669333.3333333333, ans=0.125 2023-10-04 13:24:11,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:24:11,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 13:24:14,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 13:24:14,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 13:24:19,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:21,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:24:21,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:24:24,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. 
Duration: 0.95 2023-10-04 13:24:25,873 INFO [train.py:1046] (1/4) Epoch 48, batch 750, loss[loss=0.145, simple_loss=0.2279, pruned_loss=0.03102, over 24314.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2318, pruned_loss=0.03586, over 4573060.30 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:24:27,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 13:24:27,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 13:24:27,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 13:24:27,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1669466.6666666667, ans=0.07 2023-10-04 13:24:28,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 13:24:28,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 13:24:29,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1669466.6666666667, ans=0.1 2023-10-04 13:24:30,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:24:30,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 13:24:32,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:24:32,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:24:34,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:24:34,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1669466.6666666667, ans=0.125 2023-10-04 13:24:35,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:35,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:24:37,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:24:40,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:24:40,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:24:40,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1669533.3333333333, ans=0.125 2023-10-04 13:24:40,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1669533.3333333333, ans=0.5 2023-10-04 13:24:41,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:24:43,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:24:45,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:46,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 13:24:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:24:49,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:52,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:52,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:24:53,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 13:24:53,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:24:56,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 13:24:56,491 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 13:24:57,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-10-04 13:24:57,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:24:57,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1669600.0, ans=0.1 2023-10-04 13:24:59,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:25:00,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:25:07,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:25:07,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:07,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:25:09,024 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:25:10,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:25:12,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:12,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 13:25:12,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 13:25:13,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 13:25:15,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:25:17,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:25:17,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 13:25:19,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:23,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1669733.3333333333, ans=0.0 2023-10-04 13:25:24,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:25:26,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:25:27,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:29,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:25:30,675 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:25:33,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. 
Duration: 0.75 2023-10-04 13:25:33,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:25:34,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:25:35,391 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.51 vs. limit=15.0 2023-10-04 13:25:35,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:25:37,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:38,589 INFO [train.py:1046] (1/4) Epoch 48, batch 800, loss[loss=0.1651, simple_loss=0.2393, pruned_loss=0.04549, over 23716.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2327, pruned_loss=0.03596, over 4617040.57 frames. ], batch size: 212, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:25:38,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:38,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:25:38,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1669800.0, ans=0.125 2023-10-04 13:25:43,439 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.959e+02 2.276e+02 2.649e+02 3.901e+02, threshold=4.552e+02, percent-clipped=0.0 2023-10-04 13:25:44,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 13:25:44,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:48,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:25:48,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:48,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:48,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:50,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:54,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:25:54,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1669866.6666666667, ans=0.0 2023-10-04 13:25:55,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:25:57,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 13:25:58,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:59,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 13:25:59,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:25:59,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:25:59,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 13:26:01,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:01,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 13:26:01,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1669866.6666666667, ans=0.2 2023-10-04 13:26:03,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:06,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:08,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:26:08,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:26:11,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:11,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:26:14,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:26:15,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:26:15,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 13:26:17,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 13:26:17,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 13:26:17,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:26:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:26:19,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:19,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:26:24,814 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 13:26:24,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 13:26:26,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 13:26:29,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:26:33,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:26:37,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:37,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 13:26:37,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:26:40,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 13:26:46,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:26:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:26:48,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1670066.6666666667, ans=0.0 2023-10-04 13:26:49,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 13:26:49,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:26:51,810 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:26:53,139 INFO [train.py:1046] (1/4) Epoch 48, batch 850, loss[loss=0.2088, simple_loss=0.2796, pruned_loss=0.06895, over 19431.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03626, over 4645224.82 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:26:53,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 13:26:53,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:26:53,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:55,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:26:57,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:26:58,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:27:00,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 13:27:00,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 13:27:00,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 13:27:02,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 13:27:02,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:27:04,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:05,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:27:05,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:27:07,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1670200.0, ans=0.125 2023-10-04 13:27:11,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:27:11,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:11,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 13:27:13,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 13:27:14,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1670200.0, ans=0.5 2023-10-04 13:27:19,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:27:20,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-10-04 13:27:23,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 13:27:23,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 13:27:27,448 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 13:27:27,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:27:27,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:27:27,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:27:30,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:31,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:32,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 13:27:35,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:27:35,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:36,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 13:27:38,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:27:39,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:27:41,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:27:41,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 13:27:45,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:27:45,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:27:45,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:27:45,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:27:46,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:50,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:52,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:27:52,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. 
Duration: 0.82225 2023-10-04 13:27:54,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:27:55,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:28:03,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:28:05,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1670466.6666666667, ans=0.125 2023-10-04 13:28:06,356 INFO [train.py:1046] (1/4) Epoch 48, batch 900, loss[loss=0.1496, simple_loss=0.2363, pruned_loss=0.03142, over 24467.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2339, pruned_loss=0.03624, over 4667558.33 frames. ], batch size: 69, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:28:06,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:28:06,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 13:28:06,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:28:06,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:28:09,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. 
Duration: 0.77 2023-10-04 13:28:10,516 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 2.013e+02 2.239e+02 2.502e+02 3.512e+02, threshold=4.478e+02, percent-clipped=0.0 2023-10-04 13:28:14,301 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=15.0 2023-10-04 13:28:14,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:28:15,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1670466.6666666667, ans=0.125 2023-10-04 13:28:17,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:28:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 13:28:22,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:28:22,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 13:28:22,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 13:28:22,671 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:28:23,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 13:28:23,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:28:23,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:28:25,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:28:25,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1670533.3333333333, ans=0.125 2023-10-04 13:28:28,614 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:28:31,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1670533.3333333333, ans=0.125 2023-10-04 13:28:37,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:28:37,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:28:37,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:28:40,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:28:43,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. 
Duration: 0.47 2023-10-04 13:28:44,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1670600.0, ans=0.125 2023-10-04 13:28:45,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:28:46,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1670600.0, ans=0.125 2023-10-04 13:28:47,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1670600.0, ans=0.125 2023-10-04 13:28:48,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:28:48,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:28:48,766 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 13:28:50,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 13:28:52,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1670666.6666666667, ans=0.0 2023-10-04 13:28:58,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:28:58,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:28:59,772 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.27 vs. 
limit=6.0 2023-10-04 13:29:00,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:29:06,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:07,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 13:29:07,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:29:10,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 13:29:11,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:29:11,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:13,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:29:13,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:17,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 13:29:17,192 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. 
Duration: 0.91 2023-10-04 13:29:18,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:29:20,562 INFO [train.py:1046] (1/4) Epoch 48, batch 950, loss[loss=0.1544, simple_loss=0.223, pruned_loss=0.04292, over 23800.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2348, pruned_loss=0.03649, over 4667487.16 frames. ], batch size: 212, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:29:20,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 13:29:22,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:24,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 13:29:31,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:29:34,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:34,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:35,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:29:38,446 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-10-04 13:29:38,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1670866.6666666667, ans=0.125 2023-10-04 13:29:41,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:41,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:29:42,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:29:42,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:29:42,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 13:29:45,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:29:45,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:46,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 13:29:46,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:51,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:51,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 13:29:51,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:52,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 13:29:54,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:29:55,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:29:57,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:29:57,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1670933.3333333333, ans=0.0 2023-10-04 13:30:05,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:30:05,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:30:07,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 13:30:09,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 13:30:09,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:30:09,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:30:09,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:09,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:30:13,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 13:30:14,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:30:17,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:17,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:17,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 13:30:17,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:30:17,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:30:17,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 13:30:21,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1671066.6666666667, ans=0.125 2023-10-04 13:30:22,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 13:30:24,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1671066.6666666667, ans=0.125 2023-10-04 13:30:25,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:30:30,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:30:30,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 13:30:31,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.88 vs. limit=6.0 2023-10-04 13:30:32,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 13:30:34,972 INFO [train.py:1046] (1/4) Epoch 48, batch 1000, loss[loss=0.1365, simple_loss=0.2212, pruned_loss=0.02592, over 24593.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2342, pruned_loss=0.03627, over 4690637.79 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:30:35,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:37,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 13:30:39,097 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.794e+02 2.109e+02 2.410e+02 2.800e+02 4.729e+02, threshold=4.820e+02, percent-clipped=1.0 2023-10-04 13:30:39,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:30:40,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1671133.3333333333, ans=0.1 2023-10-04 13:30:43,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:30:46,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 13:30:46,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 13:30:50,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:30:50,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:30:52,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:56,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 13:30:59,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 13:30:59,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 13:31:02,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:03,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-10-04 13:31:06,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 13:31:06,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 13:31:07,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:07,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:08,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.43 vs. limit=6.0 2023-10-04 13:31:14,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:31:14,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:31:15,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:15,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:15,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 13:31:15,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 13:31:16,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1671266.6666666667, ans=0.035 2023-10-04 13:31:19,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:31:19,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:31:20,456 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 13:31:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 13:31:24,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 13:31:27,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 13:31:27,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:31:33,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:33,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:31:34,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:35,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 13:31:37,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 13:31:39,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:31:39,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 13:31:39,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 13:31:40,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1671400.0, ans=0.125 2023-10-04 13:31:40,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.23 vs. limit=22.5 2023-10-04 13:31:41,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:31:41,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:44,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:31:44,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1671400.0, ans=0.0 2023-10-04 13:31:46,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:31:48,219 INFO [train.py:1046] (1/4) Epoch 48, batch 1050, loss[loss=0.1322, simple_loss=0.2101, pruned_loss=0.02712, over 24339.00 frames. 
], tot_loss[loss=0.1526, simple_loss=0.2331, pruned_loss=0.03603, over 4698151.37 frames. ], batch size: 56, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:31:48,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:51,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:31:53,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:31:53,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1671466.6666666667, ans=0.0 2023-10-04 13:31:55,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:31:57,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:58,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:32:01,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:32:03,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:32:05,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:32:06,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 13:32:07,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:32:07,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:32:08,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 13:32:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:32:09,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 13:32:10,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:32:10,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 13:32:10,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:32:16,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:32:17,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:32:17,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:32:21,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. 
Duration: 0.81 2023-10-04 13:32:21,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 13:32:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:32:23,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 13:32:27,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 13:32:28,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1671600.0, ans=0.1 2023-10-04 13:32:29,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:32,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:32:34,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 13:32:34,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:32:34,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 13:32:37,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1671666.6666666667, ans=10.0 2023-10-04 13:32:37,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1671666.6666666667, ans=0.07 2023-10-04 13:32:38,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:32:40,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 13:32:41,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 13:32:41,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 13:32:41,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:32:42,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:32:43,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 13:32:47,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:32:49,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:32:49,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 13:32:49,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:32:49,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:50,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1671733.3333333333, ans=0.1 2023-10-04 13:32:53,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:53,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 13:32:55,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:32:55,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 13:32:55,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 13:32:57,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:32:58,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1671733.3333333333, ans=0.1 2023-10-04 13:33:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:33:02,719 INFO [train.py:1046] (1/4) Epoch 48, batch 1100, loss[loss=0.1499, simple_loss=0.2396, pruned_loss=0.03013, over 23995.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2322, pruned_loss=0.03612, over 4689708.06 frames. ], batch size: 86, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:33:05,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:33:07,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1671800.0, ans=0.0 2023-10-04 13:33:07,950 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.096e+02 2.413e+02 2.876e+02 5.398e+02, threshold=4.826e+02, percent-clipped=2.0 2023-10-04 13:33:10,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:33:12,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:33:13,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:33:14,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 13:33:16,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:33:19,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 13:33:19,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1671866.6666666667, ans=0.125 2023-10-04 13:33:20,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:33:24,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:33:24,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 13:33:25,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:33:27,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:33:27,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:33:28,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:33:30,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1671933.3333333333, ans=0.0 2023-10-04 13:33:32,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:33:35,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:33:38,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 13:33:39,437 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 13:33:39,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:42,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:43,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.17 vs. limit=15.0 2023-10-04 13:33:43,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:33:43,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:33:44,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 13:33:45,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.09 vs. limit=22.5 2023-10-04 13:33:46,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:33:46,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:33:46,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 13:33:47,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:47,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 13:33:50,000 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-04 13:33:55,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:33:55,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 13:33:56,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:34:01,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:34:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 13:34:05,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:34:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:11,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:34:11,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 13:34:12,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 13:34:12,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:34:12,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:34:14,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 13:34:14,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:34:14,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 13:34:16,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.79 vs. limit=22.5 2023-10-04 13:34:16,810 INFO [train.py:1046] (1/4) Epoch 48, batch 1150, loss[loss=0.1349, simple_loss=0.217, pruned_loss=0.02636, over 18545.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2317, pruned_loss=0.0355, over 4695601.97 frames. ], batch size: 40, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:34:16,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:34:16,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:34:18,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 13:34:18,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1672133.3333333333, ans=0.04949747468305833 2023-10-04 13:34:21,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:24,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:34:25,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:34:26,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:34:26,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 13:34:26,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:34:29,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 13:34:31,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:31,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 13:34:31,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1672200.0, ans=0.125 2023-10-04 13:34:35,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1672200.0, ans=0.07 2023-10-04 13:34:36,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1672200.0, ans=0.0 2023-10-04 13:34:37,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 13:34:39,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:43,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:43,499 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:34:44,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 13:34:44,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:34:44,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:34:45,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1672266.6666666667, ans=0.1 2023-10-04 13:34:48,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-10-04 13:34:49,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:51,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:35:01,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:35:06,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:35:06,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 13:35:08,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:08,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:13,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1672333.3333333333, ans=0.0 2023-10-04 13:35:15,818 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 13:35:17,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:17,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1672400.0, ans=0.0 2023-10-04 13:35:24,313 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-10-04 13:35:24,747 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:35:27,714 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:35:27,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:35:27,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:35:29,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:35:30,494 INFO [train.py:1046] (1/4) Epoch 48, batch 1200, loss[loss=0.1565, simple_loss=0.2455, pruned_loss=0.0338, over 24379.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03624, over 4711779.30 frames. ], batch size: 77, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:35:31,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:35:36,968 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 1.971e+02 2.130e+02 2.381e+02 3.707e+02, threshold=4.260e+02, percent-clipped=0.0 2023-10-04 13:35:37,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:35:37,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:35:38,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 13:35:38,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:35:38,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:35:41,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:35:43,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:35:44,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:35:44,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:47,310 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 13:35:51,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 13:35:54,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:35:55,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:35:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:36:01,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 13:36:01,555 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 13:36:01,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:36:08,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:36:08,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:36:08,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 13:36:10,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:36:13,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 13:36:16,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 13:36:16,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:36:17,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:36:17,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:36:19,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. 
Duration: 0.536375 2023-10-04 13:36:20,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:36:20,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:36:20,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:36:21,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 13:36:21,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:36:23,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:36:23,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:36:23,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1672666.6666666667, ans=0.125 2023-10-04 13:36:24,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:36:24,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:36:28,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. 
Duration: 0.52225 2023-10-04 13:36:32,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:36:35,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 13:36:39,987 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 13:36:41,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:36:42,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:36:44,131 INFO [train.py:1046] (1/4) Epoch 48, batch 1250, loss[loss=0.1463, simple_loss=0.2233, pruned_loss=0.03466, over 24618.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2345, pruned_loss=0.03689, over 4695216.18 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:36:44,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:36:46,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:36:47,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 13:36:47,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1672800.0, ans=0.0 2023-10-04 13:36:50,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 13:36:51,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:36:51,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 13:36:53,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:36:56,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:36:59,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:37:00,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:37:01,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:37:01,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:37:03,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:37:06,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 13:37:07,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:37:07,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 13:37:09,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:37:09,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1672866.6666666667, ans=0.125 2023-10-04 13:37:10,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:12,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1672933.3333333333, ans=0.2 2023-10-04 13:37:13,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:15,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:37:15,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1672933.3333333333, ans=0.125 2023-10-04 13:37:15,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1672933.3333333333, ans=0.1 2023-10-04 13:37:21,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 13:37:21,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:37:22,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:37:23,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 13:37:25,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:37:25,689 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 13:37:25,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:25,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:28,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:32,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:32,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:37:34,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 13:37:34,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 13:37:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 13:37:38,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:37:40,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 13:37:40,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:44,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 13:37:44,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:37:45,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 13:37:45,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:37:47,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:37:47,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 13:37:47,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:37:50,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 13:37:52,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:37:53,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 13:37:54,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:37:56,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:37:57,681 INFO [train.py:1046] (1/4) Epoch 48, batch 1300, loss[loss=0.1578, simple_loss=0.2432, pruned_loss=0.03617, over 24018.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03681, over 4707501.16 frames. ], batch size: 80, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:38:00,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:38:01,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 13:38:03,131 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.065e+02 2.223e+02 2.420e+02 4.502e+02, threshold=4.446e+02, percent-clipped=1.0 2023-10-04 13:38:04,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:38:06,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:38:07,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:38:10,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:38:12,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 13:38:12,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 13:38:13,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1673200.0, ans=0.125 2023-10-04 13:38:16,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:38:17,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:38:19,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 13:38:22,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:38:25,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:38:26,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:38:28,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:38:28,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:38:29,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:38:29,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. 
Duration: 0.52225 2023-10-04 13:38:31,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 13:38:35,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:38:35,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:38:36,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 13:38:36,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1673266.6666666667, ans=0.125 2023-10-04 13:38:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:38:39,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:38:41,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:38:41,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 13:38:41,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1673333.3333333333, ans=0.0 2023-10-04 13:38:43,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:38:43,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-10-04 13:38:44,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:38:44,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1673333.3333333333, ans=0.125 2023-10-04 13:38:49,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:38:49,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:38:52,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 13:38:52,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 13:38:54,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1673333.3333333333, ans=0.0 2023-10-04 13:38:55,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 13:38:59,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:38:59,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1673400.0, ans=0.125 2023-10-04 13:39:01,251 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 13:39:02,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:39:10,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 13:39:11,724 INFO [train.py:1046] (1/4) Epoch 48, batch 1350, loss[loss=0.1492, simple_loss=0.223, pruned_loss=0.03767, over 23832.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2344, pruned_loss=0.03643, over 4702778.32 frames. ], batch size: 195, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:39:11,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:39:14,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:16,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1673466.6666666667, ans=0.125 2023-10-04 13:39:17,626 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:39:19,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:39:19,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:39:20,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:39:20,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:39:25,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 13:39:26,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 13:39:28,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:39:29,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:39:30,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 13:39:32,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:39:33,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:39:34,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 13:39:36,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 13:39:37,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 13:39:40,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:40,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-04 13:39:47,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1673600.0, ans=0.0 2023-10-04 13:39:51,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:40:00,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:40:00,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:00,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 13:40:04,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:05,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 13:40:05,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:40:06,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:40:08,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:40:08,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1673666.6666666667, ans=0.125 2023-10-04 13:40:09,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-10-04 13:40:12,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:40:17,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 13:40:18,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 13:40:24,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1673733.3333333333, ans=0.125 2023-10-04 13:40:25,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 13:40:25,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:26,574 INFO [train.py:1046] (1/4) Epoch 48, batch 1400, loss[loss=0.1264, simple_loss=0.1822, pruned_loss=0.03531, over 19103.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2341, pruned_loss=0.03608, over 4709965.28 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:40:29,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:40:30,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:40:32,028 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.732e+02 2.089e+02 2.315e+02 2.656e+02 4.133e+02, threshold=4.629e+02, percent-clipped=0.0 2023-10-04 13:40:32,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1673800.0, ans=0.0 2023-10-04 13:40:34,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 13:40:36,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 13:40:43,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1673866.6666666667, ans=0.2 2023-10-04 13:40:45,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:40:47,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:40:49,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:40:49,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:40:53,693 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:40:55,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 13:41:03,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:41:04,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:09,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 13:41:09,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:41:09,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1674000.0, ans=0.1 2023-10-04 13:41:10,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:41:10,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:41:12,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:41:14,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:41:14,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:41:15,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:41:16,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 13:41:16,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 13:41:18,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1674000.0, ans=0.0 2023-10-04 13:41:20,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:21,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=15.0 2023-10-04 13:41:24,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:41:28,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1674066.6666666667, ans=0.125 2023-10-04 13:41:31,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 13:41:33,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:41:33,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:41:34,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 13:41:34,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:41:37,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:41:39,732 INFO [train.py:1046] (1/4) Epoch 48, batch 1450, loss[loss=0.1515, simple_loss=0.2086, pruned_loss=0.04719, over 19368.00 frames. 
], tot_loss[loss=0.1522, simple_loss=0.2331, pruned_loss=0.03566, over 4700039.10 frames. ], batch size: 389, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:41:41,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:41:43,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:41:43,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 13:41:46,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:41:46,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:41:49,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:41:49,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 13:41:50,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:41:52,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 13:41:53,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:41:55,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:41:55,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 13:41:55,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:41:57,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:41:57,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 13:41:57,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:41:58,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:42:00,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:01,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1674200.0, ans=0.2 2023-10-04 13:42:02,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:42:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:42:05,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 13:42:08,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:42:08,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:09,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:42:09,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:42:11,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:11,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:13,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1674266.6666666667, ans=0.0 2023-10-04 13:42:15,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 13:42:17,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:42:21,971 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 13:42:23,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:42:25,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 13:42:27,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:28,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 13:42:32,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:33,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 13:42:35,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 13:42:35,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:38,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:42:38,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:42:41,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 13:42:42,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 13:42:44,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. 
Duration: 0.55 2023-10-04 13:42:44,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=1674400.0, ans=10.0 2023-10-04 13:42:45,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:45,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:42:54,641 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-10-04 13:42:55,125 INFO [train.py:1046] (1/4) Epoch 48, batch 1500, loss[loss=0.156, simple_loss=0.2458, pruned_loss=0.03309, over 24653.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2341, pruned_loss=0.03565, over 4719475.00 frames. ], batch size: 73, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:42:58,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 13:42:58,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:42:58,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:42:59,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:43:00,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1674466.6666666667, ans=0.125 2023-10-04 13:43:01,095 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.007e+02 2.219e+02 2.655e+02 4.541e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 13:43:01,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:43:01,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 13:43:02,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1674466.6666666667, ans=6.0 2023-10-04 13:43:02,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:43:02,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:43:02,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:43:04,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:43:05,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:43:05,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:43:11,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:43:11,236 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 13:43:13,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:43:13,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:43:14,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:43:17,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 13:43:21,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 13:43:23,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:43:25,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 13:43:25,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.53 vs. limit=15.0 2023-10-04 13:43:26,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:43:28,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 13:43:28,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:43:30,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:43:30,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 13:43:31,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:43:31,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:43:32,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 13:43:32,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:43:38,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:43:38,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 13:43:43,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:43:44,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:43:47,543 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. 
Duration: 0.768 2023-10-04 13:43:49,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:43:49,478 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 13:43:50,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:43:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:43:52,707 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 13:43:54,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:43:55,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 13:43:58,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:02,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:44:02,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:02,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:44:02,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:44:04,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:44:06,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 13:44:06,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 13:44:07,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:44:08,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 13:44:08,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 13:44:09,771 INFO [train.py:1046] (1/4) Epoch 48, batch 1550, loss[loss=0.1486, simple_loss=0.2214, pruned_loss=0.03786, over 22905.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2347, pruned_loss=0.03609, over 4723895.86 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:44:11,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:44:12,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:12,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:44:12,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 13:44:13,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1674800.0, ans=0.125 2023-10-04 13:44:14,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:15,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:18,788 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 13:44:20,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:20,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:44:20,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:44:23,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:44:23,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 13:44:24,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:44:25,595 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.63 vs. limit=12.0 2023-10-04 13:44:26,440 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. 
Duration: 0.66 2023-10-04 13:44:29,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 13:44:29,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 13:44:29,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:29,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:44:33,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:44:33,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1674866.6666666667, ans=0.2 2023-10-04 13:44:35,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 13:44:35,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 13:44:35,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.02 vs. limit=15.0 2023-10-04 13:44:43,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:44:46,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1674933.3333333333, ans=0.125 2023-10-04 13:44:47,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 13:44:47,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:44:47,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:44:48,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 13:44:52,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1675000.0, ans=0.125 2023-10-04 13:44:55,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:44:57,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:57,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1675000.0, ans=0.125 2023-10-04 13:45:00,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:45:01,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:45:03,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:45:03,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 13:45:03,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 13:45:04,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:45:06,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:07,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 13:45:07,369 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 13:45:09,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:13,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 13:45:18,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:45:20,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:20,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 13:45:22,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:45:22,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:45:22,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 13:45:22,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:45:22,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1675133.3333333333, ans=0.125 2023-10-04 13:45:23,604 INFO [train.py:1046] (1/4) Epoch 48, batch 1600, loss[loss=0.1423, simple_loss=0.2262, pruned_loss=0.02921, over 23266.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2353, pruned_loss=0.03641, over 4723300.02 frames. ], batch size: 105, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:45:23,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:45:27,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:27,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 13:45:28,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 13:45:30,110 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.052e+02 2.355e+02 2.599e+02 3.468e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 13:45:30,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 13:45:32,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:45:34,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. 
Duration: 0.99 2023-10-04 13:45:34,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1675133.3333333333, ans=0.125 2023-10-04 13:45:35,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:45:37,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:45:41,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:45:46,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 13:45:48,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:45:48,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 13:45:48,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:50,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 13:45:56,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-04 13:46:01,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1675266.6666666667, ans=0.0 2023-10-04 13:46:02,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1675266.6666666667, ans=10.0 2023-10-04 13:46:05,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:46:05,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 13:46:06,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:46:06,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:46:06,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:46:09,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 13:46:13,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 13:46:16,305 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:46:17,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:17,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:46:17,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:46:20,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:46:20,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:46:21,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:46:24,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1675400.0, ans=0.0 2023-10-04 13:46:29,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:29,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:46:31,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 13:46:31,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:46:33,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 13:46:36,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:46:36,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1675466.6666666667, ans=0.1 2023-10-04 13:46:37,342 INFO [train.py:1046] (1/4) Epoch 48, batch 1650, loss[loss=0.1302, simple_loss=0.2085, pruned_loss=0.02599, over 24352.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2359, pruned_loss=0.03622, over 4732638.96 frames. ], batch size: 56, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:46:38,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:46:38,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:46:40,115 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 13:46:40,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 13:46:40,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 13:46:40,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1675466.6666666667, ans=0.125 2023-10-04 13:46:41,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 13:46:44,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:44,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 13:46:46,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:46:46,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:46:47,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:46:49,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 13:46:49,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1675466.6666666667, ans=0.0 2023-10-04 13:46:51,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:46:51,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:46:51,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:46:51,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:46:51,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 13:46:51,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-10-04 13:46:57,197 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.32 vs. limit=6.0 2023-10-04 13:46:58,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:46:59,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:47:06,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1675600.0, ans=0.0 2023-10-04 13:47:08,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 13:47:08,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:11,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 13:47:16,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:19,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:47:19,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:47:20,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:23,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 13:47:23,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:24,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:47:24,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:26,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:47:26,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:47:26,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:47:27,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:47:32,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:47:32,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 13:47:35,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:47:35,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 13:47:36,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. 
Duration: 0.69 2023-10-04 13:47:36,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 13:47:36,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:47:38,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:47:38,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:39,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:39,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 13:47:41,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1675733.3333333333, ans=0.2 2023-10-04 13:47:43,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:45,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:47:45,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:48,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 13:47:51,552 INFO [train.py:1046] (1/4) Epoch 48, batch 1700, loss[loss=0.1552, simple_loss=0.2457, pruned_loss=0.03237, over 24442.00 frames. 
], tot_loss[loss=0.1537, simple_loss=0.2348, pruned_loss=0.03623, over 4740181.14 frames. ], batch size: 69, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:47:52,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:52,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:47:52,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 13:47:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:47:54,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:47:54,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:47:57,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:47:57,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:47:57,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1675800.0, ans=0.125 2023-10-04 13:47:58,925 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.221e+02 2.607e+02 3.070e+02 5.494e+02, threshold=5.214e+02, percent-clipped=5.0 2023-10-04 13:47:58,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. 
Duration: 0.92 2023-10-04 13:48:02,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:48:09,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:48:12,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:48:16,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:48:18,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:48:18,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:48:18,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:48:21,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 13:48:22,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:48:23,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:24,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:48:25,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 13:48:28,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 13:48:28,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 13:48:30,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:31,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 13:48:31,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:48:39,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:40,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:48:41,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:48:43,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:48:43,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 13:48:43,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:48:46,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:48:46,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 13:48:46,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:48:46,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:48:46,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:46,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:48:48,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:48:48,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:48:50,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:48:50,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:48:50,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:55,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:48:55,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. 
Duration: 0.86 2023-10-04 13:48:58,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:59,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:49:01,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 13:49:05,491 INFO [train.py:1046] (1/4) Epoch 48, batch 1750, loss[loss=0.136, simple_loss=0.2169, pruned_loss=0.02753, over 21174.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.233, pruned_loss=0.03584, over 4728452.66 frames. ], batch size: 46, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:49:07,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:07,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1676133.3333333333, ans=0.125 2023-10-04 13:49:10,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:49:10,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:49:11,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 13:49:11,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:49:12,560 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.08 vs. 
limit=12.0 2023-10-04 13:49:13,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:49:13,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:17,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 13:49:20,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:49:21,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 13:49:21,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:49:23,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:49:23,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1676200.0, ans=0.125 2023-10-04 13:49:26,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 13:49:26,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1676200.0, ans=0.0 2023-10-04 13:49:28,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 13:49:29,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:49:31,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 13:49:38,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:49:39,981 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:49:42,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:49:42,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:49:45,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:45,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:49:48,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:49:49,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:51,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:49:51,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:49:51,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-10-04 13:49:51,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1676333.3333333333, ans=0.2 2023-10-04 13:49:51,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1676333.3333333333, ans=0.125 2023-10-04 13:49:54,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:49:59,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 13:49:59,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:50:00,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:00,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:50:05,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:50:07,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:50:08,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:50:09,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:50:12,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:50:14,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:50:15,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:50:15,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1676400.0, ans=0.1 2023-10-04 13:50:16,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 13:50:16,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:50:18,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:50:18,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:20,041 INFO [train.py:1046] (1/4) Epoch 48, batch 1800, loss[loss=0.1617, simple_loss=0.2357, pruned_loss=0.04379, over 23652.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2323, pruned_loss=0.03547, over 4723670.10 frames. ], batch size: 256, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:50:20,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:50:20,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:50:20,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 13:50:22,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:50:24,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:50:25,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:50:27,377 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.032e+02 2.223e+02 2.665e+02 4.084e+02, threshold=4.447e+02, percent-clipped=0.0 2023-10-04 13:50:27,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:50:30,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 13:50:32,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:50:36,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:50:36,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1676533.3333333333, ans=0.0 2023-10-04 13:50:37,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:38,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:50:41,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:50:42,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:50:42,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 13:50:44,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:50:46,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.15 vs. limit=15.0 2023-10-04 13:50:46,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:50:51,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 13:50:53,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 13:50:53,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 13:50:54,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:50:55,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:55,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:50:57,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:51:03,263 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 13:51:04,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:51:06,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:08,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 13:51:08,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 13:51:09,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:51:10,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:51:10,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:51:15,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 13:51:20,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:51:21,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. 
Duration: 0.5 2023-10-04 13:51:23,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:51:23,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:51:23,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:51:23,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 13:51:26,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:51:27,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1676733.3333333333, ans=0.125 2023-10-04 13:51:28,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:51:28,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1676733.3333333333, ans=0.0 2023-10-04 13:51:30,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 13:51:30,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:51:33,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:51:33,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 13:51:33,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:33,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1676800.0, ans=10.0 2023-10-04 13:51:34,532 INFO [train.py:1046] (1/4) Epoch 48, batch 1850, loss[loss=0.1639, simple_loss=0.2488, pruned_loss=0.03953, over 23994.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2327, pruned_loss=0.03575, over 4717350.62 frames. ], batch size: 80, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:51:34,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:34,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:51:37,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:51:37,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:51:39,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:51:40,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:51:42,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1676800.0, ans=0.125 2023-10-04 13:51:45,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 13:51:45,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 13:51:49,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 13:51:51,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 13:51:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:51:56,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 13:51:56,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 13:52:00,439 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.94 vs. limit=15.0 2023-10-04 13:52:07,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:52:08,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 13:52:09,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:52:11,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:52:14,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-10-04 13:52:16,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:16,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:52:17,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:52:19,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:52:20,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:52:23,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:52:24,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:24,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 13:52:24,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:26,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:52:28,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:52:31,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-10-04 13:52:32,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:52:36,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:52:36,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:52:36,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 13:52:36,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 13:52:38,898 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 13:52:40,301 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 13:52:42,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:52:42,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:52:42,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:52:42,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:42,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. 
Duration: 0.896 2023-10-04 13:52:43,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:52:43,588 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:43,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:52:45,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:52:46,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:52:46,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 13:52:48,909 INFO [train.py:1046] (1/4) Epoch 48, batch 1900, loss[loss=0.1546, simple_loss=0.2279, pruned_loss=0.04068, over 23716.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2332, pruned_loss=0.03608, over 4705548.66 frames. ], batch size: 164, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:52:49,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:49,014 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 13:52:49,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:52:50,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 13:52:55,965 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.090e+02 2.355e+02 2.808e+02 4.439e+02, threshold=4.709e+02, percent-clipped=0.0 2023-10-04 13:52:56,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:56,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1677133.3333333333, ans=0.125 2023-10-04 13:52:59,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:52:59,267 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 13:52:59,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 13:53:02,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:53:02,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:53:02,670 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 13:53:03,989 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 13:53:07,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 13:53:08,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 13:53:12,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1677200.0, ans=0.07 2023-10-04 13:53:13,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 13:53:16,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 13:53:19,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1677266.6666666667, ans=0.1 2023-10-04 13:53:24,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 13:53:27,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 13:53:27,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:53:27,400 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 13:53:27,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 13:53:29,141 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 13:53:29,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 13:53:29,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 13:53:33,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 13:53:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:53:38,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:53:38,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 13:53:39,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:53:44,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 13:53:46,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:53:50,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:53:50,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:53:50,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:53:51,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:53:53,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 13:53:53,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 13:53:53,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:53:57,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:53:57,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:53:58,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:53:58,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:54:00,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:54:01,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:54:03,589 INFO [train.py:1046] (1/4) Epoch 48, batch 1950, loss[loss=0.1532, simple_loss=0.2454, pruned_loss=0.03055, over 24570.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2333, pruned_loss=0.03623, over 4714517.88 frames. ], batch size: 71, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:54:05,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:54:06,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 13:54:08,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:08,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:54:09,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 13:54:11,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:54:12,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:14,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:15,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:54:17,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:17,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:18,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:54:20,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:54:20,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 13:54:21,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:54:21,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:22,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:25,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:54:25,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:25,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:54:25,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 13:54:26,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:54:27,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:54:27,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:31,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:35,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:54:38,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:54:41,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:54:41,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:54:43,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 13:54:43,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:54:47,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:54:48,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:54:48,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:54:56,893 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:58,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:01,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:04,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 13:55:04,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1677733.3333333333, ans=0.1 2023-10-04 13:55:07,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:55:07,737 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:55:09,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 13:55:09,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:55:10,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:55:12,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 13:55:13,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1677733.3333333333, ans=0.125 2023-10-04 13:55:16,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:55:18,192 INFO [train.py:1046] (1/4) Epoch 48, batch 2000, loss[loss=0.1401, simple_loss=0.22, pruned_loss=0.03006, over 24327.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2346, pruned_loss=0.03654, over 4712126.81 frames. 
], batch size: 56, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:55:18,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=1677800.0, ans=0.05 2023-10-04 13:55:19,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:55:21,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:55:22,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:55:22,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:55:25,199 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.085e+02 2.271e+02 2.591e+02 3.651e+02, threshold=4.543e+02, percent-clipped=0.0 2023-10-04 13:55:25,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:29,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 13:55:29,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:55:32,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:55:35,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. 
Duration: 0.73 2023-10-04 13:55:36,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:55:36,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:55:38,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:55:40,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 13:55:41,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1677866.6666666667, ans=0.025 2023-10-04 13:55:44,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 13:55:44,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:55:46,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 13:55:48,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 13:55:49,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:55:51,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:55:51,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:51,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:55:53,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:55:53,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 13:55:56,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 13:55:56,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:55:56,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:00,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:00,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:56:00,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 13:56:02,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:56:03,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:56:05,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:05,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:56:05,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:06,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:09,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:56:10,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 13:56:16,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:56:16,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:20,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:20,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 13:56:24,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:26,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:56:26,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:27,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:56:27,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:56:28,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:30,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:31,612 INFO [train.py:1046] (1/4) Epoch 48, batch 2050, loss[loss=0.1633, simple_loss=0.2563, pruned_loss=0.03509, over 24552.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.234, pruned_loss=0.0365, over 4697280.70 frames. ], batch size: 71, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:56:33,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:56:34,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:40,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 13:56:42,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:56:43,636 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:43,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:56:45,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 13:56:45,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:56:47,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1678200.0, ans=0.2 2023-10-04 13:56:47,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1678200.0, ans=0.0 2023-10-04 13:56:48,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:48,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:56:53,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.50 vs. limit=15.0 2023-10-04 13:56:57,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:56:57,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 13:56:59,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 13:57:01,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:57:01,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1678266.6666666667, ans=0.2 2023-10-04 13:57:02,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 13:57:02,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:57:05,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:57:08,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:10,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:57:10,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:57:11,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:57:13,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:57:13,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 13:57:18,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:20,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:57:21,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:57:21,873 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:57:23,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:57:27,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:57:33,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:57:34,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 13:57:38,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:57:40,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:57:42,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 13:57:42,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1678400.0, ans=0.125 2023-10-04 13:57:43,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 13:57:45,299 INFO [train.py:1046] (1/4) Epoch 48, batch 2100, loss[loss=0.1515, simple_loss=0.2159, pruned_loss=0.04356, over 22850.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2328, pruned_loss=0.03616, over 4698902.01 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:57:46,961 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 13:57:46,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:57:48,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:49,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:57:49,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:57:49,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 13:57:51,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 13:57:52,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 13:57:52,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1678466.6666666667, ans=0.0 2023-10-04 13:57:55,617 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.079e+02 2.318e+02 2.598e+02 4.333e+02, threshold=4.637e+02, percent-clipped=0.0 2023-10-04 13:57:55,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:57:55,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:57:58,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:57:58,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1678533.3333333333, ans=0.125 2023-10-04 13:57:59,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:57:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 13:58:01,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:58:01,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 13:58:01,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. 
Duration: 0.88 2023-10-04 13:58:02,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:02,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:58:02,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 13:58:03,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:58:07,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1678533.3333333333, ans=0.125 2023-10-04 13:58:08,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 13:58:08,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:58:12,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:58:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:58:14,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1678600.0, ans=0.125 2023-10-04 13:58:17,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:58:17,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. 
Duration: 0.69 2023-10-04 13:58:17,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:17,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:58:20,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 13:58:20,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:20,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 13:58:21,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 13:58:21,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 13:58:21,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1678600.0, ans=0.125 2023-10-04 13:58:23,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:58:26,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:58:26,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1678600.0, ans=0.0 2023-10-04 13:58:27,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 13:58:29,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:58:30,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:32,156 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:32,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 13:58:32,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:32,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:33,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:33,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 13:58:36,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 13:58:36,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 13:58:39,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:58:41,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 13:58:42,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 13:58:47,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:49,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:58:49,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:58:51,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:58:51,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 13:58:51,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:58:52,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:52,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:58:52,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:58:52,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 13:58:52,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1678733.3333333333, ans=0.1 2023-10-04 13:58:55,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 13:58:56,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 13:58:56,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:58:58,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1678800.0, ans=0.125 2023-10-04 13:58:59,264 INFO [train.py:1046] (1/4) Epoch 48, batch 2150, loss[loss=0.1323, simple_loss=0.219, pruned_loss=0.02282, over 24627.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2325, pruned_loss=0.03621, over 4700585.91 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:59:00,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:00,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:59:00,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:59:00,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 13:59:02,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1678800.0, ans=0.125 2023-10-04 13:59:04,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:59:07,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:09,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:10,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:59:10,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:12,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:59:16,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:16,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:59:16,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:59:19,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:21,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. 
Duration: 0.99 2023-10-04 13:59:25,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:26,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:59:27,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:27,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:27,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:28,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:59:29,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:29,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:59:29,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:59:29,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1678933.3333333333, ans=0.0 2023-10-04 13:59:30,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. 
Duration: 0.88 2023-10-04 13:59:32,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1678933.3333333333, ans=0.025 2023-10-04 13:59:33,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:59:34,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:34,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:36,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:59:36,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:59:38,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:38,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:59:42,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:42,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 13:59:42,202 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 13:59:45,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:45,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:47,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:47,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:59:48,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:48,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:48,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 13:59:51,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 13:59:51,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:59:51,922 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 13:59:51,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:51,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 13:59:54,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 13:59:54,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:59:54,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 13:59:54,577 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 13:59:54,578 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 13:59:55,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 13:59:58,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:58,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:58,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:59:59,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:01,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:00:02,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:00:02,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:11,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:00:12,994 INFO [train.py:1046] (1/4) Epoch 48, batch 2200, loss[loss=0.1625, simple_loss=0.2469, pruned_loss=0.03909, over 24375.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2329, pruned_loss=0.0364, over 4696844.08 frames. ], batch size: 77, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:00:13,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 14:00:16,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:00:16,628 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:00:22,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:23,614 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.040e+02 2.222e+02 2.604e+02 4.042e+02, threshold=4.443e+02, percent-clipped=0.0 2023-10-04 14:00:23,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:00:23,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:00:25,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. 
Duration: 0.82225 2023-10-04 14:00:27,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:00:27,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:00:27,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 14:00:33,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 14:00:36,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:00:41,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 14:00:42,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:44,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:00:45,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:00:48,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:00:49,282 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 14:00:52,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 14:00:52,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:52,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 14:00:56,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1679266.6666666667, ans=0.125 2023-10-04 14:00:57,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:00:58,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:01,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:01:02,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:04,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 14:01:04,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:05,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 14:01:06,137 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.17 vs. limit=15.0 2023-10-04 14:01:06,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 14:01:06,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:01:06,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:08,886 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.84 vs. limit=6.0 2023-10-04 14:01:09,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:01:09,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:09,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:10,922 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:12,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:01:12,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:01:13,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1679400.0, ans=0.5 2023-10-04 14:01:15,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-04 14:01:18,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 14:01:18,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:01:21,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:01:23,031 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 14:01:23,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1679400.0, ans=0.125 2023-10-04 14:01:26,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:01:26,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 14:01:27,701 INFO [train.py:1046] (1/4) Epoch 48, batch 2250, loss[loss=0.1637, simple_loss=0.2459, pruned_loss=0.04077, over 23290.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03615, over 4703661.81 frames. ], batch size: 93, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:01:27,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:01:29,104 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 14:01:30,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:01:30,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. 
Duration: 0.42725 2023-10-04 14:01:32,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:01:33,543 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 14:01:36,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:01:37,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:01:44,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:01:46,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:01:48,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:01:48,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:01:50,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:01:53,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 14:01:53,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:53,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 14:01:56,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 14:01:58,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:58,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:01:59,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:02:04,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:02:04,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:02:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:02:06,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1679600.0, ans=0.2 2023-10-04 14:02:06,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 14:02:08,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:02:09,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:02:13,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:02:15,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:02:16,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:02:16,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:02:19,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:02:19,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:02:21,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1679666.6666666667, ans=0.2 2023-10-04 14:02:24,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:02:27,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:02:33,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:02:33,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:02:33,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:02:37,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-04 14:02:39,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:02:39,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 14:02:40,417 INFO [train.py:1046] (1/4) Epoch 48, batch 2300, loss[loss=0.2098, simple_loss=0.276, pruned_loss=0.07185, over 19265.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.03636, over 4711666.63 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:02:40,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:40,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:02:43,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 14:02:44,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:02:44,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:02:44,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1679800.0, ans=0.2 2023-10-04 14:02:44,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1679800.0, ans=0.0 2023-10-04 14:02:50,762 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.018e+02 2.211e+02 2.601e+02 3.731e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-04 14:02:50,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:50,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:02:52,799 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 14:02:54,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:02,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:03:02,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:03:02,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:02,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:02,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-10-04 14:03:03,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:03:06,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:03:07,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:03:09,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:03:11,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:03:13,512 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:03:14,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:03:17,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1679933.3333333333, ans=0.125 2023-10-04 14:03:19,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:03:20,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:23,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:03:24,310 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.99 vs. 
limit=12.0 2023-10-04 14:03:28,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:03:32,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:03:33,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:03:34,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:03:34,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 14:03:37,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:03:39,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:39,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:03:39,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:03:39,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:03:40,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 14:03:40,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 14:03:41,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 14:03:41,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:03:41,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:42,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 14:03:46,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:03:51,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:03:53,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:03:53,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:03:53,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:03:55,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:03:55,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:03:55,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 14:03:56,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1680133.3333333333, ans=0.125 2023-10-04 14:03:57,267 INFO [train.py:1046] (1/4) Epoch 48, batch 2350, loss[loss=0.1494, simple_loss=0.2414, pruned_loss=0.02875, over 24645.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.037, over 4698662.02 frames. ], batch size: 73, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:03:57,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 14:04:02,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1680133.3333333333, ans=0.0 2023-10-04 14:04:04,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:04:04,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 14:04:10,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 14:04:11,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:04:15,908 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:15,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:15,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 14:04:15,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:04:19,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 14:04:22,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:04:26,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 14:04:28,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:04:31,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:04:31,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:04:34,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:04:36,015 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 14:04:36,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:04:37,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:04:38,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:04:38,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:04:41,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:04:44,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 14:04:44,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:04:45,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:45,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:04:47,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 14:04:48,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:04:50,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 14:04:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:04:55,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 14:04:59,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. 
Duration: 0.73 2023-10-04 14:05:00,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:05:00,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:05:00,851 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 14:05:00,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 14:05:02,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1680400.0, ans=0.1 2023-10-04 14:05:04,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 14:05:06,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:05:08,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:05:11,134 INFO [train.py:1046] (1/4) Epoch 48, batch 2400, loss[loss=0.1381, simple_loss=0.1978, pruned_loss=0.03918, over 19504.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2345, pruned_loss=0.03663, over 4701374.32 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:05:12,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:05:13,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 14:05:15,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 14:05:15,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 14:05:21,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.60 vs. limit=6.0 2023-10-04 14:05:22,854 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.080e+02 2.454e+02 2.862e+02 5.375e+02, threshold=4.908e+02, percent-clipped=3.0 2023-10-04 14:05:24,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:05:24,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:05:27,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 14:05:27,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:05:27,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:27,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 14:05:35,101 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:36,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-10-04 14:05:39,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:05:39,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1680600.0, ans=0.0 2023-10-04 14:05:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 14:05:43,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1680600.0, ans=0.125 2023-10-04 14:05:46,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:05:47,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:54,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:05:54,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 14:05:55,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:06:03,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:05,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:06:09,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:06:10,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:06:10,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:06:10,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:06:10,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:10,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:06:10,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:06:13,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:06:13,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:06:14,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 14:06:14,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 14:06:17,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:06:17,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:06:18,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 14:06:18,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 14:06:20,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 14:06:20,255 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 14:06:20,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 14:06:22,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:06:24,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:24,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:06:25,331 INFO [train.py:1046] (1/4) Epoch 48, batch 2450, loss[loss=0.143, simple_loss=0.2355, pruned_loss=0.02523, over 24335.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03623, over 4695555.35 frames. ], batch size: 74, lr: 2.11e-03, grad_scale: 4.0 2023-10-04 14:06:25,428 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 14:06:26,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:06:26,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:06:29,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:06:29,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:06:32,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:33,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:06:35,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 14:06:41,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:06:41,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:06:44,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:06:44,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:06:45,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. 
Duration: 0.77 2023-10-04 14:06:49,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:51,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:06:53,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:06:56,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:06:56,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:06:58,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:06:58,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:59,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 14:07:00,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:07:06,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:07,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1680933.3333333333, ans=0.125 2023-10-04 14:07:08,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:07:08,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:08,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:07:10,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:11,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:07:12,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 14:07:15,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:07:15,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:07:16,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.52 vs. limit=15.0 2023-10-04 14:07:18,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:07:19,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:23,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 14:07:23,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 14:07:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:07:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:07:24,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 14:07:26,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:07:26,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:07:30,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:07:32,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:32,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:07:36,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 14:07:38,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:07:39,863 INFO [train.py:1046] (1/4) Epoch 48, batch 2500, loss[loss=0.1637, simple_loss=0.2477, pruned_loss=0.03988, over 23724.00 frames. 
], tot_loss[loss=0.1516, simple_loss=0.2316, pruned_loss=0.03577, over 4699827.09 frames. ], batch size: 85, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:07:44,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:07:51,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:07:52,952 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.028e+02 2.220e+02 2.568e+02 3.726e+02, threshold=4.440e+02, percent-clipped=0.0 2023-10-04 14:07:53,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:54,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:07:54,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 14:08:00,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:08:01,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:08:01,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:08:03,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:08:03,195 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. 
Duration: 0.84 2023-10-04 14:08:05,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:08:06,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 14:08:06,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:08,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 14:08:08,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:13,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:08:13,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:08:16,620 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:08:16,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 14:08:17,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:08:18,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:08:22,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:22,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1681333.3333333333, ans=0.125 2023-10-04 14:08:25,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:27,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:08:33,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:08:34,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 14:08:34,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:08:36,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:08:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:08:39,985 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:08:41,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 14:08:41,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. 
Duration: 0.768 2023-10-04 14:08:41,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 14:08:42,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:45,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 14:08:47,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 14:08:47,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:08:48,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 14:08:51,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 14:08:54,283 INFO [train.py:1046] (1/4) Epoch 48, batch 2550, loss[loss=0.1574, simple_loss=0.2329, pruned_loss=0.04094, over 22863.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2317, pruned_loss=0.03549, over 4712285.59 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:08:54,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:08:55,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1681466.6666666667, ans=0.125 2023-10-04 14:08:57,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 14:08:58,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:09:00,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:09:00,406 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 14:09:01,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:09:05,867 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 14:09:07,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:09:08,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:10,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:09:10,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 14:09:10,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:09:12,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:09:12,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:09:14,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:09:15,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 14:09:15,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:09:15,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:15,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 14:09:18,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1681533.3333333333, ans=0.125 2023-10-04 14:09:25,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:09:30,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:09:30,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:31,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:09:33,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 14:09:34,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1681600.0, ans=0.125 2023-10-04 14:09:39,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1681666.6666666667, ans=0.125 2023-10-04 14:09:40,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:09:42,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:09:42,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:09:42,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:09:43,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:09:43,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:09:48,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:09:48,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:50,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.17 vs. 
limit=10.0 2023-10-04 14:09:53,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:09:53,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 14:09:53,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:09:53,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:55,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.65 vs. limit=6.0 2023-10-04 14:09:55,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:09:58,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:10:00,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:05,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:10:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:08,548 INFO [train.py:1046] (1/4) Epoch 48, batch 2600, loss[loss=0.1297, simple_loss=0.2094, pruned_loss=0.02496, over 24316.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2325, pruned_loss=0.03531, over 4709147.02 frames. 
], batch size: 56, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:10:08,767 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 14:10:12,836 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 14:10:12,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:10:12,897 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 14:10:14,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 14:10:14,940 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 14:10:17,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:10:17,738 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 14:10:19,129 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 14:10:20,617 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 14:10:21,866 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.037e+02 2.298e+02 2.610e+02 5.474e+02, threshold=4.596e+02, percent-clipped=1.0 2023-10-04 14:10:22,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 14:10:22,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1681866.6666666667, ans=0.125 2023-10-04 14:10:24,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 14:10:25,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 14:10:27,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:10:27,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 14:10:31,273 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 14:10:31,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 14:10:36,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:10:36,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:36,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:10:36,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. 
Duration: 0.71 2023-10-04 14:10:37,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1681933.3333333333, ans=0.0 2023-10-04 14:10:38,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:10:45,729 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 14:10:48,061 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=15.0 2023-10-04 14:10:50,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:50,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:10:51,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 14:10:51,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:10:51,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:10:53,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 14:10:54,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:10:54,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:10:58,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:02,542 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 14:11:02,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:02,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:11:05,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1682000.0, ans=0.0 2023-10-04 14:11:05,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1682000.0, ans=0.125 2023-10-04 14:11:08,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:11:08,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:11:08,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 14:11:09,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:11:10,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:11:11,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 14:11:17,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 14:11:18,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:20,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:11:22,763 INFO [train.py:1046] (1/4) Epoch 48, batch 2650, loss[loss=0.2048, simple_loss=0.272, pruned_loss=0.06883, over 19417.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.234, pruned_loss=0.03584, over 4712102.14 frames. ], batch size: 389, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:11:24,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 14:11:25,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:25,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:11:28,641 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 14:11:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:11:30,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:33,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-04 14:11:33,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:11:36,411 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:37,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 14:11:37,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:11:37,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:11:39,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 14:11:40,527 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 14:11:41,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:11:45,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 14:11:45,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:11:45,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 14:11:49,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:11:49,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:11:49,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:11:49,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:11:55,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 14:11:55,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 14:11:55,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1682266.6666666667, ans=0.1 2023-10-04 14:11:56,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:12:02,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 14:12:02,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:12:04,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:04,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:12:04,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:12:05,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:12:07,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:12:08,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:12:10,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:12:11,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:12:11,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:12:12,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:12,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:12:14,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:16,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:12:16,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:12:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:12:20,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:12:20,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:20,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 14:12:26,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:12:27,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:28,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:30,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:30,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:12:32,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:34,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:12:35,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1682466.6666666667, ans=0.125 2023-10-04 14:12:36,123 INFO [train.py:1046] (1/4) Epoch 48, batch 2700, loss[loss=0.1507, simple_loss=0.2232, pruned_loss=0.03911, over 23715.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2345, pruned_loss=0.03628, over 4712674.11 frames. ], batch size: 232, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:12:36,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 14:12:39,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:12:41,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.78 vs. limit=12.0 2023-10-04 14:12:41,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 14:12:43,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:12:43,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:44,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:45,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1682466.6666666667, ans=0.125 2023-10-04 14:12:46,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 14:12:46,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:47,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:12:47,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:12:47,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 14:12:48,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:12:49,268 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.735e+02 2.027e+02 2.284e+02 2.628e+02 4.660e+02, threshold=4.569e+02, percent-clipped=1.0 2023-10-04 14:12:49,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:12:49,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:12:50,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:53,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 14:12:54,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1682533.3333333333, ans=0.125 2023-10-04 14:12:55,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 14:12:55,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:13:00,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:13:00,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:06,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:13:06,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:13:07,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:13:07,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:13:09,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:09,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1682600.0, ans=0.0 2023-10-04 14:13:12,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:13:12,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:13:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:13:15,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:15,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:13:23,586 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.72 vs. limit=6.0 2023-10-04 14:13:25,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:13:26,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:13:30,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:13:30,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:30,973 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. 
limit=15.0 2023-10-04 14:13:33,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1682666.6666666667, ans=0.2 2023-10-04 14:13:34,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.29 vs. limit=15.0 2023-10-04 14:13:34,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:35,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:35,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:13:37,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:38,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:38,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:13:41,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:13:41,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:41,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 14:13:45,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 14:13:45,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:49,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:13:49,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 14:13:49,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 14:13:50,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1682800.0, ans=0.125 2023-10-04 14:13:51,620 INFO [train.py:1046] (1/4) Epoch 48, batch 2750, loss[loss=0.1536, simple_loss=0.2191, pruned_loss=0.04403, over 23527.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.233, pruned_loss=0.03626, over 4697380.91 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:13:51,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:54,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:13:55,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:57,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:13:57,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:13:57,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:57,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1682800.0, ans=0.2 2023-10-04 14:14:01,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:03,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:14:03,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:14:03,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:03,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 14:14:03,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:14:03,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:14:08,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 14:14:09,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 14:14:10,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:10,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:14:10,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1682866.6666666667, ans=0.125 2023-10-04 14:14:12,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:14:12,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1682866.6666666667, ans=0.125 2023-10-04 14:14:13,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:14:13,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:14:14,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:15,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:19,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:14:19,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 14:14:19,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1682933.3333333333, ans=10.0 2023-10-04 14:14:21,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:14:21,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:22,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:14:28,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:30,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:14:30,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:36,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:36,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:14:36,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 14:14:36,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1683000.0, ans=0.125 2023-10-04 14:14:42,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1683000.0, ans=0.04949747468305833 2023-10-04 14:14:43,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:14:43,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:14:43,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 14:14:46,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:49,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 14:14:54,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:14:55,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:14:57,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 14:14:57,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:14:59,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 14:14:59,945 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 14:15:01,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:15:05,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 14:15:06,641 INFO [train.py:1046] (1/4) Epoch 48, batch 2800, loss[loss=0.1489, simple_loss=0.2261, pruned_loss=0.03588, over 15340.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.036, over 4702703.89 frames. ], batch size: 33, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:15:06,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:06,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:08,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 14:15:08,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:08,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:09,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:15:09,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1683133.3333333333, ans=0.125 2023-10-04 14:15:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 14:15:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 14:15:15,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:16,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:15:16,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:15:17,654 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.08 vs. limit=22.5 2023-10-04 14:15:19,684 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.049e+02 2.249e+02 2.706e+02 5.185e+02, threshold=4.498e+02, percent-clipped=5.0 2023-10-04 14:15:19,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:15:21,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 14:15:24,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 14:15:24,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-10-04 14:15:25,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=15.0 2023-10-04 14:15:27,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:27,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:15:27,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:15:30,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:15:30,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:30,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:15:32,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:15:41,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:15:42,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:45,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:15:47,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:15:47,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:15:51,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:15:52,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 14:15:53,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:54,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:15:54,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:15:57,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:16:00,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:16:02,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:16:02,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:16:02,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:16:05,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:16:06,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:16:06,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:16:06,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 14:16:06,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:09,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:16:09,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:10,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 14:16:12,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:12,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:16:12,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 14:16:13,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 14:16:17,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1683400.0, ans=0.04949747468305833 2023-10-04 14:16:19,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:16:19,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:16:19,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:16:20,886 INFO [train.py:1046] (1/4) Epoch 48, batch 2850, loss[loss=0.1531, simple_loss=0.2375, pruned_loss=0.0343, over 24676.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.03619, over 4711728.37 frames. ], batch size: 65, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:16:22,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:16:25,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:16:26,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:16:26,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:16:28,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:16:28,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:16:29,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:16:31,228 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 14:16:35,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 14:16:35,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:16:38,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 14:16:39,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:42,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 14:16:43,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 14:16:43,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:47,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1683533.3333333333, ans=0.125 2023-10-04 14:16:57,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:16:58,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:16:58,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:17:00,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:17:00,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:17:00,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:17:01,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:17:02,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 14:17:06,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:17:06,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:17:08,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:17:08,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:10,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:17:10,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:12,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:13,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:17:15,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:17:15,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:17,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:21,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:17:24,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:17:24,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1683733.3333333333, ans=0.125 2023-10-04 14:17:25,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 14:17:25,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 14:17:27,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. 
Duration: 0.7333125 2023-10-04 14:17:28,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:17:28,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 14:17:28,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:17:30,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:17:30,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:17:30,297 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:17:30,297 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 14:17:31,507 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 14:17:31,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:17:32,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:34,201 INFO [train.py:1046] (1/4) Epoch 48, batch 2900, loss[loss=0.1588, simple_loss=0.2428, pruned_loss=0.03739, over 24688.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2336, pruned_loss=0.03612, over 4711773.06 frames. 
], batch size: 73, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:17:35,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:17:36,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:17:37,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:17:39,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 14:17:41,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:41,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 14:17:43,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 14:17:43,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:17:43,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:17:47,415 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.112e+02 2.348e+02 2.859e+02 4.205e+02, threshold=4.696e+02, percent-clipped=0.0 2023-10-04 14:17:47,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:17:47,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:17:49,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1683866.6666666667, ans=0.125 2023-10-04 14:17:50,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:17:52,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:55,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:17:55,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 14:17:55,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:17:56,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:57,364 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.36 vs. limit=12.0 2023-10-04 14:17:59,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 14:17:59,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 14:18:02,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:18:02,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 14:18:03,605 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:18:05,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:18:05,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:18:08,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:18:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:18:10,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.36 vs. limit=15.0 2023-10-04 14:18:13,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:18:16,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:18,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 14:18:18,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 14:18:18,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 14:18:23,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:18:24,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 14:18:26,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:18:29,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:18:37,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:18:38,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:18:39,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 14:18:42,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:42,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 14:18:43,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:18:43,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:18:47,734 INFO [train.py:1046] (1/4) Epoch 48, batch 2950, loss[loss=0.1584, simple_loss=0.234, pruned_loss=0.04134, over 23907.00 frames. 
], tot_loss[loss=0.1533, simple_loss=0.2341, pruned_loss=0.03623, over 4715317.47 frames. ], batch size: 196, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:18:51,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:18:52,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 14:18:52,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:18:52,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:54,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:18:55,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:18:55,808 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 14:18:57,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 14:18:57,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:18:57,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:19:04,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 14:19:05,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:19:08,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:19:10,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:19:12,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:19:12,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:19:15,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:19:16,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:19:16,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:19:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 14:19:19,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1684266.6666666667, ans=0.0 2023-10-04 14:19:22,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 14:19:24,271 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. 
Duration: 0.512 2023-10-04 14:19:24,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:19:25,828 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 14:19:27,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 14:19:27,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:19:27,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:19:27,793 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 14:19:27,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:19:30,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 14:19:31,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:19:31,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:19:34,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:19:37,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 14:19:37,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:37,317 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 14:19:37,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:19:37,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 14:19:42,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:44,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:19:44,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 14:19:44,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:19:46,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1684400.0, ans=0.1 2023-10-04 14:19:47,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 14:19:51,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:19:51,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:19:51,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:19:54,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:54,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:19:56,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:19:56,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:19:56,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:19:58,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:19:58,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:19:59,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:20:01,170 INFO [train.py:1046] (1/4) Epoch 48, batch 3000, loss[loss=0.1558, simple_loss=0.2284, pruned_loss=0.0416, over 23759.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.03659, over 4713994.53 frames. 
], batch size: 232, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:20:01,170 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 14:20:13,598 INFO [train.py:1078] (1/4) Epoch 48, validation: loss=0.3623, simple_loss=0.2785, pruned_loss=0.223, over 1125622.00 frames. 2023-10-04 14:20:13,598 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 14:20:13,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:20:13,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 14:20:13,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:20:16,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:20:17,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:20:19,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1684466.6666666667, ans=0.125 2023-10-04 14:20:21,003 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 14:20:21,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 14:20:22,464 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:20:22,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 14:20:24,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 14:20:24,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:20:27,082 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.063e+02 2.332e+02 2.863e+02 4.745e+02, threshold=4.665e+02, percent-clipped=1.0 2023-10-04 14:20:30,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:20:40,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:20:47,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 14:20:49,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:20:51,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:20:51,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:20:51,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:20:52,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:20:53,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. 
Duration: 0.85 2023-10-04 14:20:54,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 14:20:55,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:20:57,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:20:59,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:20:59,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:21:00,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:00,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:21:02,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1684666.6666666667, ans=0.0 2023-10-04 14:21:03,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1684666.6666666667, ans=0.125 2023-10-04 14:21:04,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:21:04,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:21:04,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 14:21:06,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:21:10,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 14:21:11,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:21:11,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:11,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:21:15,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:16,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:17,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 14:21:17,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 14:21:17,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1684733.3333333333, ans=0.0 2023-10-04 14:21:18,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:21:18,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-10-04 14:21:18,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:21:19,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1684733.3333333333, ans=0.125 2023-10-04 14:21:20,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 14:21:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:21:25,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:21:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 14:21:25,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1684733.3333333333, ans=0.0 2023-10-04 14:21:27,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 14:21:27,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:21:27,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:21:29,219 INFO [train.py:1046] (1/4) Epoch 48, batch 3050, loss[loss=0.1468, simple_loss=0.2334, pruned_loss=0.03016, over 24476.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2357, pruned_loss=0.03687, over 4709876.31 frames. 
], batch size: 63, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:21:30,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:30,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:21:30,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:30,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:21:32,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 14:21:33,609 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:21:36,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:21:36,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:21:39,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:42,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 14:21:42,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.34 vs. 
limit=15.0 2023-10-04 14:21:45,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 14:21:46,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 14:21:46,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:21:51,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:21:55,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:55,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:21:55,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:21:55,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1684866.6666666667, ans=0.0 2023-10-04 14:22:00,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:22:00,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:22:01,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:01,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:22:01,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:22:04,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:22:05,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:08,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:08,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 14:22:09,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:22:09,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:22:12,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:22:13,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:22:13,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:22:14,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:17,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:22:17,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:23,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:25,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:22:25,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:27,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:22:29,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:22:30,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:22:30,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 14:22:31,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:22:31,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:34,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 14:22:35,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 14:22:41,466 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:43,325 INFO [train.py:1046] (1/4) Epoch 48, batch 3100, loss[loss=0.1461, simple_loss=0.2267, pruned_loss=0.03277, over 24615.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2357, pruned_loss=0.03687, over 4708395.87 frames. ], batch size: 60, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:22:43,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:22:44,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:22:46,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 14:22:49,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 14:22:50,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 14:22:50,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:22:53,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:22:53,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:57,527 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.59 vs. 
limit=15.0 2023-10-04 14:22:58,167 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.071e+02 2.302e+02 2.680e+02 4.838e+02, threshold=4.605e+02, percent-clipped=1.0 2023-10-04 14:22:58,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 14:23:02,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:06,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 14:23:11,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:23:11,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:12,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:23:13,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:23:14,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 14:23:17,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:23:17,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 14:23:17,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 14:23:18,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:18,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 14:23:20,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:23:21,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:23:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 14:23:23,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 14:23:24,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:25,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:28,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:23:28,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:28,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:23:31,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 14:23:31,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:23:33,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:23:33,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:23:33,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:33,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:23:38,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:23:39,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 14:23:41,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:23:42,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 14:23:43,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1685400.0, ans=0.125 2023-10-04 14:23:44,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:23:44,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:23:44,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 14:23:54,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1685400.0, ans=0.0 2023-10-04 14:23:56,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 14:23:57,943 INFO [train.py:1046] (1/4) Epoch 48, batch 3150, loss[loss=0.1314, simple_loss=0.2128, pruned_loss=0.02507, over 21490.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.0362, over 4707545.33 frames. ], batch size: 47, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:23:59,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:23:59,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:00,865 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:24:00,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:24:02,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 14:24:03,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:04,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. 
Duration: 0.636375 2023-10-04 14:24:06,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 14:24:08,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:09,778 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 14:24:12,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 14:24:12,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:24:12,783 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 14:24:14,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 14:24:14,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 14:24:14,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 14:24:16,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 14:24:16,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:16,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:24:17,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:18,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 14:24:19,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1685533.3333333333, ans=0.2 2023-10-04 14:24:20,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:20,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:20,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:24:21,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1685533.3333333333, ans=0.125 2023-10-04 14:24:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:24:24,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.73 vs. limit=15.0 2023-10-04 14:24:29,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 14:24:29,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 14:24:32,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:24:33,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:24:33,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 14:24:36,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 14:24:36,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:24:37,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:24:37,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:24:39,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:39,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:24:39,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:24:40,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:24:40,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-10-04 14:24:42,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:24:42,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:42,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:24:42,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:24:43,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 14:24:44,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1685666.6666666667, ans=0.125 2023-10-04 14:24:45,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:24:46,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 14:24:46,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:48,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 14:24:48,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.29 vs. 
limit=15.0 2023-10-04 14:24:49,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 14:24:49,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:24:50,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:24:50,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 14:24:52,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 14:24:52,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:56,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:24:57,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:57,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:25:03,000 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:25:03,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:04,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-10-04 14:25:08,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1685733.3333333333, ans=0.1 2023-10-04 14:25:09,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:25:09,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:25:12,334 INFO [train.py:1046] (1/4) Epoch 48, batch 3200, loss[loss=0.1492, simple_loss=0.22, pruned_loss=0.03919, over 23539.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2318, pruned_loss=0.03562, over 4695311.07 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:25:12,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:13,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:25:13,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 14:25:16,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:25:19,792 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.76 vs. limit=15.0 2023-10-04 14:25:21,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:25:25,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:25:27,012 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.180e+02 2.542e+02 3.306e+02 4.972e+02, threshold=5.085e+02, percent-clipped=5.0 2023-10-04 14:25:33,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:25:35,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1685866.6666666667, ans=0.2 2023-10-04 14:25:41,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 14:25:43,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:25:43,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1685933.3333333333, ans=0.125 2023-10-04 14:25:46,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 14:25:47,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:25:52,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:25:52,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:25:54,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 14:25:56,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1686000.0, ans=0.0 2023-10-04 14:25:57,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 14:25:58,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 14:26:00,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 14:26:04,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 14:26:06,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:26:14,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:14,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:26:14,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:14,344 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 14:26:14,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:26:19,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:26:20,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 14:26:20,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 14:26:21,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 14:26:23,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 14:26:24,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:26:26,711 INFO [train.py:1046] (1/4) Epoch 48, batch 3250, loss[loss=0.1508, simple_loss=0.2334, pruned_loss=0.03412, over 24444.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2319, pruned_loss=0.03539, over 4703057.10 frames. ], batch size: 63, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:26:26,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:26:28,102 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 14:26:28,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:26:28,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:28,248 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 14:26:32,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 14:26:35,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:26:40,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:26:41,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 14:26:43,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:26:43,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:43,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:26:44,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:26:44,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:26:47,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:47,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:26:48,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:49,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:26:49,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:49,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:26:49,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1686200.0, ans=0.07 2023-10-04 14:26:52,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:26:53,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:26:54,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.04 vs. limit=15.0 2023-10-04 14:26:56,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:56,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:57,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:59,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:26:59,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:27:04,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. 
Duration: 0.56 2023-10-04 14:27:05,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:27:05,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:27:06,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:06,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:27:12,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:27:17,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1686333.3333333333, ans=0.05 2023-10-04 14:27:21,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:27:21,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:21,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 14:27:21,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:27:21,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 14:27:22,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:24,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 14:27:24,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 14:27:24,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1686400.0, ans=0.125 2023-10-04 14:27:25,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-10-04 14:27:26,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:27:27,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:27,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:27:27,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 14:27:29,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:27:30,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1686400.0, ans=0.125 2023-10-04 14:27:30,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1686400.0, ans=0.1 2023-10-04 14:27:33,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:27:33,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:27:36,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 14:27:36,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:27:38,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1686400.0, ans=0.125 2023-10-04 14:27:39,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:27:39,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 14:27:39,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1686466.6666666667, ans=0.0 2023-10-04 14:27:40,901 INFO [train.py:1046] (1/4) Epoch 48, batch 3300, loss[loss=0.1685, simple_loss=0.2575, pruned_loss=0.03981, over 24371.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2329, pruned_loss=0.03543, over 4715360.39 frames. 
], batch size: 77, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:27:43,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:27:43,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 14:27:45,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 14:27:46,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 14:27:47,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:50,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:27:51,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:27:51,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:52,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:27:53,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:27:56,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:27:57,694 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.042e+02 2.235e+02 2.474e+02 3.621e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-04 14:27:57,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:28:02,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 14:28:02,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:02,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:04,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:04,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1686533.3333333333, ans=0.0 2023-10-04 14:28:05,637 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 14:28:05,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:05,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:28:07,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:28:07,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:28:08,384 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 14:28:11,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:28:11,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:28:13,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:13,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 14:28:14,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 14:28:16,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:17,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:28:18,904 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 14:28:20,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 14:28:21,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:28:23,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. 
Duration: 0.39 2023-10-04 14:28:26,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:28:28,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:28:28,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:28:30,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:30,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:30,957 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:28:32,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:28:34,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:28:34,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:35,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-10-04 14:28:36,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:28:37,528 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. 
Duration: 0.896 2023-10-04 14:28:37,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 14:28:39,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:28:39,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1686666.6666666667, ans=15.0 2023-10-04 14:28:40,474 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:28:40,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:41,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:41,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:43,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:28:45,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:45,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:28:45,604 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. 
limit=12.0 2023-10-04 14:28:46,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:47,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:28:50,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 14:28:50,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:51,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:53,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:28:53,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:28:54,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:56,576 INFO [train.py:1046] (1/4) Epoch 48, batch 3350, loss[loss=0.146, simple_loss=0.2271, pruned_loss=0.03244, over 24604.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.234, pruned_loss=0.03596, over 4722321.91 frames. ], batch size: 60, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:28:56,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:56,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 14:28:59,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:28:59,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1686800.0, ans=0.1 2023-10-04 14:29:00,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:02,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:29:05,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:07,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:29:08,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:29:08,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:29:10,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 14:29:11,654 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 14:29:12,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:29:17,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-10-04 14:29:17,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 14:29:18,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:29:18,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:29:20,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:20,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 14:29:20,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:20,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:29:23,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:25,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:25,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:26,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:29:29,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:29:32,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:32,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:37,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:29:38,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:40,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:40,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:41,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:43,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 14:29:44,500 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:29:44,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 14:29:44,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:29:46,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. 
Duration: 0.62 2023-10-04 14:29:46,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:47,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:53,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1687000.0, ans=0.2 2023-10-04 14:29:54,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:56,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 14:29:56,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:29:57,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:29:58,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-10-04 14:29:59,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:30:05,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:30:06,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. 
Duration: 0.88 2023-10-04 14:30:07,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.66 vs. limit=15.0 2023-10-04 14:30:08,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:30:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:30:10,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:11,861 INFO [train.py:1046] (1/4) Epoch 48, batch 3400, loss[loss=0.1564, simple_loss=0.2293, pruned_loss=0.04179, over 23818.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.235, pruned_loss=0.03641, over 4709403.91 frames. ], batch size: 164, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:30:11,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 14:30:11,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:30:11,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 14:30:14,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:30:14,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:30:16,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 14:30:16,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:30:16,194 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 14:30:20,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1687133.3333333333, ans=0.125 2023-10-04 14:30:22,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 14:30:22,255 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 14:30:22,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:26,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:30:26,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:30:27,670 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.768e+02 2.082e+02 2.351e+02 2.856e+02 4.234e+02, threshold=4.702e+02, percent-clipped=0.0 2023-10-04 14:30:27,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:30:29,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 14:30:30,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1687200.0, ans=0.1 2023-10-04 14:30:35,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:30:36,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 14:30:39,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:30:43,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:30:43,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:43,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:30:47,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:30:51,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 14:30:57,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:57,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:57,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-10-04 14:30:57,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:30:57,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1687333.3333333333, ans=0.125 2023-10-04 14:30:58,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:00,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:31:00,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:31:02,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:31:06,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:31:06,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:31:11,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:31:13,329 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 14:31:18,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 14:31:19,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-10-04 14:31:23,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 14:31:23,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1687400.0, ans=0.2 2023-10-04 14:31:26,012 INFO [train.py:1046] (1/4) Epoch 48, batch 3450, loss[loss=0.151, simple_loss=0.2071, pruned_loss=0.04744, over 19898.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.03656, over 4697757.58 frames. ], batch size: 389, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:31:27,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 14:31:27,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:31:28,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:31:28,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 14:31:30,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:31:33,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 14:31:37,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1687466.6666666667, ans=0.1 2023-10-04 14:31:38,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:31:40,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:31:40,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:31:40,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:43,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:46,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1687533.3333333333, ans=0.1 2023-10-04 14:31:50,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 14:31:56,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 14:31:56,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:31:56,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:31:57,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:32:04,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 14:32:04,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:32:08,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:32:08,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:32:09,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:32:11,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:32:12,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 14:32:14,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:32:16,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:32:18,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:32:20,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 14:32:22,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 14:32:28,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:32:29,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1687733.3333333333, ans=0.125 2023-10-04 14:32:30,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:32,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:36,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:37,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:32:37,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:32:39,006 INFO [train.py:1046] (1/4) Epoch 48, batch 3500, loss[loss=0.1449, simple_loss=0.222, pruned_loss=0.03387, over 23238.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2341, pruned_loss=0.03576, over 4714466.05 frames. ], batch size: 105, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:32:39,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:32:42,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1687800.0, ans=0.125 2023-10-04 14:32:43,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:32:46,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:32:46,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 14:32:49,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:32:52,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:32:53,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:53,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 14:32:54,996 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.040e+02 2.265e+02 2.652e+02 4.123e+02, threshold=4.530e+02, percent-clipped=0.0 2023-10-04 14:32:58,507 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:32:59,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:32:59,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:32:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:00,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. 
Duration: 0.52725 2023-10-04 14:33:01,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:02,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:33:02,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 14:33:05,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:05,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:33:07,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:33:08,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1687933.3333333333, ans=0.2 2023-10-04 14:33:09,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.02 vs. limit=15.0 2023-10-04 14:33:12,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:12,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 14:33:12,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 14:33:15,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:33:15,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:33:16,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:16,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1687933.3333333333, ans=0.1 2023-10-04 14:33:18,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:33:18,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:33:19,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 14:33:20,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 14:33:20,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 14:33:20,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:33:23,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:25,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 14:33:25,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:33:28,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:33:30,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:33:34,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:33:36,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 14:33:36,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 14:33:36,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:33:37,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:33:39,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:33:41,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:44,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 14:33:44,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 14:33:44,424 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:33:46,871 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:46,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 14:33:49,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 14:33:51,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1688066.6666666667, ans=0.125 2023-10-04 14:33:52,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:53,600 INFO [train.py:1046] (1/4) Epoch 48, batch 3550, loss[loss=0.164, simple_loss=0.2383, pruned_loss=0.04485, over 23831.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2323, pruned_loss=0.03559, over 4706676.62 frames. ], batch size: 212, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:33:53,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:33:53,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:33:53,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:33:56,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 14:34:05,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:07,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1688200.0, ans=0.125 2023-10-04 14:34:08,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 14:34:10,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:34:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:34:13,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:13,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:34:13,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:34:18,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:34:19,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:34:19,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:19,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 14:34:20,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:34:22,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1688266.6666666667, ans=0.1 2023-10-04 14:34:25,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:34:25,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:34:28,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:34:28,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:29,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:34:29,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 14:34:29,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:30,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:32,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:34:36,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:34:38,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:34:40,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:34:41,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 14:34:42,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.01 vs. limit=15.0 2023-10-04 14:34:43,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:34:44,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 14:34:44,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:34:47,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:34:47,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:34:49,470 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.95 vs. limit=15.0 2023-10-04 14:34:50,326 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. 
Duration: 0.56 2023-10-04 14:34:51,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:34:51,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1688400.0, ans=0.125 2023-10-04 14:34:55,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:34:56,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 14:34:57,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:00,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:35:01,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 14:35:02,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.57 vs. limit=15.0 2023-10-04 14:35:06,889 INFO [train.py:1046] (1/4) Epoch 48, batch 3600, loss[loss=0.1383, simple_loss=0.2242, pruned_loss=0.02622, over 24336.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2318, pruned_loss=0.03532, over 4715656.36 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:35:10,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 14:35:10,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:35:11,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:35:14,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:14,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:16,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:35:18,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:35:19,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:20,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:35:21,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=15.0 2023-10-04 14:35:21,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:35:23,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:23,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. 
Duration: 0.89 2023-10-04 14:35:25,592 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 2.134e+02 2.513e+02 3.119e+02 5.278e+02, threshold=5.026e+02, percent-clipped=3.0 2023-10-04 14:35:25,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:35:27,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:29,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:35:32,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:35:32,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:35:34,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:35:34,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 14:35:34,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:35:36,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:36,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 14:35:40,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:35:41,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:35:42,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:35:43,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 14:35:45,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1688600.0, ans=0.1 2023-10-04 14:35:49,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:35:50,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:35:52,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 14:35:56,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:35:59,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:02,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:06,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 14:36:06,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:36:06,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 14:36:08,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 14:36:10,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 14:36:11,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:36:13,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:36:14,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 14:36:14,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:36:14,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:36:14,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:36:15,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 14:36:15,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. 
Duration: 0.45 2023-10-04 14:36:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:18,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 14:36:21,509 INFO [train.py:1046] (1/4) Epoch 48, batch 3650, loss[loss=0.1608, simple_loss=0.235, pruned_loss=0.04331, over 23634.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2326, pruned_loss=0.03564, over 4717689.56 frames. ], batch size: 232, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:36:22,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 14:36:24,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:36:28,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 14:36:29,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 14:36:32,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1688800.0, ans=0.2 2023-10-04 14:36:35,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:36:35,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:36:36,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 14:36:40,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:36:40,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:36:42,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 14:36:43,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:36:44,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:36:44,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 14:36:46,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:36:46,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:36:46,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:36:49,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:36:50,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 14:36:52,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. 
Duration: 0.87 2023-10-04 14:36:53,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:36:55,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 14:36:57,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:36:57,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:37:03,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:37:04,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:37:05,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:37:06,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:37:06,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:37:09,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:37:12,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:37:12,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:37:12,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1689000.0, ans=0.125 2023-10-04 14:37:14,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:37:16,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:37:17,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:37:17,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:37:21,795 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 14:37:22,452 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.22 vs. limit=15.0 2023-10-04 14:37:25,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:37:25,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:37:27,307 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:37:27,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:28,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 14:37:29,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:31,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 14:37:31,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:34,170 INFO [train.py:1046] (1/4) Epoch 48, batch 3700, loss[loss=0.1385, simple_loss=0.2318, pruned_loss=0.02258, over 24565.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2333, pruned_loss=0.03576, over 4727336.93 frames. ], batch size: 71, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:37:34,244 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:37:35,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:37:36,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:37:39,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:39,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 14:37:39,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:41,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-04 14:37:42,949 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:37:46,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:37:51,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:37:51,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:37:52,597 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.073e+02 2.286e+02 2.591e+02 3.912e+02, threshold=4.572e+02, percent-clipped=0.0 2023-10-04 14:37:52,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:37:52,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:53,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:37:54,577 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-10-04 14:37:55,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:37:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-10-04 14:38:01,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1689200.0, ans=0.125 2023-10-04 14:38:04,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:38:04,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:38:05,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:38:05,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 14:38:05,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:38:10,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:12,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 14:38:13,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:13,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:38:17,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:17,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 14:38:19,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:38:24,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:38:24,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 14:38:24,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:38:24,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 14:38:29,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:38:31,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:38:33,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:38:33,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 14:38:35,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:38:35,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:38:36,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 14:38:36,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:38:39,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:38:40,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 14:38:40,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1689400.0, ans=0.2 2023-10-04 14:38:41,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 14:38:43,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:38:43,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:38:45,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:38:45,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:38:46,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:48,519 INFO [train.py:1046] (1/4) Epoch 48, batch 3750, loss[loss=0.1535, simple_loss=0.2448, pruned_loss=0.03106, over 24572.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2346, pruned_loss=0.03578, over 4733771.74 frames. 
], batch size: 71, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:38:49,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:38:49,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:38:51,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 14:38:53,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1689466.6666666667, ans=0.125 2023-10-04 14:38:54,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 14:38:56,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:38:56,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 14:38:57,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:38:58,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:39:00,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:39:01,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 14:39:03,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=12.0 2023-10-04 14:39:04,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:39:07,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:39:08,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:39:10,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:39:12,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:39:12,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 14:39:12,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:39:16,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:39:16,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:39:19,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. 
Duration: 0.55 2023-10-04 14:39:21,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1689600.0, ans=0.125 2023-10-04 14:39:22,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 14:39:23,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1689600.0, ans=0.0 2023-10-04 14:39:24,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:39:24,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:39:27,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:39:31,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:39:32,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:39:32,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1689666.6666666667, ans=0.09899494936611666 2023-10-04 14:39:35,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 14:39:36,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:39:40,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 14:39:42,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:39:46,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:39:49,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:39:51,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:39:54,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:39:55,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:39:56,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:39:57,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1689733.3333333333, ans=0.125 2023-10-04 14:40:00,941 INFO [train.py:1046] (1/4) Epoch 48, batch 3800, loss[loss=0.1386, simple_loss=0.2158, pruned_loss=0.03067, over 23641.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2347, pruned_loss=0.0362, over 4723870.89 frames. ], batch size: 134, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:40:03,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:40:07,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:40:07,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:40:09,245 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 14:40:10,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:40:12,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:40:12,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:40:13,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1689800.0, ans=0.125 2023-10-04 14:40:14,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 14:40:14,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:16,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:40:17,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:40:17,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:40:17,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:40:18,859 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.051e+02 2.338e+02 2.944e+02 4.276e+02, threshold=4.676e+02, percent-clipped=0.0 2023-10-04 14:40:20,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 14:40:24,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 14:40:26,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:40:27,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:40:29,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:40:30,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:40:30,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:40:30,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:32,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:33,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:39,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-04 14:40:39,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 14:40:39,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1689933.3333333333, ans=0.2 2023-10-04 14:40:40,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:40:47,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:40:53,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:40:54,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1690000.0, ans=0.0 2023-10-04 14:40:54,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1690000.0, ans=0.2 2023-10-04 14:40:55,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 14:40:58,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 14:40:58,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:00,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:41:01,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:41:04,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 14:41:07,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 14:41:07,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1690066.6666666667, ans=0.125 2023-10-04 14:41:08,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 14:41:08,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:09,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.17 vs. limit=6.0 2023-10-04 14:41:10,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:41:14,183 INFO [train.py:1046] (1/4) Epoch 48, batch 3850, loss[loss=0.1332, simple_loss=0.1997, pruned_loss=0.03331, over 22704.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2337, pruned_loss=0.03614, over 4707973.54 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:41:14,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:41:14,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:41:19,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 14:41:19,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 14:41:21,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:41:21,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:25,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:41:27,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:27,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1690133.3333333333, ans=0.2 2023-10-04 14:41:29,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:41:29,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 14:41:36,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:38,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:39,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:41:41,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 14:41:42,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1690266.6666666667, ans=0.1 2023-10-04 14:41:44,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:45,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:41:46,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:46,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:41:46,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:41:50,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:41:51,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:51,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:41:52,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 14:41:52,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 14:41:53,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:41:53,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:56,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:41:56,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:56,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 14:41:56,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1690266.6666666667, ans=0.0 2023-10-04 14:41:59,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 14:42:00,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:01,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 14:42:03,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:42:08,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:08,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:42:13,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:42:15,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 14:42:16,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1690400.0, ans=0.2 2023-10-04 14:42:18,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 14:42:21,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:21,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:24,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:42:24,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:42:24,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:25,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:25,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:42:25,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 14:42:26,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 14:42:28,217 INFO [train.py:1046] (1/4) Epoch 48, batch 3900, loss[loss=0.1515, simple_loss=0.2246, pruned_loss=0.03915, over 22757.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.232, pruned_loss=0.03572, over 4713181.08 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:42:28,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 14:42:30,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:30,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:31,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:42:31,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:31,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1690466.6666666667, ans=0.125 2023-10-04 14:42:33,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:42:34,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:34,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:35,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:42:35,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 14:42:35,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:36,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1690466.6666666667, ans=0.125 2023-10-04 14:42:39,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.15 vs. limit=15.0 2023-10-04 14:42:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:42:41,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:42:41,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:42:42,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:42:43,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1690533.3333333333, ans=0.1 2023-10-04 14:42:44,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:42:44,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 14:42:45,739 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.750e+02 2.072e+02 2.246e+02 2.580e+02 4.191e+02, threshold=4.491e+02, percent-clipped=0.0 2023-10-04 14:42:47,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:42:47,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 14:42:47,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:42:49,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 14:42:50,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:51,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 14:42:53,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 14:42:56,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1690600.0, ans=0.125 2023-10-04 14:42:57,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:42:57,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:42:57,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 14:42:59,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:03,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:43:04,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:43:06,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:43:06,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:43:07,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:43:13,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:43:13,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:43:20,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:43:22,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 14:43:25,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1690733.3333333333, ans=0.125 2023-10-04 14:43:31,278 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:43:34,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:34,146 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 14:43:35,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 14:43:35,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:35,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 14:43:38,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:43:38,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 14:43:41,089 INFO [train.py:1046] (1/4) Epoch 48, batch 3950, loss[loss=0.1474, simple_loss=0.2364, pruned_loss=0.02921, over 24624.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2321, pruned_loss=0.03518, over 4720556.45 frames. 
], batch size: 68, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:43:41,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1690800.0, ans=0.5 2023-10-04 14:43:45,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:43:45,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 14:43:47,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:43:48,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:43:49,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.58 vs. limit=15.0 2023-10-04 14:43:50,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:43:55,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1690866.6666666667, ans=0.1 2023-10-04 14:43:57,844 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 14:43:59,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:43:59,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 14:43:59,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. 
Duration: 0.768 2023-10-04 14:43:59,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:44:02,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.32 vs. limit=22.5 2023-10-04 14:44:02,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:44:02,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:44:03,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:44:05,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 14:44:08,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:44:09,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:44:09,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:44:09,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:44:09,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1690933.3333333333, ans=0.125 2023-10-04 14:44:11,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 14:44:21,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:44:21,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:44:26,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 14:44:30,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 14:44:30,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 14:44:31,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:44:31,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:44:39,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:44:40,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:44:40,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:44:40,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:44:40,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-10-04 14:44:46,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:44:46,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:44:49,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.80 vs. limit=15.0 2023-10-04 14:44:50,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 14:44:50,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1691066.6666666667, ans=0.2 2023-10-04 14:44:55,220 INFO [train.py:1046] (1/4) Epoch 48, batch 4000, loss[loss=0.1493, simple_loss=0.243, pruned_loss=0.02785, over 24428.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2332, pruned_loss=0.03577, over 4700613.19 frames. ], batch size: 69, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:44:59,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:03,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.92 vs. limit=15.0 2023-10-04 14:45:05,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:09,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:45:09,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 14:45:10,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:11,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 14:45:11,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:45:11,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 14:45:11,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:45:12,667 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 2.159e+02 2.640e+02 3.092e+02 4.998e+02, threshold=5.279e+02, percent-clipped=1.0 2023-10-04 14:45:12,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 14:45:15,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:45:17,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:45:17,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:45:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:45:17,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:45:17,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1691200.0, ans=0.0 2023-10-04 14:45:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 14:45:19,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:45:21,095 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 14:45:22,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:45:23,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:25,126 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 14:45:26,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:45:26,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:45:29,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1691266.6666666667, ans=0.0 2023-10-04 14:45:31,903 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 14:45:31,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:45:33,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:45:35,219 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 14:45:36,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:45:37,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 14:45:37,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:45:39,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:39,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:45:42,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:45:43,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:45:43,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:45:46,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 14:45:46,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 14:45:47,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 14:45:52,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:45:55,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 14:45:58,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:45:59,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:46:00,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:46:02,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:05,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:46:06,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:46:06,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 14:46:08,361 INFO [train.py:1046] (1/4) Epoch 48, batch 4050, loss[loss=0.1679, simple_loss=0.2432, pruned_loss=0.04628, over 23743.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2336, pruned_loss=0.03594, over 4710108.46 frames. 
], batch size: 212, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:46:08,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:46:09,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:09,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:46:11,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:46:12,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:46:13,091 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.48 vs. limit=15.0 2023-10-04 14:46:14,381 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.88 vs. limit=15.0 2023-10-04 14:46:17,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:46:20,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:46:20,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:46:22,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 14:46:24,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:46:27,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:28,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:46:30,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 14:46:31,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 14:46:31,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1691533.3333333333, ans=0.125 2023-10-04 14:46:31,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1691533.3333333333, ans=0.2 2023-10-04 14:46:32,741 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 14:46:34,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:46:40,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 14:46:41,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:46:43,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 14:46:45,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:46,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:46:46,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:48,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1691600.0, ans=0.09899494936611666 2023-10-04 14:46:50,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:46:53,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 14:46:53,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:46:54,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:46:56,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 14:47:00,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:47:03,631 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.88 vs. 
limit=15.0 2023-10-04 14:47:07,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 14:47:08,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:47:08,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:47:10,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 14:47:10,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 14:47:10,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:11,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:47:13,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:13,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:47:18,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 14:47:18,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 14:47:21,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. 
Duration: 0.94 2023-10-04 14:47:21,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 14:47:21,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:23,032 INFO [train.py:1046] (1/4) Epoch 48, batch 4100, loss[loss=0.2027, simple_loss=0.2708, pruned_loss=0.06728, over 19639.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.235, pruned_loss=0.03692, over 4708285.96 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:47:23,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:23,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:23,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:47:23,229 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 14:47:25,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:47:27,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:47:27,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:28,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 14:47:33,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:47:33,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:47:33,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:47:33,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 14:47:36,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:36,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:47:36,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:47:36,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:47:36,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-10-04 14:47:36,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1691866.6666666667, ans=0.025 2023-10-04 14:47:39,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1691866.6666666667, ans=0.0 2023-10-04 14:47:40,385 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.712e+02 2.089e+02 2.278e+02 2.511e+02 3.603e+02, threshold=4.556e+02, percent-clipped=0.0 2023-10-04 14:47:41,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:47:42,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 14:47:43,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:47:47,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:47:47,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 14:47:47,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:47:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:47:48,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:47:51,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. 
Duration: 0.96 2023-10-04 14:47:52,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:47:54,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:47:56,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 14:47:56,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:58,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:48:00,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:48:05,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:10,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:48:10,797 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.28 vs. limit=15.0 2023-10-04 14:48:11,497 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:48:20,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:48:20,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:48:23,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:48:26,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:48:29,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1692066.6666666667, ans=0.1 2023-10-04 14:48:30,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:48:31,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:48:31,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:48:31,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:48:34,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 14:48:34,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1692133.3333333333, ans=0.2 2023-10-04 14:48:35,908 INFO [train.py:1046] (1/4) Epoch 48, batch 4150, loss[loss=0.1459, simple_loss=0.238, pruned_loss=0.02687, over 24581.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2345, pruned_loss=0.03679, over 4706225.74 frames. 
], batch size: 71, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:48:35,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:36,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 14:48:37,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 14:48:37,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 14:48:37,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1692133.3333333333, ans=0.1 2023-10-04 14:48:39,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:39,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1692133.3333333333, ans=0.0 2023-10-04 14:48:43,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:48:43,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:48,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:48:49,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:48:49,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:48:50,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:48:50,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:48:52,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:48:56,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:56,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1692200.0, ans=0.95 2023-10-04 14:48:58,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.40 vs. limit=10.0 2023-10-04 14:49:00,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:49:01,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 14:49:03,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 14:49:03,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 14:49:04,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 14:49:04,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:49:04,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:49:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:10,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:49:13,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 14:49:15,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:49:17,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:49:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 14:49:17,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:49:19,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 14:49:22,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 14:49:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:49:24,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:24,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1692333.3333333333, ans=0.125 2023-10-04 14:49:25,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 14:49:25,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:49:25,539 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:49:28,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:49:31,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 14:49:31,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:31,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:49:31,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:49:31,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. 
Duration: 0.77 2023-10-04 14:49:31,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:49:31,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1692333.3333333333, ans=0.0 2023-10-04 14:49:32,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:49:32,631 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:49:33,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:33,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 14:49:34,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:49:41,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:49:42,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1692400.0, ans=0.1 2023-10-04 14:49:44,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 14:49:45,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:49:48,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:49:50,113 INFO [train.py:1046] (1/4) Epoch 48, batch 4200, loss[loss=0.1488, simple_loss=0.2319, pruned_loss=0.03286, over 24337.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2333, pruned_loss=0.03668, over 4702291.50 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:49:50,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:49:50,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:49:50,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:49:53,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 14:49:54,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 14:49:56,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:49:58,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:49:59,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1692466.6666666667, ans=0.025 2023-10-04 14:50:00,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.47 vs. limit=6.0 2023-10-04 14:50:01,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:50:03,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:50:06,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:50:06,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:06,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 14:50:06,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:50:07,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1692533.3333333333, ans=0.2 2023-10-04 14:50:07,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1692533.3333333333, ans=0.125 2023-10-04 14:50:08,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:08,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:50:08,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 14:50:09,685 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.019e+02 2.306e+02 2.735e+02 4.885e+02, threshold=4.613e+02, percent-clipped=1.0 2023-10-04 14:50:09,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:50:11,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 14:50:11,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:14,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1692533.3333333333, ans=0.1 2023-10-04 14:50:16,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:50:17,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:50:20,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:50:21,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:50:22,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:50:22,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-10-04 14:50:22,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:50:23,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1692600.0, ans=0.125 2023-10-04 14:50:24,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:50:28,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:50:30,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1692600.0, ans=0.2 2023-10-04 14:50:31,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:50:35,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1692666.6666666667, ans=0.125 2023-10-04 14:50:39,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:50:42,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 14:50:46,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:50:52,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:50:53,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:50:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 14:51:01,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:51:03,758 INFO [train.py:1046] (1/4) Epoch 48, batch 4250, loss[loss=0.1571, simple_loss=0.2436, pruned_loss=0.03524, over 24358.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2324, pruned_loss=0.03635, over 4701632.89 frames. ], batch size: 77, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:51:03,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:51:03,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:51:06,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:10,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:51:12,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 14:51:12,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:51:13,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:18,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:51:21,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:21,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:24,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:51:24,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:51:25,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:25,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:27,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:29,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:51:32,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:51:33,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 14:51:37,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 14:51:37,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:51:37,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:51:38,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:39,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:51:39,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:41,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.94 vs. limit=12.0 2023-10-04 14:51:42,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:51:43,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:51:47,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:51:47,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1693000.0, ans=0.125 2023-10-04 14:51:48,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:51:50,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. 
Duration: 0.65 2023-10-04 14:51:50,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:51:51,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 14:51:52,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:51:54,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:51:55,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:55,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:51:59,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 14:52:01,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:52:02,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:52:05,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 14:52:05,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1693066.6666666667, ans=0.025 2023-10-04 14:52:08,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1693066.6666666667, ans=0.2 2023-10-04 14:52:09,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:52:10,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:52:12,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:52:13,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:52:15,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:52:16,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:52:16,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 14:52:17,686 INFO [train.py:1046] (1/4) Epoch 48, batch 4300, loss[loss=0.1417, simple_loss=0.2194, pruned_loss=0.03198, over 24271.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2319, pruned_loss=0.03598, over 4706058.08 frames. ], batch size: 56, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:52:17,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:52:20,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1693133.3333333333, ans=0.125 2023-10-04 14:52:20,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1693133.3333333333, ans=0.125 2023-10-04 14:52:21,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:52:22,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:52:27,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:52:33,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:52:33,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 14:52:34,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1693200.0, ans=0.0 2023-10-04 14:52:34,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1693200.0, ans=0.1 2023-10-04 14:52:35,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:52:35,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 14:52:35,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1693200.0, ans=0.125 2023-10-04 14:52:36,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:52:36,720 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 14:52:37,909 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.092e+02 2.342e+02 2.811e+02 4.039e+02, threshold=4.683e+02, percent-clipped=0.0 2023-10-04 14:52:40,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:52:42,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:52:44,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 14:52:44,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:52:45,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1693200.0, ans=0.125 2023-10-04 14:52:46,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 14:52:49,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:52:49,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 14:52:52,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:52:52,314 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:52:52,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1693266.6666666667, ans=0.0 2023-10-04 14:52:53,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:52:53,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:52:56,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:52:56,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 14:52:58,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 14:52:59,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:53:03,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:03,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:53:03,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:53:03,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:53:03,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 14:53:03,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 14:53:03,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 14:53:03,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.39 vs. limit=15.0 2023-10-04 14:53:04,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:53:05,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 14:53:05,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 14:53:09,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:53:11,424 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 14:53:11,515 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:53:12,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:53:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:53:16,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 14:53:16,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:53:16,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:17,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:53:17,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:53:17,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:53:21,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:53:24,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:25,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:25,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:53:30,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. 
Duration: 0.97 2023-10-04 14:53:30,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:53:32,113 INFO [train.py:1046] (1/4) Epoch 48, batch 4350, loss[loss=0.1467, simple_loss=0.234, pruned_loss=0.02974, over 24477.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2331, pruned_loss=0.03633, over 4702405.54 frames. ], batch size: 63, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:53:32,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1693466.6666666667, ans=0.125 2023-10-04 14:53:34,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:53:35,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1693466.6666666667, ans=0.0 2023-10-04 14:53:35,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1693466.6666666667, ans=0.125 2023-10-04 14:53:37,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:40,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:53:40,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:53:44,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:53:49,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 14:53:50,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:53:50,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:53:53,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:53:54,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:53:57,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:54:03,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 14:54:04,192 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:54:05,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:07,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:10,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:12,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 14:54:15,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:54:16,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:54:20,344 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 14:54:20,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1693666.6666666667, ans=0.125 2023-10-04 14:54:22,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:54:22,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:54:24,356 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 14:54:25,664 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 14:54:25,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:54:25,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:26,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:54:28,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:54:29,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:54:29,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:54:31,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 14:54:31,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:31,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:33,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:33,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 14:54:34,661 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 14:54:34,665 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 14:54:34,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 14:54:38,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1693733.3333333333, ans=0.0 2023-10-04 14:54:39,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:54:39,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 14:54:39,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:54:39,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1693733.3333333333, ans=0.04949747468305833 2023-10-04 14:54:40,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:54:41,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 14:54:43,438 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 14:54:43,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:44,705 INFO [train.py:1046] (1/4) Epoch 48, batch 4400, loss[loss=0.192, simple_loss=0.2662, pruned_loss=0.05894, over 19648.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2345, pruned_loss=0.0365, over 4708261.75 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:54:46,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:54:46,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:47,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:50,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-10-04 14:54:50,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 14:54:52,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 14:54:52,379 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 14:54:53,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:54:53,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:54:56,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 14:54:56,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1693800.0, ans=0.125 2023-10-04 14:54:58,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:59,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:59,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. 
Duration: 0.896 2023-10-04 14:55:00,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1693866.6666666667, ans=0.125 2023-10-04 14:55:03,067 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.088e+02 2.355e+02 2.772e+02 4.860e+02, threshold=4.710e+02, percent-clipped=1.0 2023-10-04 14:55:03,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:03,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 14:55:05,064 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 14:55:05,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.41 vs. limit=22.5 2023-10-04 14:55:08,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 14:55:08,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 14:55:09,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 14:55:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:10,049 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:55:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:55:11,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:55:12,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 14:55:12,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 14:55:13,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1693933.3333333333, ans=10.0 2023-10-04 14:55:14,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:17,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:55:17,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:55:19,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:19,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:19,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 14:55:21,108 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. 
Duration: 0.768 2023-10-04 14:55:22,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1693933.3333333333, ans=0.2 2023-10-04 14:55:25,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:31,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:55:33,063 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 14:55:36,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:55:39,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:55:43,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:55:43,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 14:55:44,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:55:44,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:55:44,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:55:45,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 14:55:48,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 14:55:50,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 14:55:51,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 14:55:51,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:55:51,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 14:55:52,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:55:56,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:55:57,899 INFO [train.py:1046] (1/4) Epoch 48, batch 4450, loss[loss=0.169, simple_loss=0.2556, pruned_loss=0.0412, over 24389.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2351, pruned_loss=0.03651, over 4707209.26 frames. ], batch size: 77, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:55:58,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 14:56:02,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:56:04,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 14:56:05,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:56:12,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:12,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:56:13,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1694200.0, ans=0.125 2023-10-04 14:56:14,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=15.0 2023-10-04 14:56:14,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:16,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:56:17,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:56:17,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:56:19,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 14:56:19,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 14:56:19,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:20,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:56:20,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:56:23,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:56:23,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1694200.0, ans=0.0 2023-10-04 14:56:27,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:28,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:30,672 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:56:30,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:56:32,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:56:37,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. 
Duration: 0.5545625 2023-10-04 14:56:37,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1694266.6666666667, ans=0.125 2023-10-04 14:56:38,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 14:56:38,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 14:56:38,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:56:40,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:41,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 14:56:47,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:56:48,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1694333.3333333333, ans=0.125 2023-10-04 14:56:50,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:50,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 14:56:51,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:51,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 14:56:51,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:56:51,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:54,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:56,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:56:56,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 14:56:58,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:57:01,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:57:01,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:57:03,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:57:04,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:57:04,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:57:09,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. 
Duration: 0.85 2023-10-04 14:57:09,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:57:11,981 INFO [train.py:1046] (1/4) Epoch 48, batch 4500, loss[loss=0.1751, simple_loss=0.2473, pruned_loss=0.05146, over 19818.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2355, pruned_loss=0.03674, over 4706861.68 frames. ], batch size: 389, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:57:14,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:57:15,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 14:57:15,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 14:57:18,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:57:22,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:57:22,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:57:23,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:57:23,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:57:25,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 14:57:25,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:57:30,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.084e+02 2.253e+02 2.565e+02 3.939e+02, threshold=4.505e+02, percent-clipped=0.0 2023-10-04 14:57:38,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:57:38,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:57:39,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1694600.0, ans=0.2 2023-10-04 14:57:41,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:57:41,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:57:42,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:57:49,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:57:52,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:57:56,475 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.80 vs. 
limit=15.0 2023-10-04 14:57:57,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:57:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:57:59,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 14:58:01,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:01,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:03,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:03,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:58:05,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:58:06,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 14:58:06,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:58:06,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 14:58:10,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:58:12,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:16,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:58:16,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:58:18,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 14:58:19,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 14:58:19,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 14:58:24,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 14:58:25,458 INFO [train.py:1046] (1/4) Epoch 48, batch 4550, loss[loss=0.1479, simple_loss=0.2238, pruned_loss=0.03599, over 19385.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2348, pruned_loss=0.0362, over 4699413.57 frames. ], batch size: 42, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:58:26,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 14:58:28,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 14:58:31,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:58:31,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:58:35,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:58:37,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:58:39,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:40,828 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:58:40,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:58:40,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:45,186 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:58:45,220 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:58:49,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:58:52,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. 
Duration: 0.85 2023-10-04 14:58:52,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 14:58:52,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:58:55,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 14:58:57,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 14:58:59,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:59:00,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 14:59:02,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:59:06,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:07,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:07,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:59:08,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 14:59:11,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 14:59:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:13,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:59:14,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:59:16,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 14:59:17,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 14:59:17,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:59:17,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 14:59:20,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 14:59:20,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:59:22,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:23,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:59:23,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 14:59:23,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:59:24,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:59:24,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1695066.6666666667, ans=0.125 2023-10-04 14:59:26,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 14:59:27,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:59:27,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 14:59:27,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 14:59:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:59:27,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 14:59:30,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:59:30,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:59:31,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 14:59:31,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:33,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:59:35,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:59:37,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:59:38,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:40,353 INFO [train.py:1046] (1/4) Epoch 48, batch 4600, loss[loss=0.1603, simple_loss=0.2349, pruned_loss=0.04284, over 23799.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.03582, over 4709959.41 frames. ], batch size: 179, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:59:40,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:59:43,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:59:43,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:59:44,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:59:45,823 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. 
Duration: 0.67 2023-10-04 14:59:45,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:59:50,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:59:51,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:59:51,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1695133.3333333333, ans=0.2 2023-10-04 14:59:52,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:57,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1695200.0, ans=0.125 2023-10-04 14:59:58,277 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.165e+02 2.492e+02 2.948e+02 4.714e+02, threshold=4.983e+02, percent-clipped=2.0 2023-10-04 14:59:58,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 14:59:59,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:02,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:06,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:00:06,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:00:10,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 15:00:10,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:00:13,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:00:16,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:16,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:00:17,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1695266.6666666667, ans=0.07 2023-10-04 15:00:19,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:00:21,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 15:00:24,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:00:25,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1695333.3333333333, ans=0.125 2023-10-04 15:00:28,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:28,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:00:31,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:31,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 15:00:31,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:33,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 15:00:33,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:33,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:33,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:35,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:00:36,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:37,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 15:00:37,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 15:00:37,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. 
Duration: 0.82 2023-10-04 15:00:37,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:39,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:00:39,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:40,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:50,840 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.09 vs. limit=22.5 2023-10-04 15:00:51,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:00:51,906 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.92 vs. limit=15.0 2023-10-04 15:00:52,550 INFO [train.py:1046] (1/4) Epoch 48, batch 4650, loss[loss=0.1478, simple_loss=0.2241, pruned_loss=0.03573, over 23786.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2336, pruned_loss=0.03547, over 4722188.36 frames. ], batch size: 164, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:00:54,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:00:54,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 15:00:54,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1695466.6666666667, ans=0.125 2023-10-04 15:00:55,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:00:55,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:55,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:00:56,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:58,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-10-04 15:01:00,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 15:01:02,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:01:05,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 15:01:05,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:01:07,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 15:01:07,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 15:01:08,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 15:01:08,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 15:01:08,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:01:11,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:01:12,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:12,538 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 15:01:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:16,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 15:01:19,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:19,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:01:20,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. 
Duration: 0.77 2023-10-04 15:01:20,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:01:21,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1695600.0, ans=0.09899494936611666 2023-10-04 15:01:23,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:01:26,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:01:32,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:34,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1695600.0, ans=0.125 2023-10-04 15:01:35,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:37,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:37,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:01:39,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.51 vs. limit=15.0 2023-10-04 15:01:42,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. 
Duration: 0.62 2023-10-04 15:01:42,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 15:01:42,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 15:01:42,148 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 15:01:43,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:01:48,391 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.80 vs. limit=15.0 2023-10-04 15:01:50,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:01:50,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:01:50,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 15:01:51,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:01:52,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:01:52,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:01:54,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 15:01:57,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:01:57,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:01:57,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:02:00,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:02:00,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:02:00,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:02:01,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 15:02:01,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:02:03,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 15:02:06,563 INFO [train.py:1046] (1/4) Epoch 48, batch 4700, loss[loss=0.194, simple_loss=0.2659, pruned_loss=0.06104, over 19388.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2339, pruned_loss=0.03556, over 4725786.64 frames. 
], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:02:07,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1695800.0, ans=0.1 2023-10-04 15:02:08,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.28 vs. limit=22.5 2023-10-04 15:02:13,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:14,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:02:15,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:02:15,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1695800.0, ans=0.09899494936611666 2023-10-04 15:02:17,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:02:18,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:02:24,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 15:02:24,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-10-04 15:02:25,310 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.030e+02 2.257e+02 2.587e+02 3.872e+02, threshold=4.514e+02, percent-clipped=0.0 2023-10-04 15:02:26,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:28,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:02:28,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:02:28,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1695866.6666666667, ans=0.2 2023-10-04 15:02:31,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:37,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:02:38,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 15:02:41,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:02:45,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 15:02:46,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 15:02:49,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:02:51,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 15:02:54,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:02:56,036 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.77 vs. limit=15.0 2023-10-04 15:02:59,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:03:00,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 15:03:02,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:02,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:05,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:03:05,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:03:05,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. 
Duration: 0.67 2023-10-04 15:03:05,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1696066.6666666667, ans=0.95 2023-10-04 15:03:07,045 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 15:03:07,660 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.49 vs. limit=22.5 2023-10-04 15:03:08,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:09,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:09,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:09,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 15:03:12,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:15,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 15:03:18,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:03:20,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:21,227 INFO [train.py:1046] (1/4) Epoch 48, batch 4750, loss[loss=0.1429, simple_loss=0.2285, pruned_loss=0.02861, over 24477.00 frames. 
], tot_loss[loss=0.1527, simple_loss=0.2345, pruned_loss=0.03551, over 4728740.56 frames. ], batch size: 63, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:03:25,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:25,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:03:25,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1696133.3333333333, ans=0.125 2023-10-04 15:03:26,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 15:03:26,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:03:30,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 15:03:32,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:03:33,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:33,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:03:35,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1696200.0, ans=0.0 2023-10-04 15:03:37,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.74 vs. 
limit=15.0 2023-10-04 15:03:38,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 15:03:40,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1696200.0, ans=0.0 2023-10-04 15:03:41,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:03:43,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 15:03:44,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:03:49,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:03:49,318 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:03:49,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:49,411 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 15:03:49,413 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 15:03:56,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 15:03:59,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:04:02,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:04,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:04:04,926 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 15:04:04,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:04:06,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:04:09,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:04:10,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 15:04:12,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 15:04:13,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:04:13,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:04:13,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:15,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-04 15:04:15,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 15:04:18,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 15:04:20,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:04:23,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1696400.0, ans=0.0 2023-10-04 15:04:24,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:04:24,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 15:04:25,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:04:27,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:04:28,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:04:29,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1696400.0, ans=0.125 2023-10-04 15:04:30,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:31,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 15:04:33,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:04:33,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 15:04:34,294 INFO [train.py:1046] (1/4) Epoch 48, batch 4800, loss[loss=0.1942, simple_loss=0.2682, pruned_loss=0.06013, over 19333.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2347, pruned_loss=0.03533, over 4734743.28 frames. ], batch size: 389, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:04:34,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 15:04:37,099 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 15:04:37,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.37 vs. limit=15.0 2023-10-04 15:04:38,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:04:38,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:04:40,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 15:04:41,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1696466.6666666667, ans=0.025 2023-10-04 15:04:44,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:04:44,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:04:46,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1696466.6666666667, ans=0.2 2023-10-04 15:04:49,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:04:49,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.84 vs. limit=12.0 2023-10-04 15:04:50,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:51,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:51,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 15:04:51,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:04:53,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:04:55,216 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.147e+02 2.492e+02 2.796e+02 5.306e+02, threshold=4.985e+02, percent-clipped=1.0 2023-10-04 15:04:55,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 15:04:59,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:01,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:01,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:05:01,962 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.58 vs. limit=22.5 2023-10-04 15:05:04,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:04,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 15:05:04,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:04,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:06,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:08,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:10,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:05:10,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:05:12,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 15:05:12,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:15,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 15:05:15,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 15:05:17,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:17,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:05:17,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:05:17,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:05:17,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:05:19,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1696666.6666666667, ans=0.125 2023-10-04 15:05:20,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 15:05:21,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:05:23,118 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:05:26,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:28,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:05:32,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 15:05:33,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:33,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:33,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:05:35,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:38,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:05:40,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:05:40,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 15:05:40,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:05:42,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:05:42,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:05:46,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:05:46,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:46,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:48,163 INFO [train.py:1046] (1/4) Epoch 48, batch 4850, loss[loss=0.1568, simple_loss=0.2423, pruned_loss=0.03568, over 24353.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2351, pruned_loss=0.03558, over 4735084.40 frames. ], batch size: 77, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:05:48,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 15:05:51,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 15:05:51,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:51,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:05:51,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:05:51,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:55,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:55,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1696800.0, ans=0.125 2023-10-04 15:05:55,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1696800.0, ans=0.0 2023-10-04 15:06:02,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 15:06:03,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:06:09,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:06:09,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 15:06:09,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1696866.6666666667, ans=0.125 2023-10-04 15:06:09,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1696866.6666666667, ans=0.1 2023-10-04 15:06:10,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:06:13,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:06:13,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:06:15,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:06:15,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 15:06:19,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:06:21,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:06:22,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:06:22,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 15:06:22,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 15:06:22,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1696933.3333333333, ans=0.125 2023-10-04 15:06:25,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:06:25,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:30,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:30,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 15:06:30,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 15:06:32,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:06:37,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:06:37,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 15:06:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:06:39,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 15:06:40,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:06:40,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 15:06:40,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:42,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 15:06:42,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1697000.0, ans=0.125 2023-10-04 15:06:44,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:06:45,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:06:45,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 15:06:54,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:58,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1697066.6666666667, ans=0.015 2023-10-04 15:07:00,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:07:00,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:07:03,391 INFO [train.py:1046] (1/4) Epoch 48, batch 4900, loss[loss=0.1441, simple_loss=0.223, pruned_loss=0.03265, over 22813.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2347, pruned_loss=0.03557, over 4736156.99 frames. ], batch size: 50, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:07:06,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 15:07:06,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:07:11,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:12,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:07:12,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:07:15,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 15:07:18,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.79 vs. limit=10.0 2023-10-04 15:07:20,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 15:07:24,147 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.053e+02 2.279e+02 2.584e+02 3.777e+02, threshold=4.559e+02, percent-clipped=0.0 2023-10-04 15:07:24,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. 
Duration: 0.93 2023-10-04 15:07:24,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 15:07:25,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:07:25,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:07:25,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:07:25,712 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:25,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:07:27,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 15:07:28,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 15:07:31,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:07:31,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:07:32,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:07:34,403 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.38 vs. 
limit=22.5 2023-10-04 15:07:35,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:07:35,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:36,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:36,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 15:07:37,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:07:39,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:39,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 15:07:39,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 15:07:44,311 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:07:45,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 15:07:48,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:07:48,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 15:07:49,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:07:49,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:50,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 15:07:50,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:07:51,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 15:07:53,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:56,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:07:57,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:07:58,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1697333.3333333333, ans=0.2 2023-10-04 15:08:00,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 15:08:01,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:08:01,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. 
Duration: 0.488875 2023-10-04 15:08:03,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 15:08:08,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:08:08,982 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.46 vs. limit=15.0 2023-10-04 15:08:09,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:08:10,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 15:08:10,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:08:10,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:08:12,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:08:18,229 INFO [train.py:1046] (1/4) Epoch 48, batch 4950, loss[loss=0.1287, simple_loss=0.1809, pruned_loss=0.03824, over 18861.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2326, pruned_loss=0.03545, over 4716839.94 frames. ], batch size: 389, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:08:18,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:08:18,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 15:08:18,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:08:18,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 15:08:18,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1697466.6666666667, ans=0.125 2023-10-04 15:08:21,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:08:24,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:08:24,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:08:27,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 15:08:27,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 15:08:27,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:08:28,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 15:08:28,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:28,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 15:08:28,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:08:30,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:32,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:08:32,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:08:35,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:08:36,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:08:38,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:39,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:08:42,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:08:43,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1697533.3333333333, ans=0.125 2023-10-04 15:08:46,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:46,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 15:08:48,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:49,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:49,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:08:51,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 15:08:53,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 15:08:53,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.86 vs. limit=6.0 2023-10-04 15:08:55,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:57,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:08:57,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:08:57,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:08:57,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:08:58,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:09:01,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:09:05,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:09:06,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:09:08,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:08,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:09,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 15:09:10,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:09:11,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:09:13,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1697666.6666666667, ans=0.125 2023-10-04 15:09:16,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:09:17,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 15:09:17,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:09:17,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:18,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:09:20,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:09:20,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:09:20,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1697733.3333333333, ans=0.2 2023-10-04 15:09:22,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:09:22,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:09:23,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 15:09:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:09:32,407 INFO [train.py:1046] (1/4) Epoch 48, batch 5000, loss[loss=0.1331, simple_loss=0.2157, pruned_loss=0.02523, over 24566.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2316, pruned_loss=0.03525, over 4707770.64 frames. 
], batch size: 60, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:09:32,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 15:09:32,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:09:38,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:09:39,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 15:09:41,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 15:09:45,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:09:46,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 15:09:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:09:46,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:09:47,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 15:09:49,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:09:49,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:09:51,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 15:09:51,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:09:51,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:09:52,450 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.112e+02 2.381e+02 2.670e+02 4.654e+02, threshold=4.763e+02, percent-clipped=1.0 2023-10-04 15:09:52,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 15:09:52,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 15:09:52,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:09:53,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 15:09:54,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:09:54,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:09:54,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1697866.6666666667, ans=0.0 2023-10-04 15:09:55,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:09:55,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 15:09:55,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 15:09:57,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1697866.6666666667, ans=0.125 2023-10-04 15:09:58,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 15:09:58,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:59,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:09:59,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 15:09:59,792 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:10:01,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:10:02,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:10:04,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 15:10:05,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 15:10:05,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:10:07,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:10:08,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1697933.3333333333, ans=0.2 2023-10-04 15:10:10,207 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 15:10:12,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:10:15,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:10:15,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:16,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1698000.0, ans=0.2 2023-10-04 15:10:17,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 15:10:17,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:10:18,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:10:19,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:10:21,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 15:10:23,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:10:24,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:10:26,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:10:30,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 15:10:34,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:41,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:10:43,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:43,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:10:44,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 15:10:44,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:10:44,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:10:45,701 INFO [train.py:1046] (1/4) Epoch 48, batch 5050, loss[loss=0.1584, simple_loss=0.2347, pruned_loss=0.04104, over 23443.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2328, pruned_loss=0.03563, over 4714396.56 frames. ], batch size: 285, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:10:45,780 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:49,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:49,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 15:10:50,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:10:52,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:10:53,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:10:53,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. 
Duration: 0.94 2023-10-04 15:10:53,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1698133.3333333333, ans=0.0 2023-10-04 15:10:55,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:10:55,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:10:55,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-10-04 15:10:57,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:10:59,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:10:59,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:11:04,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1698200.0, ans=0.0 2023-10-04 15:11:09,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 15:11:10,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:11:10,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:11:12,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-04 15:11:12,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:11:13,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:13,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:11:15,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:11:15,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 15:11:16,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 15:11:16,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:19,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:11:22,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:22,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 15:11:24,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:11:27,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. 
Duration: 0.99 2023-10-04 15:11:28,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:11:28,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:11:30,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:11:30,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:11:33,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:11:33,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1698333.3333333333, ans=0.125 2023-10-04 15:11:36,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:11:36,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:36,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:11:36,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:11:36,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 15:11:38,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 15:11:39,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:11:41,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.61 vs. limit=15.0 2023-10-04 15:11:43,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:11:43,861 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 15:11:43,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:11:45,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:11:46,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:46,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 15:11:49,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:11:49,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 15:11:49,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:54,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:11:54,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:54,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 15:11:55,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 15:11:57,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:11:57,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:11:59,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:12:00,422 INFO [train.py:1046] (1/4) Epoch 48, batch 5100, loss[loss=0.1549, simple_loss=0.2298, pruned_loss=0.04, over 23779.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2339, pruned_loss=0.03584, over 4727310.40 frames. ], batch size: 179, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:12:03,118 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 15:12:04,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:12:07,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-10-04 15:12:07,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1698466.6666666667, ans=0.0 2023-10-04 15:12:09,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 15:12:09,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:12:11,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:12:12,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:12:14,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 15:12:14,343 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 15:12:18,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:12:19,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:12:20,881 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.048e+02 2.253e+02 2.620e+02 3.978e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-04 15:12:22,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:12:25,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1698533.3333333333, ans=0.0 2023-10-04 15:12:26,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 15:12:28,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:12:28,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1698600.0, ans=0.125 2023-10-04 15:12:31,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:12:31,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 15:12:34,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:35,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:35,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 15:12:37,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 15:12:37,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:38,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. 
Duration: 0.67 2023-10-04 15:12:38,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 15:12:41,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:12:50,529 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:12:52,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 15:12:52,077 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 15:12:53,295 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 15:12:54,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 15:12:54,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:57,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 15:13:02,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 15:13:03,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-10-04 15:13:04,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1698733.3333333333, ans=0.125 2023-10-04 15:13:04,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1698733.3333333333, ans=0.0 2023-10-04 15:13:05,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:13:07,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 15:13:09,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:13:10,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 15:13:14,219 INFO [train.py:1046] (1/4) Epoch 48, batch 5150, loss[loss=0.1877, simple_loss=0.2546, pruned_loss=0.06038, over 19402.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2342, pruned_loss=0.0358, over 4722772.41 frames. ], batch size: 388, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:13:16,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:13:16,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:13:16,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:13:17,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 15:13:17,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:13:18,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:13:18,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 15:13:18,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 15:13:20,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 15:13:20,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:13:20,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 15:13:21,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:13:21,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 15:13:24,295 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:13:25,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:13:28,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-04 15:13:28,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 15:13:31,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:13:31,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1698866.6666666667, ans=0.125 2023-10-04 15:13:32,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:13:34,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:13:34,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:13:34,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:13:36,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:13:36,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:13:36,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 15:13:37,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:13:37,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 15:13:39,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:13:41,825 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 15:13:41,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:13:48,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:13:49,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 15:13:53,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:13:59,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:14:01,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:14:04,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:05,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:14:07,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 15:14:09,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:14:10,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:14:11,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:14:14,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:14,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:14:15,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 15:14:21,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:14:23,110 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:14:24,598 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:14:25,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:14:25,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:14:27,620 INFO [train.py:1046] (1/4) Epoch 48, batch 5200, loss[loss=0.1362, simple_loss=0.2163, pruned_loss=0.02807, over 20295.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2346, pruned_loss=0.03598, over 4730577.73 frames. 
], batch size: 44, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:14:27,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:14:27,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:14:27,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:14:30,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:14:31,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:14:36,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:14:36,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1699133.3333333333, ans=0.2 2023-10-04 15:14:37,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-10-04 15:14:39,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 15:14:39,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1699133.3333333333, ans=0.0 2023-10-04 15:14:41,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 15:14:41,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:14:42,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1699200.0, ans=0.125 2023-10-04 15:14:43,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:14:44,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:14:44,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:14:46,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 15:14:48,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:14:49,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.200e+02 2.450e+02 2.998e+02 4.674e+02, threshold=4.900e+02, percent-clipped=1.0 2023-10-04 15:14:49,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:51,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 15:14:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:14:55,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 15:14:56,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 15:14:56,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 15:14:59,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 15:15:00,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:15:00,013 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 15:15:00,019 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:15:01,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:02,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:15:02,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 15:15:02,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:15:07,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:15:08,030 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:15:09,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 15:15:10,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 15:15:10,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 15:15:10,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1699333.3333333333, ans=0.125 2023-10-04 15:15:13,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 15:15:14,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:15:21,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:15:21,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:22,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 15:15:22,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:15:24,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. 
Duration: 0.52225 2023-10-04 15:15:24,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:24,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:15:26,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:15:28,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:15:31,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:15:33,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:15:33,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:37,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:38,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 15:15:39,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:15:39,259 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:15:40,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:15:40,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:41,703 INFO [train.py:1046] (1/4) Epoch 48, batch 5250, loss[loss=0.1446, simple_loss=0.227, pruned_loss=0.03109, over 24420.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2337, pruned_loss=0.03601, over 4724446.37 frames. ], batch size: 58, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:15:41,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:15:41,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:15:44,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:15:45,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1699466.6666666667, ans=0.125 2023-10-04 15:15:46,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:15:48,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:15:49,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:15:54,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:55,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 15:15:59,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:16:00,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:16:04,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 15:16:05,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:16:07,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:16:41,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1699733.3333333333, ans=0.125 2023-10-04 15:16:50,663 INFO [train.py:1046] (1/4) Epoch 48, batch 5300, loss[loss=0.1589, simple_loss=0.2302, pruned_loss=0.04383, over 23867.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2323, pruned_loss=0.03583, over 4713864.09 frames. ], batch size: 195, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:17:04,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:17:04,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 15:17:04,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 15:17:04,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:17:05,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:05,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:05,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:05,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:05,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:05,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:17:05,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:17:05,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 15:17:05,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 15:17:05,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 15:17:06,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. 
Duration: 0.82225 2023-10-04 15:17:06,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 15:17:06,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 15:17:06,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:06,709 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:06,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:17:06,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:17:06,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:17:07,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:17:07,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:07,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:07,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:17:07,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 15:17:07,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:17:07,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:07,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:17:07,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 15:17:07,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:17:08,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:08,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 15:17:08,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 15:17:08,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:17:08,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:08,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 15:17:09,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. 
Duration: 0.71 2023-10-04 15:17:09,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:17:09,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:17:09,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:17:09,800 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 15:17:09,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 15:17:09,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:17:09,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:10,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 15:17:10,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 15:17:10,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 15:17:10,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:17:16,762 INFO [train.py:1046] (1/4) Epoch 49, batch 0, loss[loss=0.1445, simple_loss=0.2239, pruned_loss=0.03253, over 23408.00 frames. ], tot_loss[loss=0.1445, simple_loss=0.2239, pruned_loss=0.03253, over 23408.00 frames. 
], batch size: 134, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:17:16,762 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 15:17:29,911 INFO [train.py:1078] (1/4) Epoch 49, validation: loss=0.3215, simple_loss=0.2741, pruned_loss=0.1844, over 1125622.00 frames. 2023-10-04 15:17:29,912 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 15:17:32,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 15:17:32,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:17:33,995 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.045e+02 2.321e+02 2.638e+02 8.969e+02, threshold=4.643e+02, percent-clipped=2.0 2023-10-04 15:17:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:17:38,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:38,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:17:39,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:39,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 15:17:41,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-10-04 15:17:41,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1699880.0, ans=0.125 2023-10-04 15:17:42,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1699946.6666666667, ans=0.0 2023-10-04 15:17:43,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:43,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:46,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:46,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:48,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:48,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:48,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:49,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:17:49,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 15:17:51,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 15:17:54,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:18:02,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:18:02,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:18:05,661 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 15:18:06,429 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.06 vs. limit=15.0 2023-10-04 15:18:11,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:18:11,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:18:12,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:15,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:18:16,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.07 vs. 
limit=10.0 2023-10-04 15:18:19,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:24,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 15:18:27,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 15:18:28,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:18:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:28,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:18:30,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:18:32,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 15:18:34,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:35,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:38,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1700146.6666666667, ans=0.09899494936611666 2023-10-04 15:18:39,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. 
Duration: 0.72725 2023-10-04 15:18:42,451 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 15:18:42,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:18:43,770 INFO [train.py:1046] (1/4) Epoch 49, batch 50, loss[loss=0.1475, simple_loss=0.2243, pruned_loss=0.03533, over 23361.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.03624, over 1071685.85 frames. ], batch size: 119, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:18:45,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:18:47,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:18:47,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 15:18:49,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:18:50,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:18:51,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:18:53,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:18:55,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:18:58,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 15:18:58,258 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:05,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:19:06,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 15:19:07,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 15:19:09,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:19:09,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:19:09,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:09,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1700280.0, ans=0.0 2023-10-04 15:19:10,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:19:10,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 15:19:10,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1700280.0, ans=0.125 2023-10-04 15:19:12,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:19:12,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:18,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:19:20,298 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:19:21,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:19:22,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 15:19:25,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:19:26,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:19:27,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 15:19:27,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:19:29,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. 
Duration: 0.97 2023-10-04 15:19:38,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:19:38,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:19:39,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:19:41,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:19:41,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:19:42,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 15:19:44,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 15:19:44,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:19:45,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:19:46,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:19:48,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:19:48,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. 
Duration: 0.49 2023-10-04 15:19:48,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 15:19:49,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 15:19:51,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:19:51,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:19:51,407 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:19:53,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 15:19:53,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 15:19:54,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:19:54,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1700480.0, ans=0.125 2023-10-04 15:19:55,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:19:56,870 INFO [train.py:1046] (1/4) Epoch 49, batch 100, loss[loss=0.1516, simple_loss=0.2371, pruned_loss=0.03308, over 24343.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03686, over 1888680.37 frames. 
], batch size: 61, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:19:58,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:19:58,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:19:59,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1700546.6666666667, ans=0.1 2023-10-04 15:19:59,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1700546.6666666667, ans=0.125 2023-10-04 15:20:00,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:20:03,029 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.158e+02 2.509e+02 3.581e+02 6.857e+02, threshold=5.017e+02, percent-clipped=12.0 2023-10-04 15:20:03,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:20:04,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:20:06,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 15:20:06,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:20:06,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1700546.6666666667, ans=0.125 2023-10-04 15:20:09,288 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:20:09,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:20:10,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:20:10,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:20:10,557 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:20:11,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 15:20:14,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:20:14,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:14,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:20:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:20:17,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. 
Duration: 0.88 2023-10-04 15:20:18,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:20,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:20:20,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:20:22,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:20:25,729 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 15:20:27,056 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 15:20:27,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:20:27,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:20:28,464 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.65 vs. limit=22.5 2023-10-04 15:20:30,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:20:33,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:33,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 15:20:38,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:38,959 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 15:20:41,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 15:20:43,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:20:44,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:20:48,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:51,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:20:55,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:20:56,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:21:00,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:01,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:02,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:21:03,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:21:03,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 15:21:06,213 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 15:21:06,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:06,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:21:08,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:08,125 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:08,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 15:21:08,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:21:10,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:21:10,036 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:21:10,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:11,384 INFO [train.py:1046] (1/4) Epoch 49, batch 150, loss[loss=0.1629, simple_loss=0.2432, pruned_loss=0.04127, over 23209.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2354, pruned_loss=0.03632, over 2511618.68 frames. ], batch size: 105, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:21:11,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:12,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:21:12,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:21:15,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:18,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:21:18,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:18,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:19,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:21,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:21:22,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:21:23,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:27,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 15:21:27,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 15:21:27,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 15:21:31,913 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:21:31,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:21:33,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:21:33,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:33,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:34,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:36,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:21:38,594 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 15:21:41,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:45,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:21:51,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 15:21:54,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:21:54,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:54,890 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:21:57,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:21:59,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:22:00,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:22:01,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 15:22:01,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 15:22:07,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:07,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:09,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:22:09,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:22:09,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:13,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 15:22:15,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:22:15,963 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:22:19,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:22:19,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1701146.6666666667, ans=0.125 2023-10-04 15:22:20,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:22:21,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:22:21,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 15:22:21,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:22:22,012 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 15:22:24,645 INFO [train.py:1046] (1/4) Epoch 49, batch 200, loss[loss=0.1578, simple_loss=0.2382, pruned_loss=0.0387, over 23464.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2358, pruned_loss=0.03639, over 3000577.31 frames. ], batch size: 134, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:22:26,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:22:28,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:22:28,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:22:31,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.785e+02 2.091e+02 2.483e+02 2.820e+02 4.218e+02, threshold=4.965e+02, percent-clipped=0.0 2023-10-04 15:22:32,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 15:22:34,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:22:34,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:37,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 15:22:38,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:22:41,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:41,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:44,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:22:44,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:22:46,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:57,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1701346.6666666667, ans=0.125 2023-10-04 15:23:00,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1701346.6666666667, ans=0.2 2023-10-04 15:23:01,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:23:01,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:23:02,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:23:04,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:23:04,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 15:23:04,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:23:05,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.48 vs. limit=15.0 2023-10-04 15:23:07,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:07,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:23:08,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:23:08,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:23:10,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 15:23:10,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. 
Duration: 0.4454375 2023-10-04 15:23:10,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:13,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:23:19,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:23:27,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:28,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:23:35,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:36,647 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.43 vs. limit=10.0 2023-10-04 15:23:37,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 15:23:38,499 INFO [train.py:1046] (1/4) Epoch 49, batch 250, loss[loss=0.1445, simple_loss=0.2322, pruned_loss=0.02842, over 23377.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2367, pruned_loss=0.03716, over 3368210.74 frames. ], batch size: 93, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:23:38,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:38,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 15:23:38,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:23:40,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:23:41,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 15:23:42,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:23:42,804 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 15:23:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:46,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:23:47,604 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:47,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:49,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:23:50,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:52,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:23:52,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1701613.3333333333, ans=0.125 2023-10-04 15:23:55,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:24:03,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:24:08,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:24:08,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:24:11,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1701680.0, ans=0.125 2023-10-04 15:24:14,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:24:16,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:24:16,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:24:16,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:24:17,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-10-04 15:24:17,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:24:17,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:24:19,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1701680.0, ans=0.07 2023-10-04 15:24:21,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:24:23,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 15:24:23,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:24:26,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:24:26,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:24:26,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:24:27,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:24:27,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:24:27,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 15:24:29,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:24:32,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:24:32,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:24:37,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:24:40,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:24:40,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1701813.3333333333, ans=0.125 2023-10-04 15:24:43,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:24:47,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:24:49,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:24:53,842 INFO [train.py:1046] (1/4) Epoch 49, batch 300, loss[loss=0.1499, simple_loss=0.2233, pruned_loss=0.03821, over 23825.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2338, pruned_loss=0.0367, over 3660493.96 frames. ], batch size: 164, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:24:53,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. 
Duration: 0.91 2023-10-04 15:24:54,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:24:54,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:24:58,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 15:24:58,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:24:59,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:24:59,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 15:25:00,860 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.248e+02 2.762e+02 3.176e+02 5.077e+02, threshold=5.525e+02, percent-clipped=1.0 2023-10-04 15:25:02,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.43 vs. limit=15.0 2023-10-04 15:25:03,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:25:03,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:25:09,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:25:10,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. 
Duration: 0.99 2023-10-04 15:25:10,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:25:11,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:25:11,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 15:25:11,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:25:16,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:25:20,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:25:20,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 15:25:24,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 15:25:25,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:26,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:25:29,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:29,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. 
Duration: 0.66 2023-10-04 15:25:29,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:25:30,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:25:32,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:25:33,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:25:37,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:25:37,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 15:25:39,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:25:41,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:43,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 15:25:43,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:25:47,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:25:50,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:25:50,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 15:25:55,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:55,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:25:55,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1702146.6666666667, ans=0.125 2023-10-04 15:25:56,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:59,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:25:59,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 15:25:59,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:26:00,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:01,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 15:26:02,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:26:02,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:26:03,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:26:03,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:03,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:08,423 INFO [train.py:1046] (1/4) Epoch 49, batch 350, loss[loss=0.1564, simple_loss=0.2456, pruned_loss=0.03364, over 24393.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.231, pruned_loss=0.03591, over 3866987.31 frames. ], batch size: 77, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:26:09,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:26:09,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 15:26:12,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:12,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1702213.3333333333, ans=0.0 2023-10-04 15:26:17,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:26:21,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:21,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:26:21,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1702213.3333333333, ans=0.125 2023-10-04 15:26:22,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 15:26:22,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1702280.0, ans=0.0 2023-10-04 15:26:23,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:26:25,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 15:26:26,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:28,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 15:26:28,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:32,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 15:26:33,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:26:35,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:37,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 15:26:37,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:26:37,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:26:38,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:26:38,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:38,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:26:41,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:26:41,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:46,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=12.0 2023-10-04 15:26:48,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:26:48,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:26:50,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 15:26:50,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:55,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 15:26:55,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:55,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1702413.3333333333, ans=0.125 2023-10-04 15:26:59,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:59,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:00,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:27:01,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 15:27:04,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:06,168 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 15:27:08,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 15:27:08,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:27:10,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:27:10,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 15:27:12,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:15,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:27:15,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:17,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:17,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:18,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:18,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1702480.0, ans=0.1 2023-10-04 15:27:23,452 INFO [train.py:1046] (1/4) Epoch 49, batch 400, loss[loss=0.1521, simple_loss=0.2307, pruned_loss=0.03673, over 23834.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2307, pruned_loss=0.03593, over 4054898.23 frames. ], batch size: 212, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:27:23,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 15:27:26,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:27:27,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 15:27:27,796 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:27,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:29,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:27:30,603 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.007e+02 2.289e+02 2.741e+02 4.265e+02, threshold=4.578e+02, percent-clipped=0.0 2023-10-04 15:27:30,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:32,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:33,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:34,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 15:27:36,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 15:27:36,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 15:27:38,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 15:27:38,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:40,178 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.11 vs. limit=15.0 2023-10-04 15:27:40,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:27:40,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:27:42,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 15:27:42,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:27:42,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:42,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:27:42,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:46,962 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 15:27:47,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. 
Duration: 0.94 2023-10-04 15:27:51,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:52,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1702680.0, ans=0.0 2023-10-04 15:27:54,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:55,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 15:27:56,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 15:27:59,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:27:59,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1702680.0, ans=0.125 2023-10-04 15:28:02,008 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:07,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 15:28:10,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:28:12,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. 
Duration: 0.87 2023-10-04 15:28:12,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1702746.6666666667, ans=0.125 2023-10-04 15:28:15,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:28:16,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:28:17,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 15:28:20,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:28:23,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:28:23,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:28:26,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:26,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 15:28:28,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:28:29,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 15:28:30,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 15:28:30,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:28:32,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 15:28:34,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:28:35,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:28:36,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:28:38,232 INFO [train.py:1046] (1/4) Epoch 49, batch 450, loss[loss=0.1611, simple_loss=0.2428, pruned_loss=0.03966, over 23235.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2324, pruned_loss=0.03667, over 4192068.52 frames. ], batch size: 105, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:28:38,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 15:28:38,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:28:39,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:28:39,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:28:39,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. 
Duration: 0.99 2023-10-04 15:28:41,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:28:41,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:28:43,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:28:54,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:54,837 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:28:56,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 15:28:58,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 15:28:59,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:29:02,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:29:03,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:05,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff3.min_abs, batch_count=1702946.6666666667, ans=0.2 2023-10-04 15:29:07,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:29:08,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:29:10,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 15:29:10,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 15:29:12,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 15:29:12,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:29:14,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:14,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:29:17,508 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 15:29:17,517 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 15:29:17,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:29:19,474 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.34 vs. limit=10.0 2023-10-04 15:29:20,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 15:29:20,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 15:29:23,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:29:24,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1703080.0, ans=0.1 2023-10-04 15:29:25,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:29:25,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 15:29:26,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 15:29:29,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:29:31,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:29:32,339 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:29:33,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 15:29:33,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1703080.0, ans=0.1 2023-10-04 15:29:37,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 15:29:37,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 15:29:39,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 15:29:41,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:29:45,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:29:46,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:29:48,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:29:49,410 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 15:29:52,632 INFO [train.py:1046] (1/4) Epoch 49, batch 500, loss[loss=0.1509, simple_loss=0.2332, pruned_loss=0.03434, over 23149.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2339, pruned_loss=0.03711, over 4296394.77 frames. ], batch size: 105, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:29:54,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:54,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:29:54,756 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.15 vs. 
limit=15.0 2023-10-04 15:29:56,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:29:56,118 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 15:29:58,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 15:29:58,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:30:00,588 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.999e+02 2.217e+02 2.687e+02 4.030e+02, threshold=4.434e+02, percent-clipped=0.0 2023-10-04 15:30:02,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:30:05,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:30:06,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:30:09,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:30:09,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:30:09,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:09,883 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.57 vs. 
limit=22.5 2023-10-04 15:30:19,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:19,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:30:19,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:30:19,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:20,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 15:30:20,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:30:22,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1703346.6666666667, ans=0.0 2023-10-04 15:30:23,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:30:23,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:30:23,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:30:25,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:27,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-10-04 15:30:30,605 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 15:30:33,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:33,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:33,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1703346.6666666667, ans=0.0 2023-10-04 15:30:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:34,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:36,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:30:37,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 15:30:42,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:30:42,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:30:42,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1703413.3333333333, ans=0.0 2023-10-04 15:30:45,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:30:47,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:49,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1703413.3333333333, ans=0.1 2023-10-04 15:30:53,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:55,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1703480.0, ans=0.2 2023-10-04 15:30:56,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 15:30:56,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:30:56,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:31:00,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 15:31:01,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:31:01,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:31:02,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1703480.0, ans=0.125 2023-10-04 15:31:07,458 INFO [train.py:1046] (1/4) Epoch 49, batch 550, loss[loss=0.1603, simple_loss=0.2373, pruned_loss=0.04167, over 23721.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2352, pruned_loss=0.03739, over 4379637.61 frames. ], batch size: 212, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:31:07,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 15:31:09,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 15:31:09,652 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:09,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 15:31:10,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:31:10,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:12,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:12,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 15:31:13,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:31:14,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:31:15,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.75 vs. limit=15.0 2023-10-04 15:31:16,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 15:31:16,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:31:20,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:20,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:23,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:31:24,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:24,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1703613.3333333333, ans=0.125 2023-10-04 15:31:26,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.00 vs. 
limit=15.0 2023-10-04 15:31:27,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 15:31:29,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 15:31:31,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1703613.3333333333, ans=0.0 2023-10-04 15:31:32,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:31:36,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:31:36,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:31:38,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:31:42,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:42,793 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 15:31:42,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 15:31:46,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 15:31:48,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:31:48,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:31:49,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 15:31:52,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 15:31:53,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:31:53,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:31:53,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:31:53,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:57,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:31:57,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:32:01,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:32:02,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:02,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 15:32:04,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:32:06,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:07,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:32:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:09,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:32:09,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 15:32:16,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 15:32:16,916 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.77 vs. limit=22.5 2023-10-04 15:32:20,231 INFO [train.py:1046] (1/4) Epoch 49, batch 600, loss[loss=0.1441, simple_loss=0.2235, pruned_loss=0.03232, over 17177.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.03744, over 4444759.37 frames. 
], batch size: 37, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:32:20,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 15:32:21,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:32:21,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:32:21,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:24,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1703880.0, ans=0.05 2023-10-04 15:32:27,154 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 1.961e+02 2.215e+02 2.451e+02 3.698e+02, threshold=4.430e+02, percent-clipped=0.0 2023-10-04 15:32:27,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:32:29,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:32:31,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 15:32:33,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:32:37,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 15:32:38,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:41,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 15:32:41,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:32:43,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1703946.6666666667, ans=0.125 2023-10-04 15:32:47,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 15:32:49,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:32:49,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:49,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:32:50,937 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:32:54,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:32:54,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:32:55,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.82 vs. 
limit=15.0 2023-10-04 15:32:56,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:04,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1704080.0, ans=0.04949747468305833 2023-10-04 15:33:05,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:33:05,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1704080.0, ans=0.0 2023-10-04 15:33:09,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:09,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:33:09,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:33:09,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1704080.0, ans=0.125 2023-10-04 15:33:13,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1704080.0, ans=0.0 2023-10-04 15:33:16,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-10-04 15:33:18,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1704146.6666666667, ans=0.125 2023-10-04 15:33:22,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:33:22,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:33:24,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 15:33:24,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1704146.6666666667, ans=0.0 2023-10-04 15:33:25,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:33:28,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 15:33:29,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:33:29,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:33:35,027 INFO [train.py:1046] (1/4) Epoch 49, batch 650, loss[loss=0.1606, simple_loss=0.2521, pruned_loss=0.03456, over 24541.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2352, pruned_loss=0.03701, over 4523886.24 frames. ], batch size: 71, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:33:36,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-04 15:33:36,584 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:33:40,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:33:41,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:33:44,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:33:45,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 15:33:47,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:51,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:33:51,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:33:54,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:33:58,383 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 15:33:59,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:33:59,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:34:03,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:34:04,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 15:34:06,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:06,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:07,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:34:08,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:10,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:34:12,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:34:12,890 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 15:34:14,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:14,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:34:14,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1704346.6666666667, ans=0.09899494936611666 2023-10-04 15:34:16,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:17,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:34:18,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:18,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:34:18,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 15:34:19,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:34:19,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:34:21,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:34:21,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:34:22,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:34:25,457 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. 
Duration: 0.83 2023-10-04 15:34:26,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 15:34:26,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:26,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:34:26,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:34:26,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1704413.3333333333, ans=0.0 2023-10-04 15:34:28,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:34:29,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:34:36,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:37,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:34:39,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:41,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:41,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-10-04 15:34:41,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:46,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1704480.0, ans=0.125 2023-10-04 15:34:47,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:34:47,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:34:47,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:34:47,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:34:47,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1704480.0, ans=0.125 2023-10-04 15:34:49,852 INFO [train.py:1046] (1/4) Epoch 49, batch 700, loss[loss=0.1555, simple_loss=0.2362, pruned_loss=0.03735, over 24648.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2332, pruned_loss=0.03688, over 4565984.99 frames. ], batch size: 65, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:34:52,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 15:34:52,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 15:34:55,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. 
Duration: 0.54 2023-10-04 15:34:55,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:56,641 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.097e+02 2.386e+02 2.709e+02 4.404e+02, threshold=4.772e+02, percent-clipped=0.0 2023-10-04 15:34:56,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:34:59,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 15:35:04,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:35:07,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:35:08,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:35:10,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:35:10,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:35:14,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:35:16,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 15:35:16,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 15:35:19,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 15:35:20,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 15:35:24,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:35:24,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:35:25,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:35:29,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:35:31,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 15:35:36,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:35:36,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:35:36,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 15:35:40,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:35:42,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:35:45,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:35:47,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1704813.3333333333, ans=0.125 2023-10-04 15:35:51,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:35:51,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 15:35:51,904 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.60 vs. limit=15.0 2023-10-04 15:35:53,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 15:35:54,841 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.43 vs. limit=15.0 2023-10-04 15:35:55,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 15:35:56,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:35:59,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:35:59,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:36:02,144 INFO [train.py:1046] (1/4) Epoch 49, batch 750, loss[loss=0.1714, simple_loss=0.2442, pruned_loss=0.0493, over 23902.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2331, pruned_loss=0.03662, over 4602349.09 frames. ], batch size: 179, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:36:02,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:02,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 15:36:02,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1704880.0, ans=0.125 2023-10-04 15:36:04,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 15:36:05,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 15:36:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 15:36:07,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 15:36:07,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 15:36:08,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:36:09,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. 
Duration: 0.89 2023-10-04 15:36:09,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:11,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:36:13,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:15,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:36:16,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:36:16,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:36:19,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:36:19,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:36:20,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:36:22,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:22,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:36:23,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-10-04 15:36:24,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:36:24,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:36:27,503 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:36:27,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:36:28,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 15:36:28,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:36:32,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 15:36:32,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 15:36:34,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 15:36:34,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:36:34,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:36:35,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 15:36:39,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1705013.3333333333, ans=0.125 2023-10-04 15:36:41,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:36:41,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:36:41,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:36:45,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:47,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:36:47,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 15:36:48,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:36:48,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 15:36:49,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:36:52,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:36:52,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. 
Duration: 0.92 2023-10-04 15:36:53,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:36:58,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:59,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:36:59,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:02,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:37:07,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 15:37:07,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:37:08,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:10,183 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:11,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:13,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:37:13,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 15:37:16,993 INFO [train.py:1046] (1/4) Epoch 49, batch 800, loss[loss=0.1592, simple_loss=0.2377, pruned_loss=0.04031, over 23433.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2334, pruned_loss=0.03675, over 4627737.56 frames. ], batch size: 285, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:37:21,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:37:21,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:21,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1705213.3333333333, ans=0.125 2023-10-04 15:37:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:37:22,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:24,005 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.062e+02 2.313e+02 2.836e+02 4.157e+02, threshold=4.626e+02, percent-clipped=0.0 2023-10-04 15:37:24,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:24,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:25,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:37:29,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:37:32,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 15:37:34,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:36,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:36,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:37:37,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:37:37,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 15:37:37,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:37,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 15:37:42,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:48,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 15:37:48,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:37:51,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:51,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:55,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:37:56,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:37:56,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 15:37:59,296 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 15:37:59,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 15:37:59,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:37:59,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:00,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:01,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:38:06,629 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 15:38:06,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 15:38:07,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-10-04 15:38:08,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:38:10,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:38:14,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1705480.0, ans=0.125 2023-10-04 15:38:16,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:38:19,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:38:19,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 15:38:21,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:38:23,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 15:38:28,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. 
Duration: 0.7666875 2023-10-04 15:38:30,717 INFO [train.py:1046] (1/4) Epoch 49, batch 850, loss[loss=0.159, simple_loss=0.2485, pruned_loss=0.03474, over 24688.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2342, pruned_loss=0.03671, over 4657781.64 frames. ], batch size: 73, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:38:30,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:38:30,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 15:38:30,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:38:32,250 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:32,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 15:38:32,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:34,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:38:36,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:38:38,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:38:39,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:38:40,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.83 vs. limit=22.5 2023-10-04 15:38:41,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 15:38:41,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 15:38:41,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 15:38:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:38:45,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:38:47,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:38:47,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:48,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:38:53,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:54,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:38:54,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. 
Duration: 0.87 2023-10-04 15:38:56,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 15:39:00,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:39:02,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 15:39:04,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1705680.0, ans=0.0 2023-10-04 15:39:06,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 15:39:07,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 15:39:09,086 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 15:39:09,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:39:09,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:39:09,122 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 15:39:12,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:13,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:39:13,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 15:39:17,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:39:17,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:39:18,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:39:18,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:39:19,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1705746.6666666667, ans=0.125 2023-10-04 15:39:20,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:39:20,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:39:21,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 15:39:21,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1705746.6666666667, ans=0.125 2023-10-04 15:39:25,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:39:25,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 15:39:26,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:39:26,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:39:27,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:39:29,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:32,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:39:33,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:39:34,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:39:34,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:39:42,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:39:44,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:39:45,386 INFO [train.py:1046] (1/4) Epoch 49, batch 900, loss[loss=0.1536, simple_loss=0.2365, pruned_loss=0.03533, over 23275.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2344, pruned_loss=0.03683, over 4662112.66 frames. 
], batch size: 93, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:39:45,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 15:39:45,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:39:46,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:39:48,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 15:39:53,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:39:55,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.096e+02 2.361e+02 2.721e+02 4.850e+02, threshold=4.722e+02, percent-clipped=1.0 2023-10-04 15:39:56,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:39:57,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 15:40:00,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:40:00,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 15:40:02,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. 
Duration: 0.463625 2023-10-04 15:40:03,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:40:03,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:03,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:40:03,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:40:13,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:13,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:40:13,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:40:16,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:20,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 15:40:21,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 15:40:25,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1706013.3333333333, ans=0.0 2023-10-04 15:40:25,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1706013.3333333333, ans=0.125 2023-10-04 15:40:26,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:40:26,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:40:26,534 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 15:40:28,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 15:40:31,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:40:31,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:40:32,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:40:38,033 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:38,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 15:40:40,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1706080.0, ans=0.125 2023-10-04 15:40:43,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 15:40:43,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:45,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 15:40:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:40:46,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:46,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:40:46,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:40:51,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 15:40:51,059 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 15:40:52,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 15:40:52,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. 
Duration: 0.64 2023-10-04 15:40:56,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:59,276 INFO [train.py:1046] (1/4) Epoch 49, batch 950, loss[loss=0.1642, simple_loss=0.2466, pruned_loss=0.04096, over 23701.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2343, pruned_loss=0.03711, over 4668308.88 frames. ], batch size: 85, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:41:00,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 15:41:05,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:05,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1706213.3333333333, ans=0.2 2023-10-04 15:41:06,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:07,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:07,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:41:11,198 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 15:41:14,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:14,538 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 15:41:15,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:15,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:41:15,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 15:41:16,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1706280.0, ans=0.125 2023-10-04 15:41:17,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1706280.0, ans=0.05 2023-10-04 15:41:18,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:41:19,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:22,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 15:41:22,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:41:25,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:25,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:41:25,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:41:28,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 15:41:31,580 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:41:31,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1706346.6666666667, ans=0.0 2023-10-04 15:41:32,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.93 vs. limit=22.5 2023-10-04 15:41:33,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:41:34,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:41:39,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:41:39,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:40,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1706346.6666666667, ans=0.125 2023-10-04 15:41:43,306 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 15:41:45,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 15:41:45,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 15:41:45,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:41:47,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:47,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:41:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 15:41:51,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:41:54,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1706413.3333333333, ans=0.07 2023-10-04 15:41:55,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:41:55,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:55,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 15:41:55,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:55,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1706413.3333333333, ans=0.0 2023-10-04 15:41:56,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 15:41:56,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 15:41:59,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1706480.0, ans=0.0 2023-10-04 15:42:00,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:42:02,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1706480.0, ans=0.0 2023-10-04 15:42:04,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:42:08,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:42:08,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 15:42:08,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 15:42:08,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1706480.0, ans=0.125 2023-10-04 15:42:10,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1706480.0, ans=0.1 2023-10-04 15:42:12,430 INFO [train.py:1046] (1/4) Epoch 49, batch 1000, loss[loss=0.1343, simple_loss=0.2189, pruned_loss=0.02489, over 24624.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03682, over 4673676.96 frames. 
], batch size: 60, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:42:12,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:42:16,373 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 15:42:17,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:21,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:42:23,145 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.037e+02 2.263e+02 2.669e+02 4.122e+02, threshold=4.525e+02, percent-clipped=0.0 2023-10-04 15:42:23,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 15:42:23,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 15:42:26,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:27,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:42:28,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:31,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. 
limit=6.0 2023-10-04 15:42:31,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 15:42:34,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 15:42:36,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 15:42:37,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:42:41,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 15:42:42,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 15:42:42,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 15:42:44,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:46,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:53,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:54,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1706680.0, ans=0.025 2023-10-04 15:42:55,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:42:56,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:56,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:56,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 15:42:56,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:42:58,119 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:42:58,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:59,526 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 15:43:03,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 15:43:04,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 15:43:06,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 15:43:08,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:43:15,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 15:43:15,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:43:16,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:17,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:43:17,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.71 vs. limit=15.0 2023-10-04 15:43:19,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 15:43:20,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:43:20,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 15:43:21,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 15:43:21,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:43:21,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:43:24,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:43:27,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 15:43:28,542 INFO [train.py:1046] (1/4) Epoch 49, batch 1050, loss[loss=0.1357, simple_loss=0.2093, pruned_loss=0.03107, over 23596.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2319, pruned_loss=0.03613, over 4686895.83 frames. ], batch size: 256, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:43:28,624 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:43:29,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=12.0 2023-10-04 15:43:31,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:43:32,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:43:35,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:43:35,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:38,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:43:41,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:43:43,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:43:45,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:43:45,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:43:46,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:43:46,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:43:46,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 15:43:48,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:43:48,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 15:43:51,325 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:43:51,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 15:43:51,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:43:58,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:59,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:43:59,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:44:02,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 15:44:02,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 15:44:02,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:44:02,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1707013.3333333333, ans=0.2 2023-10-04 15:44:06,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 15:44:09,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1707013.3333333333, ans=0.125 2023-10-04 15:44:10,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 15:44:10,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:14,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 15:44:16,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 15:44:16,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:44:16,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 15:44:16,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1707080.0, ans=0.125 2023-10-04 15:44:21,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:44:23,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1707080.0, ans=0.04949747468305833 2023-10-04 15:44:25,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 15:44:25,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 15:44:25,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1707080.0, ans=0.1 2023-10-04 15:44:26,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 15:44:27,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:44:27,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:44:29,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 15:44:30,691 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.79 vs. limit=15.0 2023-10-04 15:44:32,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 15:44:34,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:44:34,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:44:34,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1707146.6666666667, ans=0.0 2023-10-04 15:44:35,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:44:35,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:39,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:39,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 15:44:41,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:44:41,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 15:44:41,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 15:44:42,565 INFO [train.py:1046] (1/4) Epoch 49, batch 1100, loss[loss=0.1566, simple_loss=0.2334, pruned_loss=0.03986, over 23589.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2317, pruned_loss=0.03612, over 4681617.76 frames. 
], batch size: 256, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:44:42,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:44:44,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:44:49,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:44:53,340 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.768e+02 2.066e+02 2.358e+02 2.772e+02 6.166e+02, threshold=4.716e+02, percent-clipped=1.0 2023-10-04 15:44:53,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:44:54,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:44:54,853 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:44:56,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 15:44:57,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:44:59,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:44:59,708 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.74 vs. 
limit=22.5 2023-10-04 15:45:03,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:45:05,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:45:07,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 15:45:07,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 15:45:08,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1707280.0, ans=0.125 2023-10-04 15:45:09,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:45:09,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:45:11,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:45:13,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:45:16,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1707346.6666666667, ans=0.0 2023-10-04 15:45:19,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:45:23,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. 
Duration: 0.98 2023-10-04 15:45:24,509 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 15:45:24,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:26,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:27,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:45:28,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:45:30,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 15:45:30,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:45:30,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:45:30,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:45:31,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:31,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 15:45:35,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 15:45:35,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 15:45:39,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:45:44,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:45:47,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 15:45:47,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:45:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:45:50,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:45:52,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 15:45:53,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:45:53,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:45:53,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-10-04 15:45:55,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:45:55,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 15:45:55,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1707546.6666666667, ans=0.1 2023-10-04 15:45:56,832 INFO [train.py:1046] (1/4) Epoch 49, batch 1150, loss[loss=0.1766, simple_loss=0.241, pruned_loss=0.05606, over 19509.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2323, pruned_loss=0.03607, over 4697032.67 frames. ], batch size: 388, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:45:56,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:45:56,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:45:58,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:46:03,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:05,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:46:06,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:46:08,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:46:08,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 15:46:08,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:46:10,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 15:46:12,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:12,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:46:18,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 15:46:20,121 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:46:23,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:24,595 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:24,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 15:46:26,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:46:26,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:46:29,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 15:46:30,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:46:32,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:46:32,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1707680.0, ans=0.125 2023-10-04 15:46:40,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:48,013 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:48,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 15:46:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:46:49,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:46:54,988 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 15:46:57,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:02,599 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-10-04 15:47:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:06,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.72 vs. limit=6.0 2023-10-04 15:47:06,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:47:06,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:47:08,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:47:09,613 INFO [train.py:1046] (1/4) Epoch 49, batch 1200, loss[loss=0.1596, simple_loss=0.2334, pruned_loss=0.04294, over 22760.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2331, pruned_loss=0.03601, over 4713558.37 frames. ], batch size: 322, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:47:11,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:47:17,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:47:17,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:47:19,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:47:19,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 15:47:19,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:47:20,502 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.013e+02 2.245e+02 2.650e+02 4.852e+02, threshold=4.489e+02, percent-clipped=1.0 2023-10-04 15:47:20,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:47:21,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.66 vs. limit=15.0 2023-10-04 15:47:22,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:47:22,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1707880.0, ans=0.1 2023-10-04 15:47:23,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:47:23,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:25,615 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 15:47:26,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1707946.6666666667, ans=0.125 2023-10-04 15:47:28,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. 
Duration: 0.51 2023-10-04 15:47:28,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1707946.6666666667, ans=0.2 2023-10-04 15:47:30,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1707946.6666666667, ans=0.1 2023-10-04 15:47:32,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:47:35,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:47:38,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:47:39,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:47:39,653 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 15:47:39,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:41,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=22.5 2023-10-04 15:47:44,578 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.22 vs. limit=15.0 2023-10-04 15:47:46,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.59 vs. 
limit=15.0 2023-10-04 15:47:48,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:47:48,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:47:48,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 15:47:49,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:47:53,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 15:47:56,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 15:47:56,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:58,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:59,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:01,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:48:01,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:48:02,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 15:48:02,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:48:02,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 15:48:02,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1708080.0, ans=0.0 2023-10-04 15:48:02,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1708080.0, ans=0.125 2023-10-04 15:48:03,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:48:05,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:48:05,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 15:48:08,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:48:08,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:11,184 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:48:13,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:48:15,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 15:48:16,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 15:48:20,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1708146.6666666667, ans=0.125 2023-10-04 15:48:22,512 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 15:48:23,777 INFO [train.py:1046] (1/4) Epoch 49, batch 1250, loss[loss=0.169, simple_loss=0.2449, pruned_loss=0.04655, over 23590.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03626, over 4717445.94 frames. ], batch size: 256, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:48:23,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:48:25,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:48:28,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:48:28,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1708213.3333333333, ans=0.05 2023-10-04 15:48:31,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:48:33,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 15:48:37,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 15:48:38,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:48:39,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 15:48:40,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:48:40,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1708280.0, ans=0.2 2023-10-04 15:48:41,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:48:44,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:48:44,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:48:46,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:48:46,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:48:47,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1708280.0, ans=0.07 2023-10-04 15:48:49,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:48:51,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 15:48:52,012 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:48:53,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:48:53,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:48:54,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:48:57,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:59,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:49:00,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1708346.6666666667, ans=0.125 2023-10-04 15:49:04,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 15:49:04,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 15:49:04,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1708346.6666666667, ans=0.125 2023-10-04 15:49:05,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1708346.6666666667, ans=0.125 2023-10-04 15:49:07,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:49:07,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 15:49:09,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:49:09,164 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 15:49:09,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:10,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:13,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:49:15,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:49:17,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:49:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-10-04 15:49:18,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 15:49:18,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 15:49:22,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:49:22,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1708480.0, ans=0.2 2023-10-04 15:49:23,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 15:49:23,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:25,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 15:49:25,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:49:28,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.96 vs. limit=10.0 2023-10-04 15:49:29,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 15:49:29,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:49:29,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 15:49:29,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 15:49:30,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:49:32,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.61 vs. limit=10.0 2023-10-04 15:49:32,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 15:49:35,537 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:49:36,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:49:37,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:49:38,256 INFO [train.py:1046] (1/4) Epoch 49, batch 1300, loss[loss=0.1455, simple_loss=0.2285, pruned_loss=0.03129, over 23512.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03621, over 4723212.99 frames. ], batch size: 120, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:49:41,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:49:42,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:49:43,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-10-04 15:49:45,703 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.62 vs. limit=12.0 2023-10-04 15:49:47,690 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.055e+02 2.239e+02 2.568e+02 3.936e+02, threshold=4.477e+02, percent-clipped=0.0 2023-10-04 15:49:49,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:49:49,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:49:49,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1708546.6666666667, ans=0.125 2023-10-04 15:49:51,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:49:51,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:52,660 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:49:53,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 15:49:54,439 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.02 vs. limit=15.0 2023-10-04 15:49:58,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 15:49:59,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:50:01,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 15:50:04,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:50:07,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:08,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:50:10,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:50:10,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:10,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1708680.0, ans=0.09899494936611666 2023-10-04 15:50:11,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:50:11,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:50:13,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-10-04 15:50:13,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1708680.0, ans=0.125 2023-10-04 15:50:17,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:50:17,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:50:19,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 15:50:19,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:50:22,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:50:23,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:50:24,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 15:50:26,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:50:26,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 15:50:27,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:50:29,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1708746.6666666667, ans=0.0 2023-10-04 15:50:31,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:50:31,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:50:34,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 15:50:35,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 15:50:36,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 15:50:40,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:50:43,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 15:50:45,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:51,100 INFO [train.py:1046] (1/4) Epoch 49, batch 1350, loss[loss=0.1588, simple_loss=0.2455, pruned_loss=0.03606, over 24043.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2334, pruned_loss=0.03601, over 4714936.28 frames. ], batch size: 80, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:50:52,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. 
Duration: 0.91 2023-10-04 15:50:55,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:50:57,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:00,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:51:00,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:51:03,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:51:03,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:51:08,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:51:09,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 15:51:10,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:51:11,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.52 vs. limit=15.0 2023-10-04 15:51:12,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 15:51:13,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1708946.6666666667, ans=0.125 2023-10-04 15:51:15,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 15:51:15,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:51:16,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:51:16,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 15:51:19,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 15:51:21,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 15:51:21,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:21,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 15:51:31,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 15:51:36,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1709080.0, ans=0.1 2023-10-04 15:51:37,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1709080.0, ans=0.125 2023-10-04 15:51:41,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:43,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:51:43,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 15:51:44,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:51:44,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 15:51:46,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:51:46,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:51:47,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:51:47,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1709080.0, ans=0.1 2023-10-04 15:51:50,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-10-04 15:51:51,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1709146.6666666667, ans=0.125 2023-10-04 15:51:52,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:51:58,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 15:51:59,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 15:51:59,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1709146.6666666667, ans=0.125 2023-10-04 15:52:03,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1709213.3333333333, ans=0.125 2023-10-04 15:52:04,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1709213.3333333333, ans=0.125 2023-10-04 15:52:05,608 INFO [train.py:1046] (1/4) Epoch 49, batch 1400, loss[loss=0.1327, simple_loss=0.2177, pruned_loss=0.02381, over 20427.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2328, pruned_loss=0.03598, over 4708957.17 frames. ], batch size: 44, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:52:07,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 15:52:08,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:52:11,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 15:52:12,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:52:14,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1709213.3333333333, ans=0.2 2023-10-04 15:52:16,623 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.734e+02 2.032e+02 2.354e+02 2.703e+02 4.112e+02, threshold=4.708e+02, percent-clipped=0.0 2023-10-04 15:52:18,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 15:52:19,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 15:52:24,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1709280.0, ans=0.5 2023-10-04 15:52:27,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1709280.0, ans=0.125 2023-10-04 15:52:30,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:52:30,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:52:33,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:52:33,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-10-04 15:52:36,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:52:38,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 15:52:46,570 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:52:46,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:52:49,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 15:52:49,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1709413.3333333333, ans=0.0 2023-10-04 15:52:51,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:52:52,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:52:54,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:52:54,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:52:55,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:52:55,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 15:52:55,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:52:56,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1709413.3333333333, ans=0.025 2023-10-04 15:52:58,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 15:52:58,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:53:02,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:04,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1709480.0, ans=0.125 2023-10-04 15:53:06,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:53:11,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 15:53:12,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1709480.0, ans=0.125 2023-10-04 15:53:13,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:53:13,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 15:53:16,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 15:53:16,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:18,650 INFO [train.py:1046] (1/4) Epoch 49, batch 1450, loss[loss=0.1398, simple_loss=0.1935, pruned_loss=0.0431, over 18900.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2318, pruned_loss=0.03589, over 4709453.05 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:53:18,710 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:53:20,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1709546.6666666667, ans=0.125 2023-10-04 15:53:23,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:53:23,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:53:23,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:23,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 15:53:31,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:31,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. 
Duration: 0.4909375 2023-10-04 15:53:32,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:53:32,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 15:53:34,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:53:35,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 15:53:35,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:35,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:35,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 15:53:37,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:53:38,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:53:39,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 15:53:39,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:40,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 15:53:42,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:44,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:47,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:53:47,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:53:48,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:48,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:50,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:52,120 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:53:52,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:52,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:53:56,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 15:53:58,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:54:02,376 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 15:54:03,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:54:05,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:54:06,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:06,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 15:54:11,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:12,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 15:54:13,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 15:54:13,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:17,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:54:19,740 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:54:21,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. 
Duration: 0.9 2023-10-04 15:54:22,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 15:54:23,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 15:54:25,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:25,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:54:32,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1709880.0, ans=0.2 2023-10-04 15:54:32,984 INFO [train.py:1046] (1/4) Epoch 49, batch 1500, loss[loss=0.1537, simple_loss=0.2328, pruned_loss=0.03726, over 22813.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.233, pruned_loss=0.03567, over 4723689.79 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:54:37,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 15:54:37,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:54:37,225 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:54:38,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:39,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:54:41,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:54:41,347 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 15:54:43,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:54:44,516 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.098e+02 2.317e+02 2.688e+02 4.133e+02, threshold=4.633e+02, percent-clipped=0.0 2023-10-04 15:54:44,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:54:44,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:54:44,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:54:46,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:54:47,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:54:53,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:54:53,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 15:54:54,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 15:54:54,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:54:56,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:57,138 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=3.96 vs. limit=15.0 2023-10-04 15:54:57,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 15:55:00,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 15:55:03,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:55:04,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 15:55:06,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:55:09,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1710013.3333333333, ans=0.035 2023-10-04 15:55:10,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:55:10,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:55:10,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:55:12,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 15:55:13,575 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:55:13,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:55:14,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 15:55:15,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:55:15,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1710080.0, ans=0.09899494936611666 2023-10-04 15:55:21,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:55:21,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 15:55:25,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:55:27,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:55:31,830 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 15:55:31,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:55:31,873 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 15:55:33,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:55:35,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:55:35,240 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 15:55:36,659 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:55:40,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 15:55:41,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:44,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:55:44,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:45,894 INFO [train.py:1046] (1/4) Epoch 49, batch 1550, loss[loss=0.1614, simple_loss=0.2508, pruned_loss=0.03604, over 24348.00 frames. ], tot_loss[loss=0.153, simple_loss=0.234, pruned_loss=0.03597, over 4717519.66 frames. ], batch size: 77, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:55:45,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 15:55:46,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:46,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:55:47,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 15:55:49,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 15:55:49,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:55:49,568 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 15:55:50,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 15:55:53,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:55:55,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:55:55,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:55:56,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:55:57,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:55:58,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:56:01,148 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 15:56:01,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:01,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:56:01,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:56:04,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:56:04,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 15:56:04,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1710280.0, ans=0.1 2023-10-04 15:56:05,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:56:05,970 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 15:56:07,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 15:56:07,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. 
Duration: 0.98 2023-10-04 15:56:07,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:09,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:11,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1710280.0, ans=0.125 2023-10-04 15:56:12,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:56:13,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 15:56:13,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 15:56:22,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:25,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:56:25,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:56:25,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:56:26,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 15:56:31,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 15:56:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:34,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:56:36,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1710413.3333333333, ans=10.0 2023-10-04 15:56:37,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:56:37,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:37,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 15:56:37,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:56:40,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:56:40,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:40,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 15:56:40,459 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 15:56:44,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 15:56:45,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.80 vs. limit=10.0 2023-10-04 15:56:47,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 15:56:52,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:56:54,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:54,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 15:56:55,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:56:55,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1710480.0, ans=0.0 2023-10-04 15:56:56,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:56:56,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:56:57,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:56:58,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 15:57:01,265 INFO [train.py:1046] (1/4) Epoch 49, batch 1600, loss[loss=0.1633, simple_loss=0.2384, pruned_loss=0.04406, over 23840.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2347, pruned_loss=0.03626, over 4718287.12 frames. ], batch size: 212, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 15:57:03,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:03,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 15:57:03,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 15:57:06,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 15:57:10,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:57:11,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 15:57:11,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:57:13,214 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.000e+02 2.241e+02 2.538e+02 3.324e+02, threshold=4.482e+02, percent-clipped=0.0 2023-10-04 15:57:14,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 15:57:16,455 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:57:18,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1710613.3333333333, ans=0.125 2023-10-04 15:57:19,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:57:22,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 15:57:25,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:57:25,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 15:57:25,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:26,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 15:57:33,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 15:57:44,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:57:44,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 15:57:45,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.72 vs. 
limit=15.0 2023-10-04 15:57:45,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:57:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:57:45,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:57:48,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 15:57:51,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 15:57:52,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:57:52,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:53,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1710746.6666666667, ans=0.2 2023-10-04 15:57:54,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:56,048 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:57:58,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:57:59,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 15:58:00,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:58:06,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:58:08,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:58:09,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 15:58:09,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:58:11,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 15:58:15,896 INFO [train.py:1046] (1/4) Epoch 49, batch 1650, loss[loss=0.156, simple_loss=0.232, pruned_loss=0.04005, over 23695.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2352, pruned_loss=0.0365, over 4705395.86 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:58:16,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1710880.0, ans=0.0 2023-10-04 15:58:17,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:58:17,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:58:18,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 15:58:18,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 15:58:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 15:58:18,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 15:58:20,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 15:58:20,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.40 vs. limit=15.0 2023-10-04 15:58:23,511 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=22.5 2023-10-04 15:58:24,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:58:24,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:58:26,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:58:26,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:58:27,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:58:29,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. 
Duration: 0.996375 2023-10-04 15:58:32,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:58:32,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:58:32,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:58:32,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:58:35,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 15:58:35,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 15:58:39,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:58:41,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:58:50,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 15:58:51,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:58:52,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 15:58:55,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 15:59:00,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:59:00,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:59:02,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:03,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:59:03,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:05,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:06,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:06,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:59:06,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:59:06,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1711080.0, ans=0.125 2023-10-04 15:59:07,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 15:59:09,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:59:12,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:59:13,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 15:59:15,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:59:15,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 15:59:17,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 15:59:17,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 15:59:18,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:59:19,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:59:19,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:20,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:20,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. 
Duration: 0.97 2023-10-04 15:59:24,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:25,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:59:26,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:28,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 15:59:29,737 INFO [train.py:1046] (1/4) Epoch 49, batch 1700, loss[loss=0.1304, simple_loss=0.207, pruned_loss=0.02687, over 21519.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03653, over 4699244.01 frames. ], batch size: 47, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:59:33,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:33,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:59:33,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 15:59:33,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1711213.3333333333, ans=0.125 2023-10-04 15:59:35,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:59:35,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 15:59:35,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:37,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:59:37,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:59:37,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 15:59:40,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:59:44,087 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.122e+02 2.407e+02 2.844e+02 4.213e+02, threshold=4.814e+02, percent-clipped=0.0 2023-10-04 15:59:47,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:50,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:59:50,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1711280.0, ans=0.0 2023-10-04 15:59:50,959 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.24 vs. 
limit=22.5 2023-10-04 15:59:51,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1711280.0, ans=0.1 2023-10-04 15:59:55,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:59:55,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:59:55,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:59:55,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:59:57,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=15.0 2023-10-04 15:59:58,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 16:00:00,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:00:00,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:01,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:00:02,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 16:00:02,635 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.95 vs. limit=15.0 2023-10-04 16:00:05,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 16:00:05,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 16:00:08,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:09,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 16:00:10,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1711346.6666666667, ans=0.0 2023-10-04 16:00:11,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:00:13,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1711346.6666666667, ans=0.125 2023-10-04 16:00:15,311 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.98 vs. limit=15.0 2023-10-04 16:00:18,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:20,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:00:20,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:00:22,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:00:22,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 16:00:22,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:00:25,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:25,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 16:00:26,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:00:26,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:00:28,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:28,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:00:30,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:00:30,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 16:00:31,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:31,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:00:32,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:35,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:00:35,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 16:00:39,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:41,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:00:42,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 16:00:45,051 INFO [train.py:1046] (1/4) Epoch 49, batch 1750, loss[loss=0.1564, simple_loss=0.2496, pruned_loss=0.0316, over 24560.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2324, pruned_loss=0.03606, over 4702754.19 frames. ], batch size: 71, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:00:48,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:00:48,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1711546.6666666667, ans=0.125 2023-10-04 16:00:49,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:00:50,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:00:50,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 16:00:51,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1711546.6666666667, ans=0.125 2023-10-04 16:00:52,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:54,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:00:55,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:00,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 16:01:01,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:01:02,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1711613.3333333333, ans=0.125 2023-10-04 16:01:03,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 16:01:03,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:01:07,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:01:09,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:01:10,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 16:01:11,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:01:13,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 16:01:20,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:01:21,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:01:21,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:01:24,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:01:24,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:01:27,366 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:01:27,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:30,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:01:31,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:01:32,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 16:01:34,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:01:36,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 16:01:37,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:01:40,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:40,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. 
Duration: 0.7909375 2023-10-04 16:01:40,552 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.33 vs. limit=15.0 2023-10-04 16:01:46,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:01:46,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 16:01:47,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:48,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:01:53,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:55,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:01:57,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:01:57,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 16:01:57,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:01:58,778 INFO [train.py:1046] (1/4) Epoch 49, batch 1800, loss[loss=0.1604, simple_loss=0.2384, pruned_loss=0.04115, over 23716.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2317, pruned_loss=0.03581, over 4701262.99 frames. 
], batch size: 212, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:01:58,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:01:58,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:01:58,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:01:58,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:01:58,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:01:59,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1711880.0, ans=0.0 2023-10-04 16:02:03,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:02:04,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:02:06,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:02:09,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:02:11,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-04 16:02:12,223 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.071e+02 2.300e+02 2.752e+02 3.980e+02, threshold=4.601e+02, percent-clipped=0.0 2023-10-04 16:02:12,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:02:13,564 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.56 vs. limit=15.0 2023-10-04 16:02:14,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:02:15,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:15,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:17,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:02:20,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:02:20,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 16:02:21,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:02:21,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1711946.6666666667, ans=0.0 2023-10-04 16:02:25,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:29,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 16:02:32,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 16:02:32,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 16:02:32,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:02:34,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:34,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:02:35,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:02:43,272 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 16:02:44,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:02:46,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:02:48,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 16:02:48,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 16:02:49,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:02:49,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:02:49,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1712080.0, ans=0.1 2023-10-04 16:02:51,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:02:55,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 16:03:00,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:03:00,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 16:03:02,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:03:02,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:02,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 16:03:03,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 16:03:06,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:03:06,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:03:09,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 16:03:09,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:12,564 INFO [train.py:1046] (1/4) Epoch 49, batch 1850, loss[loss=0.1636, simple_loss=0.2377, pruned_loss=0.04476, over 23844.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2324, pruned_loss=0.03585, over 4699369.23 frames. ], batch size: 179, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:03:12,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:03:12,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:03:12,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:03:14,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1712213.3333333333, ans=0.125 2023-10-04 16:03:15,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:03:16,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:03:18,441 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:03:18,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:03:21,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:03:22,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:03:28,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:03:28,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 16:03:31,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 16:03:35,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 16:03:38,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:03:38,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 16:03:38,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. 
Duration: 0.4666875 2023-10-04 16:03:39,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1712346.6666666667, ans=0.125 2023-10-04 16:03:42,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1712346.6666666667, ans=0.0 2023-10-04 16:03:45,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:03:47,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 16:03:50,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:03:50,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:03:54,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 16:03:54,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:54,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:03:56,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:03:58,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:04:00,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:04:04,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:04:04,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:05,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:04:05,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:07,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:04:08,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:04:11,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 16:04:11,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:04:14,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:04:14,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:04:14,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 16:04:14,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. 
Duration: 0.57 2023-10-04 16:04:17,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1712480.0, ans=0.125 2023-10-04 16:04:18,368 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 16:04:18,450 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 16:04:19,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:04:19,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:04:21,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:04:21,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:21,369 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 16:04:22,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:04:22,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:23,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:04:25,284 INFO [train.py:1046] (1/4) Epoch 49, batch 1900, loss[loss=0.1546, simple_loss=0.2486, pruned_loss=0.03024, over 24451.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2333, pruned_loss=0.03606, over 4719520.96 frames. 
], batch size: 69, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:04:25,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:04:26,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:04:26,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 16:04:28,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:29,922 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 16:04:29,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:04:31,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:35,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:38,345 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.723e+02 2.070e+02 2.219e+02 2.494e+02 3.485e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 16:04:38,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:04:39,853 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 16:04:41,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-10-04 16:04:42,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:04:42,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:04:43,930 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 16:04:43,963 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 16:04:48,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 16:04:50,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:04:52,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 16:04:52,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1712613.3333333333, ans=0.125 2023-10-04 16:04:55,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 16:05:03,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 16:05:05,647 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.29 vs. limit=15.0 2023-10-04 16:05:06,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-10-04 16:05:06,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:07,805 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 16:05:07,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 16:05:09,265 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 16:05:09,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 16:05:09,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:05:12,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1712746.6666666667, ans=0.125 2023-10-04 16:05:13,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 16:05:15,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1712746.6666666667, ans=0.1 2023-10-04 16:05:16,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:05:21,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:05:21,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. 
Duration: 0.92 2023-10-04 16:05:21,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:05:24,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 16:05:25,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1712813.3333333333, ans=0.1 2023-10-04 16:05:25,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1712813.3333333333, ans=0.125 2023-10-04 16:05:25,189 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:05:26,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:05:31,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:05:31,653 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:05:33,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:05:33,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:05:34,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:05:34,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-10-04 16:05:36,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:05:39,002 INFO [train.py:1046] (1/4) Epoch 49, batch 1950, loss[loss=0.1502, simple_loss=0.2387, pruned_loss=0.03084, over 24465.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2352, pruned_loss=0.03646, over 4718880.68 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:05:39,061 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:05:39,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:05:40,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:05:40,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:05:40,542 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:05:40,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1712880.0, ans=0.0 2023-10-04 16:05:41,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:05:44,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:05:46,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 16:05:48,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:48,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:05:50,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 16:05:52,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 16:05:52,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:53,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:55,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:05:57,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:05:57,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:00,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:06:02,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:06:02,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 16:06:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:06:02,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:06,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:06,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1712946.6666666667, ans=0.1 2023-10-04 16:06:07,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:06:07,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:07,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:06:07,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 16:06:09,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:06:09,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:06:10,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:06:14,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:15,397 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.84 vs. limit=10.0 2023-10-04 16:06:15,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:06:23,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:06:25,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:06:25,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:06:26,384 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 16:06:26,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:06:31,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:06:32,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:06:32,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 16:06:38,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:40,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:40,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1713146.6666666667, ans=0.125 2023-10-04 16:06:41,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:43,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:44,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:06:45,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:47,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 16:06:47,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:06:48,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:48,679 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 16:06:51,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 16:06:53,178 INFO [train.py:1046] (1/4) Epoch 49, batch 2000, loss[loss=0.1588, simple_loss=0.2392, pruned_loss=0.03922, over 24608.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2361, pruned_loss=0.03681, over 4707760.50 frames. ], batch size: 68, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:06:54,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1713213.3333333333, ans=0.1 2023-10-04 16:06:55,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:06:57,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:06:57,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:06:58,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:07:00,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:04,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 16:07:05,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 16:07:07,143 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.055e+02 2.276e+02 2.688e+02 4.506e+02, threshold=4.553e+02, percent-clipped=2.0 2023-10-04 16:07:07,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1713280.0, ans=0.1 2023-10-04 16:07:08,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:07:08,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 16:07:10,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:07:10,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:07:12,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:07:14,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 16:07:15,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:16,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:17,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:07:17,757 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.63 vs. limit=15.0 2023-10-04 16:07:19,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 16:07:19,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:07:20,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 16:07:20,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:07:25,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:07:27,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:07:27,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:27,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:07:28,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:07:29,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 16:07:32,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. 
Duration: 0.97 2023-10-04 16:07:32,630 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:07:32,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:38,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:40,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:07:40,234 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:07:40,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:07:41,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:07:41,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:43,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:07:43,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:44,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:48,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 16:07:48,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 16:07:53,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:07:56,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:59,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:59,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:08:02,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:04,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:08:04,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:06,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:08:06,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:08:08,078 INFO [train.py:1046] (1/4) Epoch 49, batch 2050, loss[loss=0.1668, simple_loss=0.2557, pruned_loss=0.03889, over 24415.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2355, pruned_loss=0.03658, over 4709136.24 frames. 
], batch size: 69, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:08:08,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:09,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:12,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:08:12,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1713546.6666666667, ans=0.0 2023-10-04 16:08:13,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:18,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:08:20,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:08:20,763 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:22,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:08:22,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 16:08:22,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 16:08:25,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:08:25,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:08:34,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:08:34,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:37,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 16:08:38,413 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.73 vs. limit=6.0 2023-10-04 16:08:39,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:40,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 16:08:40,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:08:42,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:08:44,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:08:44,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 16:08:44,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1713680.0, ans=0.1 2023-10-04 16:08:46,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:08:47,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:08:48,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:08:48,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:08:52,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:08:53,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:08:55,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:08:57,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:09:00,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:09:06,132 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:09:07,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-10-04 16:09:13,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:09:13,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:09:16,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:09:18,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.21 vs. limit=15.0 2023-10-04 16:09:18,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 16:09:21,479 INFO [train.py:1046] (1/4) Epoch 49, batch 2100, loss[loss=0.1518, simple_loss=0.2305, pruned_loss=0.0365, over 23667.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2329, pruned_loss=0.0362, over 4693301.15 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:09:21,611 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 16:09:21,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:23,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:09:23,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:09:26,802 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 16:09:26,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 16:09:26,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 16:09:27,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1713880.0, ans=0.05 2023-10-04 16:09:29,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:09:32,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:09:32,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:09:35,079 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-10-04 16:09:35,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:35,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:09:35,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. 
Duration: 0.85 2023-10-04 16:09:37,105 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.123e+02 2.389e+02 2.758e+02 4.259e+02, threshold=4.778e+02, percent-clipped=0.0 2023-10-04 16:09:37,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:09:37,271 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 16:09:37,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 16:09:40,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:09:40,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:09:40,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 16:09:40,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 16:09:46,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 16:09:46,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:09:47,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.whiten.whitening_limit, batch_count=1713946.6666666667, ans=12.0 2023-10-04 16:09:48,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:09:48,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:09:51,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:09:51,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 16:09:53,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:09:53,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 16:09:56,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 16:09:57,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:57,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 16:09:57,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 16:09:57,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 16:09:59,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 16:09:59,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1714013.3333333333, ans=0.0 2023-10-04 16:10:00,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.91 vs. limit=10.0 2023-10-04 16:10:02,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:10:05,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:10:06,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:10:06,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:08,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:08,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 16:10:08,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:10,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:10,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 16:10:10,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 16:10:11,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 16:10:13,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 16:10:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:10:18,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:10:18,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 16:10:23,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:24,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1714146.6666666667, ans=0.125 2023-10-04 16:10:25,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:10:25,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:10:25,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:10:27,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-10-04 16:10:27,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:10:31,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:31,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:10:32,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:10:32,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:33,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 16:10:34,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 16:10:34,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:10:35,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1714213.3333333333, ans=0.015 2023-10-04 16:10:36,847 INFO [train.py:1046] (1/4) Epoch 49, batch 2150, loss[loss=0.1492, simple_loss=0.1998, pruned_loss=0.04933, over 19147.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2322, pruned_loss=0.03585, over 4700477.36 frames. ], batch size: 389, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:10:36,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:10:36,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:10:36,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:10:36,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:10:42,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 16:10:44,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1714213.3333333333, ans=0.5 2023-10-04 16:10:45,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:10:45,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:48,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:10:48,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:10:48,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:10:53,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 16:10:53,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:10:53,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:10:57,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:10:58,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 16:11:03,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:03,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:11:05,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:05,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:05,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:06,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:11:07,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:11:07,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 16:11:07,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:11:09,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 16:11:10,497 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.80 vs. limit=15.0 2023-10-04 16:11:12,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:11:13,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:11:13,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:15,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:11:16,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:11:19,404 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:11:19,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:11:20,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 16:11:20,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 16:11:20,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:11:24,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:24,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:25,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:26,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:11:28,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:29,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:29,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 16:11:31,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 16:11:32,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:11:32,385 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. 
Duration: 0.768 2023-10-04 16:11:34,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:34,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:11:37,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 16:11:37,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:11:37,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 16:11:37,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 16:11:37,099 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 16:11:37,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 16:11:38,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:39,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1714480.0, ans=0.2 2023-10-04 16:11:40,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:11:40,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 16:11:40,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:41,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:11:43,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:43,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:44,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1714480.0, ans=0.0 2023-10-04 16:11:50,538 INFO [train.py:1046] (1/4) Epoch 49, batch 2200, loss[loss=0.1442, simple_loss=0.2288, pruned_loss=0.02983, over 24347.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2319, pruned_loss=0.03546, over 4707878.93 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:11:52,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:11:52,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 16:11:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:11:59,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:00,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 16:12:00,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:02,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:12:03,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:12:03,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:12:03,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 16:12:05,037 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.091e+02 2.261e+02 2.558e+02 4.417e+02, threshold=4.522e+02, percent-clipped=0.0 2023-10-04 16:12:07,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1714613.3333333333, ans=0.0 2023-10-04 16:12:10,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 16:12:10,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:12:15,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1714613.3333333333, ans=0.125 2023-10-04 16:12:16,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 16:12:18,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:12:19,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:12:19,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:12:24,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:12:24,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 16:12:25,958 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:12:28,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:12:31,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:31,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 16:12:35,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:12:37,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:12:38,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1714746.6666666667, ans=0.0 2023-10-04 16:12:40,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:12:40,693 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.18 vs. limit=22.5 2023-10-04 16:12:42,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:44,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 16:12:44,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:46,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 16:12:49,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:49,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:12:49,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:52,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:12:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:12:52,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:52,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:12:53,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:12:53,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:12:55,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:12:58,165 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:12:58,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:00,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:13:01,635 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 16:13:02,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:13:04,315 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 16:13:05,470 INFO [train.py:1046] (1/4) Epoch 49, batch 2250, loss[loss=0.1625, simple_loss=0.243, pruned_loss=0.04101, over 23739.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2324, pruned_loss=0.03572, over 4723731.32 frames. ], batch size: 212, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:13:05,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:13:05,646 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-10-04 16:13:07,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:08,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:13:10,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:11,916 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 16:13:13,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1714880.0, ans=0.0 2023-10-04 16:13:14,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:13:17,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:13:21,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:13:23,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:13:26,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:26,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:13:28,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. 
Duration: 0.763625 2023-10-04 16:13:30,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 16:13:30,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:13:31,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:13:31,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1714946.6666666667, ans=0.1 2023-10-04 16:13:32,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 16:13:32,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:13:34,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:35,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:13:37,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1715013.3333333333, ans=0.1 2023-10-04 16:13:38,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:40,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:13:40,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 16:13:43,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 16:13:45,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:48,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:13:54,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:13:54,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:13:55,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:55,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:13:57,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1715080.0, ans=0.2 2023-10-04 16:13:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:58,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:14:04,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:14:06,070 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 16:14:09,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:14:09,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:14:10,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:14:15,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:14:16,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.19 vs. limit=15.0 2023-10-04 16:14:18,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:14:18,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 16:14:18,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:18,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:14:21,494 INFO [train.py:1046] (1/4) Epoch 49, batch 2300, loss[loss=0.1661, simple_loss=0.2428, pruned_loss=0.04471, over 23812.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2332, pruned_loss=0.03589, over 4723472.07 frames. ], batch size: 195, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:14:21,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. 
Duration: 0.78 2023-10-04 16:14:24,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:14:24,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:29,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:30,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:14:33,182 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 16:14:34,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:14:35,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.800e+02 2.150e+02 2.484e+02 2.874e+02 4.816e+02, threshold=4.968e+02, percent-clipped=1.0 2023-10-04 16:14:41,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:14:41,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:14:43,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:14:43,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:14:43,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-10-04 16:14:43,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:14:46,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:14:47,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:14:51,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:14:53,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:14:56,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:02,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:15:03,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:15:03,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:15:06,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:15:10,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:15:10,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. 
Duration: 0.6818125 2023-10-04 16:15:12,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:15:12,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 16:15:15,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:15:15,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:15,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:17,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:15:17,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:15:19,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 16:15:19,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:15:20,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 16:15:20,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:15:20,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:15:20,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 16:15:26,453 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:15:29,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:15:32,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1715480.0, ans=0.0 2023-10-04 16:15:33,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:15:33,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:15:35,179 INFO [train.py:1046] (1/4) Epoch 49, batch 2350, loss[loss=0.1255, simple_loss=0.1998, pruned_loss=0.02559, over 22733.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03656, over 4710266.00 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:15:35,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:15:36,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:15:36,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:15:36,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 16:15:38,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 16:15:45,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:15:45,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 16:15:49,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 16:15:53,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:54,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:54,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:54,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:15:54,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:15:56,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 16:15:58,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:16:03,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. 
Duration: 0.58 2023-10-04 16:16:04,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:16:05,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1715680.0, ans=0.125 2023-10-04 16:16:08,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:16:09,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:16:11,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:16:13,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 16:16:13,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:16:15,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:16:15,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:16:15,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:16:18,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.52 vs. 
limit=22.5 2023-10-04 16:16:20,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:16:21,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1715746.6666666667, ans=0.2 2023-10-04 16:16:23,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 16:16:23,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:16:24,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:16:24,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:16:27,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 16:16:28,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:16:30,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 16:16:31,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:16:36,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 16:16:40,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. 
Duration: 0.73 2023-10-04 16:16:41,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:16:41,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:16:41,883 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 16:16:41,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 16:16:45,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 16:16:46,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:16:46,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1715813.3333333333, ans=0.125 2023-10-04 16:16:49,294 INFO [train.py:1046] (1/4) Epoch 49, batch 2400, loss[loss=0.1477, simple_loss=0.238, pruned_loss=0.02871, over 24560.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2345, pruned_loss=0.03673, over 4715447.36 frames. ], batch size: 71, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:16:52,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:16:55,495 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:16:56,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 16:16:57,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 16:16:57,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1715880.0, ans=0.125 2023-10-04 16:16:58,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 16:16:59,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1715880.0, ans=0.04949747468305833 2023-10-04 16:17:03,994 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.122e+02 2.325e+02 2.661e+02 3.983e+02, threshold=4.649e+02, percent-clipped=0.0 2023-10-04 16:17:05,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:17:05,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:17:08,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 16:17:08,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:17:09,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:09,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 16:17:13,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:17:14,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.78 vs. limit=15.0 2023-10-04 16:17:16,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 16:17:21,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:17:24,628 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 16:17:27,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:17:27,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:31,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:17:33,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 16:17:33,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:17:40,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:41,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:17:45,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:17:45,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:17:45,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:17:46,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:17:46,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:17:46,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:17:52,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:17:52,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:17:53,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 16:17:55,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 16:17:55,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:17:57,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 16:17:57,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 16:17:58,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 16:17:58,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 16:17:58,510 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 16:17:58,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 16:17:59,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:18:01,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:01,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:02,549 INFO [train.py:1046] (1/4) Epoch 49, batch 2450, loss[loss=0.1432, simple_loss=0.2235, pruned_loss=0.03144, over 23693.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2324, pruned_loss=0.03616, over 4702694.67 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:18:02,642 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 16:18:02,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:18:02,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1716213.3333333333, ans=0.05 2023-10-04 16:18:04,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:18:07,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:18:07,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:11,395 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:12,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 16:18:19,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:18:19,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:22,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:18:22,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 16:18:22,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:18:22,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1716280.0, ans=0.125 2023-10-04 16:18:23,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 16:18:26,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:28,385 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:18:29,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:18:33,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:18:33,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:35,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:35,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:37,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 16:18:38,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 16:18:47,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:47,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:49,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:18:50,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:18:50,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:51,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:18:51,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 16:18:53,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:55,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:18:58,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:58,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:19:02,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 16:19:02,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 16:19:02,885 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:19:04,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:19:04,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 16:19:04,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:19:06,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:19:08,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:19:09,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1716480.0, ans=0.0 2023-10-04 16:19:11,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:19:11,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:19:16,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 16:19:17,558 INFO [train.py:1046] (1/4) Epoch 49, batch 2500, loss[loss=0.1663, simple_loss=0.2466, pruned_loss=0.04296, over 23293.00 frames. 
], tot_loss[loss=0.1514, simple_loss=0.2311, pruned_loss=0.03585, over 4692815.72 frames. ], batch size: 93, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:19:17,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:19:23,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:19:32,425 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.204e+02 2.495e+02 3.004e+02 4.519e+02, threshold=4.991e+02, percent-clipped=0.0 2023-10-04 16:19:32,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:19:32,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:19:33,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:19:33,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 16:19:34,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1716613.3333333333, ans=0.2 2023-10-04 16:19:41,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:19:41,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:19:43,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 16:19:43,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:19:44,633 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 16:19:46,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:46,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:19:47,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 16:19:47,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:47,433 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 16:19:47,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:19:49,585 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.02 vs. limit=22.5 2023-10-04 16:19:52,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:19:53,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:19:56,565 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. 
Duration: 0.5666875 2023-10-04 16:19:56,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 16:19:57,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:19:59,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:59,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1716680.0, ans=0.0 2023-10-04 16:20:02,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:06,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:08,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1716746.6666666667, ans=0.125 2023-10-04 16:20:10,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:20:15,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:20:17,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 16:20:17,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:20:17,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. 
Duration: 0.42725 2023-10-04 16:20:20,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:20:20,545 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:20:21,822 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 16:20:21,822 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 16:20:21,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 16:20:21,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1716813.3333333333, ans=0.125 2023-10-04 16:20:26,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:27,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 16:20:27,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 16:20:27,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:20:29,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 16:20:31,906 INFO [train.py:1046] (1/4) Epoch 49, batch 2550, loss[loss=0.1486, simple_loss=0.2358, pruned_loss=0.03071, over 23628.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2317, pruned_loss=0.03574, over 4701441.86 frames. 
], batch size: 149, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:20:32,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 16:20:36,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:20:36,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:20:38,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:20:38,199 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:20:39,947 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 16:20:41,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:20:41,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1716880.0, ans=0.125 2023-10-04 16:20:43,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 16:20:45,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:20:48,585 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:50,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:20:51,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 16:20:51,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:20:51,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:20:52,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:54,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:20:54,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 16:20:55,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:20:55,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:55,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-04 16:20:58,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1716946.6666666667, ans=0.125 2023-10-04 16:21:05,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1717013.3333333333, ans=0.1 2023-10-04 16:21:08,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:21:10,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1717013.3333333333, ans=0.04949747468305833 2023-10-04 16:21:13,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:13,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:13,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:21:15,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:21:21,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:21:24,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:21:24,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 16:21:24,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:21:24,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:21:24,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:21:28,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:28,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:32,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:21:32,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 16:21:32,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:21:32,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:34,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:21:36,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:21:37,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:21:39,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1717146.6666666667, ans=0.125 2023-10-04 16:21:43,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:21:45,173 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:21:46,535 INFO [train.py:1046] (1/4) Epoch 49, batch 2600, loss[loss=0.1523, simple_loss=0.2441, pruned_loss=0.03021, over 24679.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2324, pruned_loss=0.03586, over 4701661.23 frames. ], batch size: 68, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:21:47,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 16:21:49,329 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 16:21:49,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:21:51,057 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 16:21:51,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 16:21:51,145 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 16:21:53,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:53,887 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. 
Duration: 0.768 2023-10-04 16:21:56,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 16:21:57,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.68 vs. limit=15.0 2023-10-04 16:21:57,918 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 16:21:59,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:22:00,746 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.147e+02 2.547e+02 3.024e+02 6.453e+02, threshold=5.093e+02, percent-clipped=2.0 2023-10-04 16:22:00,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 16:22:02,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 16:22:02,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1717280.0, ans=0.1 2023-10-04 16:22:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:22:03,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 16:22:06,375 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 16:22:06,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. 
Duration: 0.55 2023-10-04 16:22:12,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:12,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:13,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:22:13,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 16:22:13,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1717280.0, ans=0.125 2023-10-04 16:22:13,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1717280.0, ans=0.125 2023-10-04 16:22:16,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:22:18,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.01 vs. limit=15.0 2023-10-04 16:22:21,692 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 16:22:29,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:29,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:29,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. 
Duration: 0.88 2023-10-04 16:22:29,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1717413.3333333333, ans=0.0 2023-10-04 16:22:30,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:22:30,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:22:32,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 16:22:34,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:22:34,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:22:36,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:22:41,044 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 16:22:41,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:22:41,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:22:48,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:22:48,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 16:22:48,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 16:22:48,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:51,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:22:51,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:22:56,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 16:22:56,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:59,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:23:00,774 INFO [train.py:1046] (1/4) Epoch 49, batch 2650, loss[loss=0.1515, simple_loss=0.2333, pruned_loss=0.03483, over 24354.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2333, pruned_loss=0.03606, over 4707183.80 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:23:02,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 16:23:02,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:03,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 16:23:04,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 16:23:04,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:06,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:09,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:23:10,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:23:12,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:23:14,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 16:23:14,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:23:14,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:23:18,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 16:23:20,354 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 16:23:23,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:23:25,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 16:23:25,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:27,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 16:23:29,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-04 16:23:30,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:30,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:23:31,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:32,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:23:34,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1717680.0, ans=0.0 2023-10-04 16:23:37,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 16:23:37,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 16:23:40,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 16:23:42,800 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 16:23:42,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:44,184 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:23:44,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:23:44,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:44,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:23:46,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1717746.6666666667, ans=0.0 2023-10-04 16:23:47,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:48,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:23:48,927 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:49,183 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:23:50,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 16:23:50,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:23:53,028 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.53 vs. limit=15.0 2023-10-04 16:23:53,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:53,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:23:53,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1717746.6666666667, ans=0.0 2023-10-04 16:23:55,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:55,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1717746.6666666667, ans=0.125 2023-10-04 16:23:56,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:23:56,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:23:59,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:01,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 16:24:01,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:24:01,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 16:24:02,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1717813.3333333333, ans=0.125 2023-10-04 16:24:05,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:24:06,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:08,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:09,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:09,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:24:10,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:13,976 INFO [train.py:1046] (1/4) Epoch 49, batch 2700, loss[loss=0.1634, simple_loss=0.2519, pruned_loss=0.03743, over 23380.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.03623, over 4710879.01 frames. ], batch size: 105, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:24:14,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 16:24:14,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 16:24:15,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:24:16,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 16:24:20,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:24:20,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:20,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:22,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:24:22,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:24:22,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:24:22,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1717880.0, ans=0.1 2023-10-04 16:24:23,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:24:23,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. 
Duration: 0.63 2023-10-04 16:24:24,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:24:26,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.04 vs. limit=12.0 2023-10-04 16:24:27,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:24:28,700 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.037e+02 2.270e+02 2.571e+02 4.005e+02, threshold=4.540e+02, percent-clipped=0.0 2023-10-04 16:24:28,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:24:30,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:33,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:24:34,075 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.50 vs. limit=15.0 2023-10-04 16:24:34,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 16:24:35,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1717946.6666666667, ans=0.125 2023-10-04 16:24:36,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 16:24:40,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:24:40,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:24:46,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:24:46,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:24:46,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:24:46,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:24:48,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1718013.3333333333, ans=0.125 2023-10-04 16:24:49,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:24:51,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:24:51,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:24:51,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 16:24:57,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:57,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:25:04,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:25:05,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:25:07,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1718080.0, ans=0.125 2023-10-04 16:25:09,711 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:25:09,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:12,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:25:13,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:15,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:25:15,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:25:16,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:25:16,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:25:20,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:25:21,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:25:21,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:25:24,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 16:25:24,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:26,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:25:26,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 16:25:27,705 INFO [train.py:1046] (1/4) Epoch 49, batch 2750, loss[loss=0.1348, simple_loss=0.2172, pruned_loss=0.02621, over 19464.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.0362, over 4702422.09 frames. ], batch size: 42, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:25:27,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. 
Duration: 0.94 2023-10-04 16:25:27,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:30,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:32,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:35,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:36,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:25:36,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:36,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1718213.3333333333, ans=0.0 2023-10-04 16:25:39,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:25:39,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:25:40,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:25:40,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:40,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-10-04 16:25:40,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:25:40,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:46,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 16:25:46,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1718280.0, ans=0.125 2023-10-04 16:25:47,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:25:47,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:49,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:25:49,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:25:49,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:51,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:25:53,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:25:53,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:55,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:25:56,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:25:56,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:25:57,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:58,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:26:05,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:26:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:26:08,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:11,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:26:11,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 16:26:11,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:26:11,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1718413.3333333333, ans=0.125 2023-10-04 16:26:18,746 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:26:18,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:26:18,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 16:26:19,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1718413.3333333333, ans=0.125 2023-10-04 16:26:23,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:24,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 16:26:25,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1718480.0, ans=0.0 2023-10-04 16:26:29,092 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:26:31,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:26:31,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. 
Duration: 0.52 2023-10-04 16:26:33,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:26:35,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:26:35,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 16:26:36,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:26:38,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1718480.0, ans=0.0 2023-10-04 16:26:39,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 16:26:39,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:26:40,616 INFO [train.py:1046] (1/4) Epoch 49, batch 2800, loss[loss=0.1501, simple_loss=0.2374, pruned_loss=0.0314, over 24627.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2329, pruned_loss=0.03614, over 4706799.29 frames. ], batch size: 65, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:26:40,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:26:40,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 16:26:41,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:26:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:44,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:26:45,879 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 16:26:45,880 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 16:26:46,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.34 vs. limit=15.0 2023-10-04 16:26:47,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:51,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:26:52,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:26:54,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1718613.3333333333, ans=0.125 2023-10-04 16:26:55,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:26:57,062 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.030e+02 2.375e+02 2.876e+02 4.786e+02, threshold=4.750e+02, percent-clipped=2.0 2023-10-04 16:26:58,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. 
Duration: 0.79 2023-10-04 16:27:00,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 16:27:00,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 16:27:01,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:02,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:27:02,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:07,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:08,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:27:09,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:27:15,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:27:15,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:27:19,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:27:20,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:27:20,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:26,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:27:26,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 16:27:27,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:27:27,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1718746.6666666667, ans=0.1 2023-10-04 16:27:28,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:28,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:27:31,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:27:32,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:27:38,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 16:27:38,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:38,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:27:40,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:27:40,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:27:41,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:42,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 16:27:42,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:27:44,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:27:44,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:27:45,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 16:27:46,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:46,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 16:27:48,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:27:49,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 16:27:52,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:52,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:27:54,846 INFO [train.py:1046] (1/4) Epoch 49, batch 2850, loss[loss=0.1578, simple_loss=0.2413, pruned_loss=0.03717, over 23983.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2331, pruned_loss=0.0359, over 4708551.66 frames. ], batch size: 80, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:27:54,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:27:58,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:00,550 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.09 vs. limit=15.0 2023-10-04 16:28:00,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:28:00,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:02,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 16:28:04,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:04,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:28:06,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:28:07,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 16:28:11,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1718946.6666666667, ans=0.125 2023-10-04 16:28:13,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 16:28:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:15,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 16:28:16,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:18,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1718946.6666666667, ans=0.2 2023-10-04 16:28:19,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 16:28:19,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-04 16:28:20,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.74 vs. limit=15.0 2023-10-04 16:28:20,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:27,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1719013.3333333333, ans=0.0 2023-10-04 16:28:28,439 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.95 vs. limit=12.0 2023-10-04 16:28:32,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:28:34,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:28:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:28:35,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:28:35,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:28:38,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 16:28:39,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 16:28:39,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1719080.0, ans=0.05 2023-10-04 16:28:40,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:28:40,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:28:40,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:42,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:43,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.42 vs. limit=12.0 2023-10-04 16:28:45,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:45,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:46,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:48,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:28:49,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 16:28:49,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:52,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:54,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:28:58,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:29:01,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 16:29:01,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 16:29:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:29:04,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:04,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 16:29:04,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:29:06,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:06,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 16:29:06,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:29:06,686 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 16:29:06,728 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 16:29:06,731 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:29:08,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:09,290 INFO [train.py:1046] (1/4) Epoch 49, batch 2900, loss[loss=0.1535, simple_loss=0.2369, pruned_loss=0.03504, over 23708.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2333, pruned_loss=0.03579, over 4718756.92 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:29:12,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:29:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:13,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:29:14,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 16:29:17,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:29:17,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 16:29:19,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 16:29:21,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:29:21,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:29:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:29:24,588 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.064e+02 2.211e+02 2.560e+02 4.990e+02, threshold=4.422e+02, percent-clipped=1.0 2023-10-04 16:29:24,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:29:28,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:29:28,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:29:30,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1719280.0, ans=0.2 2023-10-04 16:29:31,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:29:31,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-10-04 16:29:32,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1719280.0, ans=0.125 2023-10-04 16:29:33,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:29:33,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:33,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1719280.0, ans=0.1 2023-10-04 16:29:36,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 16:29:38,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 16:29:40,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:40,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 16:29:40,708 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:29:43,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:29:43,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:29:46,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:29:47,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:51,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:53,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1719413.3333333333, ans=0.125 2023-10-04 16:29:54,469 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:29:55,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 16:29:57,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 16:29:57,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:30:00,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:30:02,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 16:30:03,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:30:07,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1719480.0, ans=0.125 2023-10-04 16:30:10,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:30:16,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:30:16,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:30:18,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 16:30:21,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:21,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 16:30:22,476 INFO [train.py:1046] (1/4) Epoch 49, batch 2950, loss[loss=0.1344, simple_loss=0.215, pruned_loss=0.02688, over 24298.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.234, pruned_loss=0.03576, over 4717808.10 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:30:22,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:30:23,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:30:24,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1719546.6666666667, ans=0.125 2023-10-04 16:30:28,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:30:32,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. 
Duration: 0.71 2023-10-04 16:30:33,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:30:33,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:33,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:30:34,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:30:36,813 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 16:30:38,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 16:30:38,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:30:38,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:30:44,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:30:46,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:30:48,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:30:48,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 16:30:51,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:30:51,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:30:51,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1719680.0, ans=0.125 2023-10-04 16:30:52,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:54,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:54,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:30:55,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 16:31:01,581 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 16:31:02,860 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 16:31:02,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:31:04,732 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 16:31:06,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. 
Duration: 0.93 2023-10-04 16:31:06,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:31:07,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:31:07,987 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 16:31:07,991 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:31:08,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1719746.6666666667, ans=0.125 2023-10-04 16:31:09,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 16:31:10,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:31:10,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:31:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:31:14,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:31:14,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:15,583 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-10-04 16:31:15,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:31:15,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 16:31:21,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:22,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:31:22,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 16:31:22,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:31:24,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 16:31:24,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.43 vs. limit=15.0 2023-10-04 16:31:26,935 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:31:28,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:31:30,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:31:31,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:31:31,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:31:32,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:31:32,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:32,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:31:34,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:31:36,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:31:36,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:31:37,774 INFO [train.py:1046] (1/4) Epoch 49, batch 3000, loss[loss=0.1548, simple_loss=0.235, pruned_loss=0.03731, over 24597.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2348, pruned_loss=0.03615, over 4713933.36 frames. ], batch size: 60, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:31:37,774 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 16:31:49,877 INFO [train.py:1078] (1/4) Epoch 49, validation: loss=0.3542, simple_loss=0.2825, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-04 16:31:49,878 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 16:31:50,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 16:31:50,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 16:31:51,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:54,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:31:54,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:31:57,317 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 16:31:58,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 16:32:00,204 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:32:00,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:32:02,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 16:32:02,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:32:06,780 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.107e+02 2.317e+02 2.666e+02 4.231e+02, threshold=4.633e+02, percent-clipped=0.0 2023-10-04 16:32:08,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-10-04 16:32:08,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1719946.6666666667, ans=0.0 2023-10-04 16:32:17,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:32:22,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 16:32:26,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:32:27,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:32:28,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:32:28,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:32:31,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:32:31,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 16:32:34,475 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 16:32:36,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:32:36,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 16:32:38,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1720080.0, ans=0.125 2023-10-04 16:32:39,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:32:39,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:32:41,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:41,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:32:41,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1720080.0, ans=0.125 2023-10-04 16:32:43,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:32:44,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.64 vs. limit=15.0 2023-10-04 16:32:45,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:32:45,277 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:32:45,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 16:32:46,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 16:32:48,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:32:49,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:32:49,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:32:53,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:53,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:56,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 16:32:56,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 16:32:56,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:32:56,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 16:32:58,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:33:00,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. 
Duration: 0.67 2023-10-04 16:33:02,504 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:33:03,800 INFO [train.py:1046] (1/4) Epoch 49, batch 3050, loss[loss=0.2056, simple_loss=0.2763, pruned_loss=0.06749, over 19317.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2351, pruned_loss=0.03644, over 4715872.78 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:33:03,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:33:03,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 16:33:03,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 16:33:03,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:33:04,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1720213.3333333333, ans=0.125 2023-10-04 16:33:05,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:33:05,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:33:05,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:33:07,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:33:07,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:33:10,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 16:33:13,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:33:15,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:15,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1720213.3333333333, ans=0.1 2023-10-04 16:33:16,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:33:20,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:22,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 16:33:29,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 16:33:29,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 16:33:29,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:33:31,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1720280.0, ans=0.1 2023-10-04 16:33:32,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:33:34,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:34,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:35,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:38,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:33:38,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:33:38,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:33:38,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:38,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:40,454 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:40,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:33:43,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:33:43,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 16:33:45,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:45,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:33:45,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1720346.6666666667, ans=0.125 2023-10-04 16:33:48,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:33:49,040 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:33:50,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:33:50,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:33:53,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1720413.3333333333, ans=0.125 2023-10-04 16:33:54,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:33:55,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:33:58,017 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=12.0 2023-10-04 16:34:00,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:01,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:34:01,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:34:05,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:34:05,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:34:05,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:34:06,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 16:34:08,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:34:09,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:11,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. 
Duration: 0.98 2023-10-04 16:34:13,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:13,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.80 vs. limit=15.0 2023-10-04 16:34:17,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:18,654 INFO [train.py:1046] (1/4) Epoch 49, batch 3100, loss[loss=0.1688, simple_loss=0.2437, pruned_loss=0.04697, over 23744.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2347, pruned_loss=0.03654, over 4714377.80 frames. ], batch size: 164, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:34:18,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:34:21,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:34:22,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 16:34:25,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 16:34:25,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 16:34:25,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1720546.6666666667, ans=0.125 2023-10-04 16:34:26,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 16:34:29,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1720546.6666666667, ans=0.125 2023-10-04 16:34:30,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:34:30,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:33,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 16:34:34,930 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.156e+02 2.470e+02 2.977e+02 4.757e+02, threshold=4.941e+02, percent-clipped=2.0 2023-10-04 16:34:36,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:36,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1720613.3333333333, ans=0.2 2023-10-04 16:34:41,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 16:34:45,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:34:46,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:34:46,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:34:46,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:34:46,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 16:34:48,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:34:48,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 16:34:48,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:34:50,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:52,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 16:34:54,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:34:58,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:34:58,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 16:34:59,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 16:35:01,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:35:01,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:35:02,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:02,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:02,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:35:03,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:35:03,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:35:04,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1720746.6666666667, ans=0.1 2023-10-04 16:35:09,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:35:09,340 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:35:09,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:09,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-04 16:35:13,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:35:14,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 16:35:15,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1720746.6666666667, ans=0.0 2023-10-04 16:35:17,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:35:18,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 16:35:18,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:19,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:20,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 16:35:30,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 16:35:31,991 INFO [train.py:1046] (1/4) Epoch 49, batch 3150, loss[loss=0.1452, simple_loss=0.2236, pruned_loss=0.03339, over 19675.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.03603, over 4725003.43 frames. ], batch size: 43, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:35:32,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:35:33,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:33,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:35:33,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:35:34,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 16:35:36,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:37,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:35:39,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 16:35:41,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:42,880 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 16:35:45,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 16:35:45,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:35:47,048 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. 
Duration: 0.992 2023-10-04 16:35:48,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 16:35:48,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1720946.6666666667, ans=0.2 2023-10-04 16:35:49,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 16:35:49,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 16:35:49,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 16:35:51,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:51,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:35:51,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 16:35:55,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:55,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:55,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:35:57,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:36:01,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 16:36:01,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:36:02,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:36:02,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:36:04,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 16:36:04,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1721013.3333333333, ans=0.09899494936611666 2023-10-04 16:36:07,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 16:36:09,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:36:09,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 16:36:09,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:36:10,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:36:10,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:36:12,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:36:12,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:36:13,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 16:36:15,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:36:15,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:16,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:36:16,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:36:17,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 16:36:19,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:20,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 16:36:20,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:36:21,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 16:36:23,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 16:36:26,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:36:26,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:27,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 16:36:28,914 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 16:36:30,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:36:32,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:36:33,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:33,815 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:36:39,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:36:40,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:36:42,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 16:36:43,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1721146.6666666667, ans=0.5 2023-10-04 16:36:45,688 INFO [train.py:1046] (1/4) Epoch 49, batch 3200, loss[loss=0.1374, simple_loss=0.2245, pruned_loss=0.02516, over 24452.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2329, pruned_loss=0.03555, over 4732400.85 frames. ], batch size: 63, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:36:48,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:36:48,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:36:51,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:51,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:36:51,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 16:36:52,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:56,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:37:00,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:37:01,621 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.995e+02 2.234e+02 2.632e+02 4.209e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 16:37:07,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:37:07,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1721280.0, ans=0.0 2023-10-04 16:37:16,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.06 vs. limit=10.0 2023-10-04 16:37:18,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 16:37:18,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:37:24,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 16:37:24,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:37:27,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1721346.6666666667, ans=0.0 2023-10-04 16:37:28,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:37:28,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 16:37:29,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:37:33,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 16:37:34,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 16:37:35,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 16:37:38,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 16:37:41,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:37:42,698 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.56 vs. limit=15.0 2023-10-04 16:37:46,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:37:48,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:37:48,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:37:48,088 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 16:37:48,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-10-04 16:37:50,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.72 vs. limit=10.0 2023-10-04 16:37:51,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:37:52,447 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 16:37:52,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 16:37:53,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 16:37:56,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 16:37:57,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:37:59,222 INFO [train.py:1046] (1/4) Epoch 49, batch 3250, loss[loss=0.1565, simple_loss=0.2315, pruned_loss=0.04077, over 23547.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.233, pruned_loss=0.03563, over 4719856.80 frames. ], batch size: 120, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:37:59,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:37:59,370 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 16:37:59,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:37:59,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:01,396 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 16:38:04,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:38:07,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:38:12,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:38:12,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 16:38:14,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:14,230 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:38:14,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:38:15,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:38:16,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:38:18,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:38:19,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:38:19,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:19,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:38:21,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:22,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:38:25,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:25,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:26,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:28,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:38:28,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 16:38:33,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 16:38:35,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:38:35,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:38:35,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:37,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:38:42,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:38:44,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1721746.6666666667, ans=0.2 2023-10-04 16:38:50,007 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:38:51,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:51,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 16:38:51,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:38:51,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 16:38:52,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:54,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 16:38:55,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 16:38:55,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:38:56,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:57,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1721813.3333333333, ans=0.125 2023-10-04 16:38:58,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:38:59,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 16:38:59,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:39:04,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:39:04,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:39:05,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-10-04 16:39:05,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:08,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:39:08,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 16:39:12,834 INFO [train.py:1046] (1/4) Epoch 49, batch 3300, loss[loss=0.1556, simple_loss=0.2464, pruned_loss=0.03241, over 24315.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2334, pruned_loss=0.03564, over 4724481.96 frames. ], batch size: 74, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:39:12,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:39:12,925 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 16:39:15,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 16:39:15,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 16:39:18,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:39:18,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1721880.0, ans=0.2 2023-10-04 16:39:19,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:39:21,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:39:21,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:22,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:39:23,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:39:25,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1721880.0, ans=0.0 2023-10-04 16:39:26,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:27,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:39:29,113 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.089e+02 2.371e+02 2.866e+02 4.389e+02, threshold=4.743e+02, percent-clipped=0.0 2023-10-04 16:39:30,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 16:39:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:39:30,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:39:32,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:34,028 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 16:39:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:39:35,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:39:36,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:39:36,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:39:36,687 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 16:39:36,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1721946.6666666667, ans=0.0 2023-10-04 16:39:41,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:39:41,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:39:42,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. 
Duration: 0.97725 2023-10-04 16:39:44,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 16:39:44,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:45,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:39:47,612 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 16:39:50,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 16:39:50,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:39:51,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.61 vs. limit=15.0 2023-10-04 16:39:53,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 16:39:56,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:39:57,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:39:58,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:39:59,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:40:00,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:40:00,521 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:40:00,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:40:02,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:40:02,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:40:04,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:40:05,399 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 16:40:05,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 16:40:08,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:40:08,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:40:08,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:11,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:40:11,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:13,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:40:14,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:14,379 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:40:15,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:40:17,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:40:21,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1722146.6666666667, ans=0.0 2023-10-04 16:40:22,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 16:40:22,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:22,470 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:25,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:40:25,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 16:40:26,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:28,009 INFO [train.py:1046] (1/4) Epoch 49, batch 3350, loss[loss=0.1533, simple_loss=0.2313, pruned_loss=0.0377, over 23504.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2337, pruned_loss=0.03607, over 4724677.76 frames. ], batch size: 134, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:40:28,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:28,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:32,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:40:33,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:33,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:40:36,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:38,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:40:39,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:40:39,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1722213.3333333333, ans=0.09899494936611666 2023-10-04 16:40:41,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:40:42,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 16:40:44,374 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 16:40:44,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:47,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 16:40:47,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 16:40:47,906 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:40:47,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:40:49,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:40:50,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 16:40:50,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:40:50,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:40:53,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:56,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:56,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:56,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:41:00,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:03,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:04,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:07,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:41:09,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:41:09,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1722346.6666666667, ans=0.0 2023-10-04 16:41:10,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:10,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:13,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:16,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 16:41:16,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:41:16,394 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 16:41:16,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:41:17,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 16:41:19,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:21,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:26,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:41:26,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 16:41:28,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:41:29,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:41:30,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:41:35,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:41:35,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 16:41:36,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:41:36,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:41:38,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:40,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 16:41:40,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:40,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-10-04 16:41:41,487 INFO [train.py:1046] (1/4) Epoch 49, batch 3400, loss[loss=0.149, simple_loss=0.2402, pruned_loss=0.02888, over 24649.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03676, over 4722437.62 frames. ], batch size: 73, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:41:41,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:41:41,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:41:42,980 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:41:44,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:41:44,870 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 16:41:47,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.72 vs. limit=6.0 2023-10-04 16:41:49,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 16:41:49,702 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 16:41:49,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:41:53,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:41:53,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:41:55,189 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:41:57,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:41:59,747 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.147e+02 2.480e+02 2.893e+02 4.349e+02, threshold=4.960e+02, percent-clipped=0.0 2023-10-04 16:42:02,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:42:02,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 16:42:07,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1722613.3333333333, ans=0.1 2023-10-04 16:42:08,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:42:11,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:42:11,174 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:42:12,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 16:42:18,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:42:22,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 16:42:26,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:42:28,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:42:28,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 16:42:28,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:42:28,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:42:29,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:42:31,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:42:34,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:42:34,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1722746.6666666667, ans=0.1 2023-10-04 16:42:38,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. 
Duration: 0.8333125 2023-10-04 16:42:38,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:42:42,856 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:42:44,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 16:42:48,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.80 vs. limit=15.0 2023-10-04 16:42:49,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:42:50,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1722813.3333333333, ans=0.125 2023-10-04 16:42:55,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 16:42:56,865 INFO [train.py:1046] (1/4) Epoch 49, batch 3450, loss[loss=0.1591, simple_loss=0.2193, pruned_loss=0.04948, over 19623.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.03687, over 4713877.22 frames. ], batch size: 389, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:42:57,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1722880.0, ans=0.125 2023-10-04 16:42:58,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 16:42:58,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:42:59,759 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:43:01,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 16:43:01,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:43:04,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:43:08,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:43:10,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:43:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:43:11,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:13,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:19,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 16:43:21,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1722946.6666666667, ans=0.125 2023-10-04 16:43:26,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. 
Duration: 0.78 2023-10-04 16:43:26,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:43:26,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:43:28,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:43:35,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 16:43:35,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:43:37,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1723013.3333333333, ans=0.05 2023-10-04 16:43:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:43:39,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:43:42,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:43:43,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:43:45,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 16:43:45,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:43:45,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1723080.0, ans=0.125 2023-10-04 16:43:47,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:51,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:43:51,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1723080.0, ans=0.125 2023-10-04 16:43:52,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 16:43:57,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:44:00,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:44:01,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:02,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:07,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:07,615 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:44:08,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 16:44:10,146 INFO [train.py:1046] (1/4) Epoch 49, batch 3500, loss[loss=0.1452, simple_loss=0.217, pruned_loss=0.03665, over 23720.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2337, pruned_loss=0.03654, over 4715356.73 frames. ], batch size: 232, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:44:10,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:44:12,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:17,083 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:44:17,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 16:44:18,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:44:22,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 16:44:25,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:25,517 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-10-04 16:44:27,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1723280.0, ans=0.0 2023-10-04 16:44:28,204 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.425e+02 2.856e+02 5.490e+02, threshold=4.850e+02, percent-clipped=2.0 2023-10-04 16:44:28,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:44:29,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:44:31,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:44:31,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:44:31,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:44:32,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:33,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:44:33,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 16:44:37,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:37,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 16:44:38,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:44:42,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:42,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 16:44:44,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:44:45,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1723346.6666666667, ans=0.125 2023-10-04 16:44:47,361 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:44:47,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:44:48,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:50,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:44:50,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:44:53,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 16:44:55,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. 
Duration: 0.61 2023-10-04 16:44:55,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 16:44:55,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:44:56,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:56,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:44:56,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:45:00,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:45:01,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:45:06,920 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:45:08,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 16:45:08,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 16:45:08,392 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:11,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:45:11,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:45:12,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:45:12,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1723480.0, ans=0.07 2023-10-04 16:45:14,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 16:45:15,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:45:17,257 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:45:17,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 16:45:19,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 16:45:19,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1723480.0, ans=0.125 2023-10-04 16:45:22,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:45:23,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:45:23,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:45:23,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:25,089 INFO [train.py:1046] (1/4) Epoch 49, batch 3550, loss[loss=0.144, simple_loss=0.2165, pruned_loss=0.03578, over 23610.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2316, pruned_loss=0.03602, over 4712803.66 frames. ], batch size: 256, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:45:26,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:45:33,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1723546.6666666667, ans=0.2 2023-10-04 16:45:34,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.51 vs. limit=15.0 2023-10-04 16:45:35,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:37,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 16:45:40,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:45:41,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:45:42,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:44,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:45:44,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:45:47,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:47,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:45:48,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:48,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:45:48,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:45:53,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:45:53,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:54,758 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.21 vs. limit=15.0 2023-10-04 16:45:55,353 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:45:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 16:45:55,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1723680.0, ans=0.0 2023-10-04 16:45:56,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:45:58,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 16:45:58,047 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:58,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:59,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:46:03,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:03,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1723680.0, ans=0.125 2023-10-04 16:46:05,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:46:05,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:07,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 16:46:07,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 16:46:09,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 16:46:09,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:46:12,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:46:12,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:46:14,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1723746.6666666667, ans=0.125 2023-10-04 16:46:15,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 16:46:15,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:46:18,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1723746.6666666667, ans=0.0 2023-10-04 16:46:23,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:46:23,589 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 16:46:24,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:46:28,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1723813.3333333333, ans=0.2 2023-10-04 16:46:29,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:46:30,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 16:46:37,420 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 16:46:37,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:46:37,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:46:38,829 INFO [train.py:1046] (1/4) Epoch 49, batch 3600, loss[loss=0.1474, simple_loss=0.2417, pruned_loss=0.02656, over 24671.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2318, pruned_loss=0.03554, over 4720801.73 frames. ], batch size: 73, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:46:38,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:40,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:40,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 16:46:40,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1723880.0, ans=0.125 2023-10-04 16:46:44,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:46:45,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:47,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:46:48,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:46:48,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:48,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 16:46:48,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1723880.0, ans=0.125 2023-10-04 16:46:53,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:46:54,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:57,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:46:58,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.39 vs. 
limit=15.0 2023-10-04 16:46:59,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.052e+02 2.311e+02 2.640e+02 4.130e+02, threshold=4.623e+02, percent-clipped=0.0 2023-10-04 16:47:00,617 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:47:01,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:47:03,259 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:47:04,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 16:47:06,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:47:07,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:47:07,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1724013.3333333333, ans=0.07 2023-10-04 16:47:08,989 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:47:10,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:10,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1724013.3333333333, ans=0.0 2023-10-04 16:47:11,915 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 16:47:13,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:47:14,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 16:47:20,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:47:20,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1724013.3333333333, ans=0.0 2023-10-04 16:47:22,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:47:23,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 16:47:23,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1724080.0, ans=0.2 2023-10-04 16:47:25,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.68 vs. limit=22.5 2023-10-04 16:47:28,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:47:32,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:33,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=15.0 2023-10-04 16:47:34,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:47:40,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:47:40,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:47:40,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 16:47:42,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 16:47:44,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 16:47:45,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:47:45,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:47:47,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 16:47:47,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:47:48,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:47:48,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:47:50,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. 
Duration: 0.87 2023-10-04 16:47:50,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 16:47:53,502 INFO [train.py:1046] (1/4) Epoch 49, batch 3650, loss[loss=0.1708, simple_loss=0.2478, pruned_loss=0.04694, over 23910.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2329, pruned_loss=0.03565, over 4735522.27 frames. ], batch size: 195, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:47:53,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:53,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 16:47:59,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 16:48:01,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:48:03,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 16:48:05,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 16:48:08,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:48:08,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:48:08,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 16:48:11,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:48:11,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:48:13,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 16:48:13,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:48:13,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:48:15,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 16:48:15,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:48:15,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:48:15,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:18,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:48:19,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1724280.0, ans=0.125 2023-10-04 16:48:20,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. 
Duration: 0.77 2023-10-04 16:48:22,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 16:48:22,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:48:24,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 16:48:24,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1724346.6666666667, ans=0.0 2023-10-04 16:48:26,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:48:26,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:48:32,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:48:35,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:35,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:48:36,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:48:36,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:48:38,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 16:48:40,842 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:48:42,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:48:42,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:48:43,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:48:45,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:45,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:48:48,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1724413.3333333333, ans=0.0 2023-10-04 16:48:48,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1724413.3333333333, ans=0.125 2023-10-04 16:48:51,616 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 16:48:55,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:48:56,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:48:58,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 16:48:58,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:48:59,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:49:00,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:02,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 16:49:02,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:49:03,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1724480.0, ans=0.2 2023-10-04 16:49:04,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:49:06,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:49:07,602 INFO [train.py:1046] (1/4) Epoch 49, batch 3700, loss[loss=0.1439, simple_loss=0.229, pruned_loss=0.02943, over 24493.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2336, pruned_loss=0.03622, over 4720830.29 frames. ], batch size: 63, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:49:07,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:49:10,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:49:10,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 16:49:10,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:49:10,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1724546.6666666667, ans=0.125 2023-10-04 16:49:11,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 16:49:11,777 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:49:14,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:49:19,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:49:19,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:19,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:49:19,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:49:19,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1724546.6666666667, ans=0.1 2023-10-04 16:49:19,778 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.50 vs. limit=15.0 2023-10-04 16:49:21,155 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:49:22,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:23,998 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 16:49:26,956 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.043e+02 2.235e+02 2.541e+02 4.177e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-04 16:49:31,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:49:32,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:49:34,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:49:34,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 16:49:34,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:49:38,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 16:49:39,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 16:49:41,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:42,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:49:43,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:44,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:49:46,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 16:49:51,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:49:51,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 16:49:51,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:51,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 16:49:57,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:49:57,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 16:49:59,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:01,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 16:50:04,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:50:04,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:50:04,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:50:04,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:09,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:50:11,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 16:50:12,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 16:50:12,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:50:13,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 16:50:15,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:50:18,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:50:19,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:50:19,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1724880.0, ans=0.125 2023-10-04 16:50:20,893 INFO [train.py:1046] (1/4) Epoch 49, batch 3750, loss[loss=0.1448, simple_loss=0.225, pruned_loss=0.03226, over 24300.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2345, pruned_loss=0.03625, over 4724297.24 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:50:20,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:50:22,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 16:50:22,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 16:50:23,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1724880.0, ans=0.2 2023-10-04 16:50:26,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:50:27,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. 
Duration: 0.78 2023-10-04 16:50:27,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:50:29,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:29,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:31,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:50:33,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:50:36,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:50:37,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:50:40,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:43,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:50:44,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 16:50:44,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:50:46,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:50:46,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:50:50,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 16:50:52,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1725013.3333333333, ans=0.0 2023-10-04 16:50:55,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 16:50:56,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:50:56,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:50:57,025 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:50:58,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:03,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:04,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:51:08,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 16:51:11,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:51:14,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:51:14,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:51:14,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1725080.0, ans=0.125 2023-10-04 16:51:17,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:51:21,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:51:23,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:51:25,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:51:28,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:51:29,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:51:36,223 INFO [train.py:1046] (1/4) Epoch 49, batch 3800, loss[loss=0.1591, simple_loss=0.2401, pruned_loss=0.03907, over 23272.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2344, pruned_loss=0.03612, over 4732931.17 frames. ], batch size: 105, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:51:37,773 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 16:51:42,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:43,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:51:43,415 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 16:51:44,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:47,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:51:47,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:51:50,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 16:51:50,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:51,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:51:53,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:53,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 16:51:54,677 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.120e+02 2.292e+02 2.756e+02 3.883e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 16:51:54,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:51:54,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 16:51:54,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1725280.0, ans=0.125 2023-10-04 16:51:59,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 16:51:59,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:52:02,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:52:03,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1725280.0, ans=0.125 2023-10-04 16:52:05,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:52:05,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:52:07,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:52:07,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:52:10,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:11,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:52:16,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:52:16,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 16:52:17,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:52:23,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:52:28,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:52:31,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 16:52:33,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 16:52:33,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1725413.3333333333, ans=0.2 2023-10-04 16:52:34,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:52:35,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:52:37,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:38,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 16:52:42,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 16:52:42,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 16:52:43,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:44,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:52:47,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1725480.0, ans=0.0 2023-10-04 16:52:49,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:52:50,410 INFO [train.py:1046] (1/4) Epoch 49, batch 3850, loss[loss=0.1441, simple_loss=0.2369, pruned_loss=0.0256, over 24308.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2338, pruned_loss=0.03568, over 4721485.09 frames. ], batch size: 74, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:52:50,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:52:54,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 16:52:55,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 16:52:57,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:52:57,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:58,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1725546.6666666667, ans=0.1 2023-10-04 16:53:01,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:53:05,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:05,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1725613.3333333333, ans=0.0 2023-10-04 16:53:06,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:53:07,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 16:53:13,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:14,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:53:16,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:53:17,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:53:20,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:20,821 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.02 vs. limit=15.0 2023-10-04 16:53:21,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:53:21,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:21,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:53:23,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:24,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:24,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:24,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:53:25,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. 
Duration: 0.77 2023-10-04 16:53:25,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 16:53:25,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:53:27,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:30,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:30,685 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:30,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 16:53:35,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 16:53:36,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:38,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 16:53:40,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:53:44,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:45,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:53:49,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:51,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 16:53:53,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 16:53:55,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:55,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:57,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:53:57,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:53:59,077 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:59,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:59,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:53:59,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 16:54:00,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:54:02,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 16:54:02,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:02,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:54:04,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:54:05,388 INFO [train.py:1046] (1/4) Epoch 49, batch 3900, loss[loss=0.1553, simple_loss=0.2425, pruned_loss=0.03404, over 24464.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2328, pruned_loss=0.03565, over 4712470.18 frames. ], batch size: 69, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:54:05,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:06,787 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:54:06,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:54:06,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:54:08,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:54:08,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-10-04 16:54:08,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:09,827 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:54:11,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:54:12,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:54:13,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:54:13,840 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:54:15,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:54:15,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:16,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:54:16,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 16:54:16,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:54:19,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-10-04 16:54:19,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:20,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 16:54:23,188 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.321e+02 2.754e+02 3.506e+02 6.937e+02, threshold=5.508e+02, percent-clipped=5.0 2023-10-04 16:54:23,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 16:54:27,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:54:28,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:54:28,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:54:30,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:54:33,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:54:36,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:54:37,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:54:37,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:54:37,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:54:39,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1726013.3333333333, ans=0.125 2023-10-04 16:54:43,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:54:44,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:54:50,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1726080.0, ans=0.2 2023-10-04 16:54:51,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:54:52,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:55:02,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:55:06,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:55:06,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 16:55:08,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 16:55:08,231 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 16:55:09,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 16:55:10,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:55:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 16:55:16,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:55:17,736 INFO [train.py:1046] (1/4) Epoch 49, batch 3950, loss[loss=0.1479, simple_loss=0.2245, pruned_loss=0.03563, over 23965.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2328, pruned_loss=0.03572, over 4700516.83 frames. ], batch size: 196, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 16:55:17,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 16:55:17,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:55:20,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:55:22,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1726213.3333333333, ans=0.025 2023-10-04 16:55:23,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 16:55:27,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1726213.3333333333, ans=0.125 2023-10-04 16:55:29,281 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 16:55:30,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:55:30,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 16:55:33,071 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 16:55:33,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:55:36,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:55:36,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:55:36,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:55:37,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1726280.0, ans=0.0 2023-10-04 16:55:40,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 16:55:41,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:55:42,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:55:43,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:55:44,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:55:44,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:55:44,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1726280.0, ans=0.1 2023-10-04 16:55:55,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:55:55,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:55:58,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1726346.6666666667, ans=0.125 2023-10-04 16:56:01,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 16:56:05,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 16:56:05,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 16:56:07,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 16:56:08,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:56:16,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:56:16,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:56:17,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:56:17,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:56:18,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 16:56:22,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:56:24,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:56:27,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 16:56:31,259 INFO [train.py:1046] (1/4) Epoch 49, batch 4000, loss[loss=0.1457, simple_loss=0.2358, pruned_loss=0.02776, over 24500.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2333, pruned_loss=0.03598, over 4698274.45 frames. ], batch size: 66, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 16:56:36,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 16:56:43,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:46,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:56:48,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:56:48,163 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:48,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 16:56:49,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:56:49,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 16:56:50,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:56:50,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 16:56:52,171 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.984e+02 2.160e+02 2.381e+02 3.286e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-04 16:56:52,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 16:56:54,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1726613.3333333333, ans=0.125 2023-10-04 16:56:55,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:56:55,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:56:55,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:56:56,448 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:56:56,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:56:58,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:57:00,732 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 16:57:00,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:57:00,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:03,984 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 16:57:05,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 16:57:05,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:57:07,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1726680.0, ans=0.0 2023-10-04 16:57:08,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1726680.0, ans=0.025 2023-10-04 16:57:13,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 16:57:13,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:57:16,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:57:17,876 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 16:57:19,294 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:57:19,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 16:57:19,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:57:20,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:21,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 16:57:23,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:57:24,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:57:24,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:57:27,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 16:57:27,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:28,996 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 16:57:30,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1726813.3333333333, ans=0.125 2023-10-04 16:57:31,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:57:35,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 16:57:36,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:57:37,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:57:37,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 16:57:39,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:57:44,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:57:45,690 INFO [train.py:1046] (1/4) Epoch 49, batch 4050, loss[loss=0.1593, simple_loss=0.2392, pruned_loss=0.03972, over 23697.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03619, over 4711844.94 frames. ], batch size: 232, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 16:57:47,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 16:57:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 16:57:48,509 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:57:48,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:57:48,752 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:57:49,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:57:52,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:57:54,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 16:57:55,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1726880.0, ans=0.1 2023-10-04 16:57:56,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:57:59,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1726946.6666666667, ans=0.0 2023-10-04 16:58:00,900 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:58:00,939 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:58:02,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:58:03,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:58:05,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.64 vs. limit=15.0 2023-10-04 16:58:06,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.99 vs. limit=15.0 2023-10-04 16:58:07,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:58:09,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 16:58:09,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1726946.6666666667, ans=0.125 2023-10-04 16:58:13,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1727013.3333333333, ans=0.0 2023-10-04 16:58:14,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 16:58:14,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 16:58:14,376 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 16:58:17,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:58:24,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 16:58:24,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:58:27,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:58:28,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1727080.0, ans=0.125 2023-10-04 16:58:29,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.35 vs. limit=15.0 2023-10-04 16:58:30,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 16:58:30,180 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:58:30,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:58:34,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:58:36,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1727080.0, ans=0.125 2023-10-04 16:58:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 16:58:37,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:58:38,857 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:58:40,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 16:58:40,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1727080.0, ans=0.2 2023-10-04 16:58:46,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:58:52,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 16:58:54,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 16:58:54,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:58:55,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1727146.6666666667, ans=0.125 2023-10-04 16:58:56,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 16:58:56,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 16:58:56,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:58:58,168 INFO [train.py:1046] (1/4) Epoch 49, batch 4100, loss[loss=0.1632, simple_loss=0.2488, pruned_loss=0.03878, over 24307.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2345, pruned_loss=0.03691, over 4712399.56 frames. ], batch size: 77, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 16:58:59,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:59:01,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:01,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:59:07,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 16:59:08,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. 
Duration: 0.9 2023-10-04 16:59:08,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1727213.3333333333, ans=0.125 2023-10-04 16:59:09,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 16:59:10,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 16:59:10,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:11,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:11,430 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:11,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:59:12,825 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 16:59:17,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:59:17,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:59:17,378 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:17,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 16:59:20,692 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.809e+02 2.160e+02 2.515e+02 2.947e+02 5.173e+02, threshold=5.031e+02, percent-clipped=2.0 2023-10-04 16:59:23,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:59:24,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:59:25,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:59:26,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 16:59:26,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:26,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:59:26,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:59:27,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:59:27,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 16:59:30,583 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:59:31,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. 
Duration: 0.56 2023-10-04 16:59:33,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:59:34,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:59:34,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 16:59:35,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:59:35,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1727346.6666666667, ans=0.125 2023-10-04 16:59:36,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:59:38,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:59:39,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 16:59:40,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:59:40,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:59:42,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 16:59:42,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 16:59:43,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:59:45,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:59:50,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:59:53,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:59:54,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:00:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:01,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:00:05,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:00:08,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:00:11,547 INFO [train.py:1046] (1/4) Epoch 49, batch 4150, loss[loss=0.1525, simple_loss=0.2276, pruned_loss=0.03874, over 23752.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03671, over 4714936.22 frames. ], batch size: 179, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:00:11,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. 
Duration: 0.8 2023-10-04 17:00:13,501 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:00:14,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:00:14,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:00:18,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 17:00:19,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:00:19,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 17:00:19,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 17:00:19,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 17:00:21,457 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:00:25,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:00:25,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:00:26,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1727613.3333333333, ans=0.125 2023-10-04 17:00:29,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:00:31,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:00:32,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:00:33,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:00:33,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:00:33,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:00:38,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:43,403 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:00:43,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. 
Duration: 0.99 2023-10-04 17:00:43,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1727680.0, ans=0.1 2023-10-04 17:00:45,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1727680.0, ans=0.125 2023-10-04 17:00:46,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 17:00:46,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:00:47,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 17:00:47,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:00:47,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:00:49,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:00:51,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:00:55,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 17:00:58,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:00:59,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 17:01:01,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 17:01:01,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:01:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 17:01:04,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:01:05,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:01:07,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:07,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 17:01:07,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:07,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:01:09,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:01:12,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 17:01:12,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 17:01:12,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:01:12,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:01:14,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 17:01:14,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:01:14,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 17:01:14,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:01:18,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:18,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 17:01:18,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 17:01:18,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1727813.3333333333, ans=0.125 2023-10-04 17:01:19,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1727813.3333333333, ans=0.05 2023-10-04 17:01:24,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:01:25,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 17:01:26,914 INFO [train.py:1046] (1/4) Epoch 49, batch 4200, loss[loss=0.1592, simple_loss=0.2493, pruned_loss=0.03461, over 24640.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2343, pruned_loss=0.03664, over 4715138.92 frames. ], batch size: 73, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:01:27,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:01:29,830 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:01:31,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:01:31,240 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:01:31,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:01:32,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-10-04 17:01:34,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1727880.0, ans=0.2 2023-10-04 17:01:37,053 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 17:01:38,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:39,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:42,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:01:44,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 17:01:45,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:01:45,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:47,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 17:01:47,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:47,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.62 vs. 
limit=22.5 2023-10-04 17:01:49,212 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.800e+02 2.163e+02 2.357e+02 2.679e+02 4.452e+02, threshold=4.714e+02, percent-clipped=0.0 2023-10-04 17:01:49,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:50,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:01:50,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:01:51,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:01:52,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1727946.6666666667, ans=0.2 2023-10-04 17:01:53,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 17:01:54,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:59,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:01:59,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 17:02:00,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1728013.3333333333, ans=0.1 2023-10-04 17:02:01,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:02:04,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:02:04,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1728013.3333333333, ans=0.07 2023-10-04 17:02:06,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:02:06,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 17:02:06,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:02:08,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:02:12,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:02:13,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:02:18,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:02:22,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-10-04 17:02:24,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:02:28,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:02:29,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=15.0 2023-10-04 17:02:29,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:31,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1728146.6666666667, ans=0.0 2023-10-04 17:02:32,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 17:02:36,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:02:40,201 INFO [train.py:1046] (1/4) Epoch 49, batch 4250, loss[loss=0.1629, simple_loss=0.2521, pruned_loss=0.03682, over 24683.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.03604, over 4727935.85 frames. ], batch size: 73, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:02:40,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:02:40,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:02:42,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.23 vs. 
limit=15.0 2023-10-04 17:02:42,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:48,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:02:48,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 17:02:48,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:02:48,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1728213.3333333333, ans=0.0 2023-10-04 17:02:51,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:54,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:02:58,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:02:58,696 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:01,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:03:01,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:03:02,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:03:04,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:05,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:05,786 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:03:09,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1728346.6666666667, ans=0.125 2023-10-04 17:03:10,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:03:11,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:13,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 17:03:16,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 17:03:16,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:16,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:03:17,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:19,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 17:03:19,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:19,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:24,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:03:24,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:03:28,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:03:30,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:30,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 17:03:30,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1728413.3333333333, ans=0.125 2023-10-04 17:03:31,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:03:31,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 17:03:31,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1728413.3333333333, ans=0.1 2023-10-04 17:03:34,201 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 17:03:35,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:03:37,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:37,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:03:39,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 17:03:41,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:03:42,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:03:44,091 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.28 vs. limit=15.0 2023-10-04 17:03:46,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:48,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:48,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:03:52,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:03:53,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.35 vs. limit=12.0 2023-10-04 17:03:54,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:03:54,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:03:54,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:03:54,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 17:03:55,781 INFO [train.py:1046] (1/4) Epoch 49, batch 4300, loss[loss=0.1598, simple_loss=0.2338, pruned_loss=0.04291, over 23813.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2322, pruned_loss=0.03563, over 4730138.54 frames. ], batch size: 179, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:03:57,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:04:02,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:04:02,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:04:04,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:04:12,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:04:12,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 17:04:13,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:04:14,008 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:04:14,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.25 vs. limit=10.0 2023-10-04 17:04:15,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:04:16,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:04:16,485 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 17:04:17,833 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.072e+02 2.260e+02 2.526e+02 3.398e+02, threshold=4.520e+02, percent-clipped=0.0 2023-10-04 17:04:17,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:04:19,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:04:23,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=15.0 2023-10-04 17:04:24,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. 
Duration: 0.97 2023-10-04 17:04:24,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:04:24,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 17:04:26,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:04:28,875 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:04:30,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:04:30,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:04:31,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:04:33,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:04:33,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:04:34,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 17:04:35,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 17:04:37,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:04:40,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:40,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:04:40,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:40,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:04:40,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 17:04:40,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 17:04:41,876 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 17:04:41,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:04:43,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 17:04:43,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 17:04:46,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:04:49,336 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. 
Duration: 0.032 2023-10-04 17:04:50,703 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:04:52,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:04:52,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:04:54,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 17:04:54,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:04:54,139 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:55,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:04:56,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:04:57,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:04:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:05:03,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:04,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:05:04,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:05:08,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 17:05:08,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:05:09,993 INFO [train.py:1046] (1/4) Epoch 49, batch 4350, loss[loss=0.1518, simple_loss=0.2367, pruned_loss=0.03344, over 24079.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.233, pruned_loss=0.0359, over 4722379.49 frames. ], batch size: 86, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:05:14,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:05:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:21,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:05:21,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:05:26,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:05:29,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:32,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 17:05:34,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:05:37,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:05:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:05:38,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:05:43,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 17:05:45,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:05:45,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:46,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1729013.3333333333, ans=0.125 2023-10-04 17:05:50,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:51,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 17:05:53,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1729080.0, ans=0.0 2023-10-04 17:05:55,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:05:56,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:06:00,751 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 17:06:02,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:03,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:06:03,885 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 17:06:03,944 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 17:06:03,955 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:06:05,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:05,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:06:05,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:06,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:06:07,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:06:09,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1729146.6666666667, ans=0.125 2023-10-04 17:06:10,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 17:06:10,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:10,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:06:12,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:13,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 17:06:16,119 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 17:06:16,123 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 17:06:16,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 17:06:19,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:06:19,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:06:20,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:06:20,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:06:23,589 INFO [train.py:1046] (1/4) Epoch 49, batch 4400, loss[loss=0.1536, simple_loss=0.245, pruned_loss=0.03112, over 24628.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2336, pruned_loss=0.03592, over 4724159.22 frames. ], batch size: 68, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:06:25,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 17:06:27,031 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 17:06:27,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:27,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1729213.3333333333, ans=0.2 2023-10-04 17:06:30,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:06:30,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:31,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:06:33,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 17:06:33,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. 
Duration: 0.7655625 2023-10-04 17:06:34,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 17:06:34,705 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 17:06:35,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:06:36,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:06:37,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 17:06:39,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1729280.0, ans=0.0 2023-10-04 17:06:40,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:40,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:40,723 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 17:06:43,418 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:43,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 17:06:43,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1729280.0, ans=0.0 2023-10-04 17:06:44,835 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. 
Duration: 0.896 2023-10-04 17:06:46,254 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.847e+02 2.222e+02 2.528e+02 3.076e+02 4.813e+02, threshold=5.055e+02, percent-clipped=1.0 2023-10-04 17:06:46,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 17:06:46,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 17:06:47,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 17:06:47,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:49,704 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:51,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:51,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:06:52,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 17:06:52,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 17:06:54,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:06:54,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1729346.6666666667, ans=0.125 2023-10-04 17:06:55,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:06:55,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:57,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:57,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:57,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 17:06:58,791 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 17:07:00,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1729346.6666666667, ans=0.125 2023-10-04 17:07:03,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:08,003 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:07:11,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 17:07:14,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. 
Duration: 0.8444375 2023-10-04 17:07:17,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:07:18,798 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:07:18,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 17:07:18,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:07:18,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:07:18,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:07:20,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:07:26,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 17:07:29,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 17:07:30,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 17:07:32,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:07:32,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-10-04 17:07:32,726 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:07:35,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:07:38,763 INFO [train.py:1046] (1/4) Epoch 49, batch 4450, loss[loss=0.1393, simple_loss=0.2208, pruned_loss=0.0289, over 24664.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2344, pruned_loss=0.03626, over 4714536.42 frames. ], batch size: 60, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:07:38,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 17:07:41,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:07:44,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:44,594 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:07:51,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:07:51,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:07:53,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.28 vs. limit=15.0 2023-10-04 17:07:54,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:07:56,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1729613.3333333333, ans=0.125 2023-10-04 17:07:57,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:08:00,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:08:00,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:08:02,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 17:08:02,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:08:03,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:03,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:08:03,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:08:06,799 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:08:09,019 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.07 vs. 
limit=15.0 2023-10-04 17:08:11,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:11,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:12,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:08:12,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:08:15,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:08:18,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 17:08:19,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 17:08:19,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 17:08:19,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:08:22,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:08:23,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 17:08:28,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. 
Duration: 0.6 2023-10-04 17:08:31,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:32,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 17:08:32,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:32,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:08:32,056 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:08:32,066 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:08:33,802 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:36,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:08:36,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 17:08:38,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:08:41,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:08:41,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:08:42,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:44,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:08:45,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:08:48,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 17:08:49,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:08:52,475 INFO [train.py:1046] (1/4) Epoch 49, batch 4500, loss[loss=0.1535, simple_loss=0.2222, pruned_loss=0.04238, over 23977.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2351, pruned_loss=0.03646, over 4715435.20 frames. ], batch size: 196, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:08:54,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1729880.0, ans=0.1 2023-10-04 17:08:55,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:08:57,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 17:08:57,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 17:08:59,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 17:09:00,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1729880.0, ans=0.125 2023-10-04 17:09:05,016 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:09:06,270 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:09:07,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:09:07,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:09:07,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:08,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:15,228 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.089e+02 2.391e+02 2.805e+02 4.651e+02, threshold=4.782e+02, percent-clipped=0.0 2023-10-04 17:09:15,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1729946.6666666667, ans=0.2 2023-10-04 17:09:20,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:09:20,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:09:22,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:09:23,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:09:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:09:28,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:09:31,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1730013.3333333333, ans=0.0 2023-10-04 17:09:33,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:09:37,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:09:41,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:09:41,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 17:09:41,755 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:43,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:09:45,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:09:45,111 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:09:47,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:49,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 17:09:49,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:09:49,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:53,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:09:53,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:09:56,213 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:56,706 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.85 vs. limit=15.0 2023-10-04 17:09:57,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:09:58,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 17:10:00,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 17:10:04,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 17:10:04,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 17:10:07,352 INFO [train.py:1046] (1/4) Epoch 49, batch 4550, loss[loss=0.1239, simple_loss=0.2014, pruned_loss=0.0232, over 24459.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2334, pruned_loss=0.03596, over 4730670.78 frames. ], batch size: 58, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:10:07,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 17:10:10,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 17:10:10,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:10:13,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:10:13,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:10:16,375 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:10:20,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:10:21,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:10:23,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:10:23,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:10:23,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:24,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:10:26,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:10:29,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:10:33,308 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 17:10:33,356 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 17:10:34,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:10:35,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 17:10:40,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. 
Duration: 0.49 2023-10-04 17:10:41,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:10:42,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 17:10:44,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:10:48,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:50,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:50,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:10:51,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 17:10:54,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:10:57,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:57,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:10:58,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:10:59,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. 
Duration: 0.74 2023-10-04 17:10:59,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 17:11:00,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:11:01,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 17:11:02,371 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=15.0 2023-10-04 17:11:03,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 17:11:03,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:11:03,366 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:11:05,152 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:11:06,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:11:06,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:11:07,975 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 17:11:09,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 17:11:10,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:11:10,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 17:11:10,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 17:11:10,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:11:10,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 17:11:14,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:11:14,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:11:18,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:11:18,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:11:19,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:11:20,946 INFO [train.py:1046] (1/4) Epoch 49, batch 4600, loss[loss=0.1387, simple_loss=0.2206, pruned_loss=0.02838, over 23659.00 frames. 
], tot_loss[loss=0.1516, simple_loss=0.2319, pruned_loss=0.0357, over 4714655.42 frames. ], batch size: 149, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:11:20,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:11:21,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1730546.6666666667, ans=0.125 2023-10-04 17:11:22,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:11:24,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1730546.6666666667, ans=0.07 2023-10-04 17:11:25,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:25,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:11:27,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:11:29,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:11:29,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:31,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 17:11:33,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 17:11:37,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:11:37,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:40,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:43,298 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.059e+02 2.232e+02 2.627e+02 4.263e+02, threshold=4.464e+02, percent-clipped=0.0 2023-10-04 17:11:46,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 17:11:47,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:51,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:52,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:11:52,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:59,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 17:11:59,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:12:00,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:12:04,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1730746.6666666667, ans=0.125 2023-10-04 17:12:05,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:05,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1730746.6666666667, ans=0.2 2023-10-04 17:12:07,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:12:08,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:12:12,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 17:12:13,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:12:17,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:18,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:12:20,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:20,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 17:12:20,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:12:21,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 17:12:21,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:22,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1730813.3333333333, ans=0.0 2023-10-04 17:12:23,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:24,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:26,054 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:12:26,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:27,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 17:12:27,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 17:12:28,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 17:12:28,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:29,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:12:30,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:31,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:34,807 INFO [train.py:1046] (1/4) Epoch 49, batch 4650, loss[loss=0.148, simple_loss=0.2318, pruned_loss=0.0321, over 23441.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2316, pruned_loss=0.03553, over 4721182.92 frames. ], batch size: 134, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:12:37,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1730880.0, ans=0.2 2023-10-04 17:12:40,188 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:12:44,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:44,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:44,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:12:44,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:44,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:12:45,874 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:12:47,430 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 17:12:50,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1730946.6666666667, ans=0.2 2023-10-04 17:12:52,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:12:54,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 17:12:54,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:54,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 17:12:56,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:12:56,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 17:12:56,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 17:12:56,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:56,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1730946.6666666667, ans=0.125 2023-10-04 17:12:57,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 17:13:00,621 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:13:00,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1730946.6666666667, ans=0.07 2023-10-04 17:13:02,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:02,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 17:13:05,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:05,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 17:13:09,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:09,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:13:10,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 17:13:12,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:13:14,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:13:17,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:13:19,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1731080.0, ans=0.125 2023-10-04 17:13:22,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1731080.0, ans=0.2 2023-10-04 17:13:23,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:25,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:26,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:26,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:13:29,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 17:13:29,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 17:13:30,482 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 17:13:30,483 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 17:13:31,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:13:34,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1731146.6666666667, ans=0.1 2023-10-04 17:13:35,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.92 vs. limit=15.0 2023-10-04 17:13:39,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:13:39,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:13:41,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 17:13:41,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:13:42,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:13:42,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:13:43,819 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:13:46,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:13:46,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 17:13:46,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1731146.6666666667, ans=0.0 2023-10-04 17:13:48,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:49,405 INFO [train.py:1046] (1/4) Epoch 49, batch 4700, loss[loss=0.1567, simple_loss=0.2279, pruned_loss=0.0427, over 22775.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2323, pruned_loss=0.03564, over 4717758.80 frames. ], batch size: 322, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:13:50,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:13:50,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:13:50,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:13:52,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 17:13:52,836 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.77 vs. limit=15.0 2023-10-04 17:13:53,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:13:53,685 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 17:14:02,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:14:03,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:14:03,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:03,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:14:07,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:14:11,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 17:14:11,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 17:14:12,380 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.767e+02 2.015e+02 2.160e+02 2.418e+02 3.879e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-04 17:14:12,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1731280.0, ans=0.125 2023-10-04 17:14:13,979 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:15,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:14:16,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:14:19,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:14:23,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:14:25,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:14:25,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1731346.6666666667, ans=0.125 2023-10-04 17:14:27,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:14:33,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 17:14:35,592 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:14:37,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:39,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.69 vs. limit=15.0 2023-10-04 17:14:41,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 17:14:43,450 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:14:46,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:14:47,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 17:14:47,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:47,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:50,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:50,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:14:50,947 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:14:52,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 17:14:52,116 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 17:14:53,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:56,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:56,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:56,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. 
Duration: 0.99 2023-10-04 17:14:56,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:15:00,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 17:15:02,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:15:03,609 INFO [train.py:1046] (1/4) Epoch 49, batch 4750, loss[loss=0.1795, simple_loss=0.2537, pruned_loss=0.05268, over 19566.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2337, pruned_loss=0.03603, over 4721097.71 frames. ], batch size: 388, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:15:03,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:08,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:09,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:15:10,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 17:15:12,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:12,755 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:15:15,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. 
Duration: 0.81 2023-10-04 17:15:16,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:15:16,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:15:18,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:15:22,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1731613.3333333333, ans=0.2 2023-10-04 17:15:24,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 17:15:29,650 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:15:31,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 17:15:31,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:15:33,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:15:33,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:15:35,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:35,210 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-10-04 17:15:35,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 17:15:41,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 17:15:44,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:45,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1731680.0, ans=0.125 2023-10-04 17:15:46,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:15:47,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:15:47,892 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 17:15:47,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:15:50,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:15:53,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:15:54,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 17:15:54,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. 
Duration: 0.87 2023-10-04 17:15:56,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:56,276 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:15:57,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.08 vs. limit=6.0 2023-10-04 17:15:57,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:57,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:15:57,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 17:16:00,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 17:16:03,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:05,124 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:16:05,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 17:16:06,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:16:08,300 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 17:16:10,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:16:11,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:16:16,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:16:16,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 17:16:18,820 INFO [train.py:1046] (1/4) Epoch 49, batch 4800, loss[loss=0.1629, simple_loss=0.2438, pruned_loss=0.04099, over 23613.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.234, pruned_loss=0.03611, over 4715748.64 frames. ], batch size: 232, lr: 2.06e-03, grad_scale: 32.0 2023-10-04 17:16:18,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 17:16:18,951 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 17:16:20,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:16:21,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:16:23,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. 
Duration: 0.85 2023-10-04 17:16:27,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:27,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:31,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:16:34,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:16:34,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:34,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 17:16:36,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:16:38,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:16:38,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:16:40,915 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.105e+02 2.314e+02 2.580e+02 3.922e+02, threshold=4.627e+02, percent-clipped=0.0 2023-10-04 17:16:43,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:16:45,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:45,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:16:46,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.52 vs. limit=15.0 2023-10-04 17:16:47,735 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:49,010 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 17:16:49,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:50,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:16:51,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:54,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:56,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:56,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 17:16:57,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:16:58,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:00,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 17:17:00,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 17:17:01,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:01,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:17:03,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:17:03,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:17:03,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:17:03,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:17:04,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:17:07,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:17:09,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1732080.0, ans=0.0 2023-10-04 17:17:10,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:13,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:18,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 17:17:18,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:17:18,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:19,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:17:19,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:24,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:17:25,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:17:25,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:26,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 17:17:26,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:17:28,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:17:31,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:31,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:31,466 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:17:32,788 INFO [train.py:1046] (1/4) Epoch 49, batch 4850, loss[loss=0.1616, simple_loss=0.2508, pruned_loss=0.03617, over 24274.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2335, pruned_loss=0.03593, over 4727552.67 frames. ], batch size: 74, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:17:32,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 17:17:33,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1732213.3333333333, ans=0.125 2023-10-04 17:17:34,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 17:17:34,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:17:34,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 17:17:35,788 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:17:35,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:39,137 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:41,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.18 vs. limit=15.0 2023-10-04 17:17:45,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1732213.3333333333, ans=0.125 2023-10-04 17:17:46,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 17:17:47,977 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:53,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:17:54,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1732280.0, ans=0.2 2023-10-04 17:17:55,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:17:55,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:17:55,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1732280.0, ans=0.125 2023-10-04 17:17:56,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:58,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:17:58,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:17:59,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 17:18:02,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:18:04,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:18:04,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:18:05,408 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:18:05,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 17:18:08,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:18:08,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:18:13,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:13,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 17:18:13,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 17:18:13,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:18:20,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1732413.3333333333, ans=0.0 2023-10-04 17:18:21,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:18:21,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 17:18:24,506 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:18:24,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:18:25,323 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.84 vs. limit=15.0 2023-10-04 17:18:26,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:18:27,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. 
Duration: 0.99 2023-10-04 17:18:27,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:28,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 17:18:28,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:18:30,210 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:18:30,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 17:18:38,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:44,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:18:44,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:18:48,043 INFO [train.py:1046] (1/4) Epoch 49, batch 4900, loss[loss=0.1545, simple_loss=0.2258, pruned_loss=0.04156, over 23783.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.03574, over 4734094.57 frames. ], batch size: 164, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:18:50,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 17:18:50,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 17:18:51,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1732546.6666666667, ans=0.0 2023-10-04 17:18:54,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:18:56,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:18:56,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:18:59,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 17:19:03,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 17:19:03,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1732613.3333333333, ans=0.125 2023-10-04 17:19:06,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 17:19:07,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 17:19:07,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:19:07,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:19:07,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:19:07,558 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:19:07,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:19:09,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 17:19:11,283 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.726e+02 2.088e+02 2.355e+02 2.729e+02 5.318e+02, threshold=4.711e+02, percent-clipped=2.0 2023-10-04 17:19:12,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 17:19:12,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1732613.3333333333, ans=0.0 2023-10-04 17:19:14,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:19:14,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1732613.3333333333, ans=0.04949747468305833 2023-10-04 17:19:15,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=22.5 2023-10-04 17:19:15,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:19:17,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 17:19:20,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:19:22,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:19:22,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:22,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 17:19:23,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:19:23,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1732680.0, ans=0.0 2023-10-04 17:19:24,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:19:24,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 17:19:24,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 17:19:29,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 17:19:30,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:19:31,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 17:19:31,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:19:33,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:19:33,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 17:19:33,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:19:33,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 17:19:36,533 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:37,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:19:39,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:19:41,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 17:19:41,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:19:41,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1732746.6666666667, ans=0.0 2023-10-04 17:19:42,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. 
Duration: 0.488875 2023-10-04 17:19:42,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 17:19:50,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:19:50,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1732813.3333333333, ans=0.2 2023-10-04 17:19:51,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:19:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 17:19:51,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:19:51,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:19:54,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:58,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.16 vs. limit=10.0 2023-10-04 17:19:58,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:19:58,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:19:58,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:20:00,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 17:20:00,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1732880.0, ans=0.125 2023-10-04 17:20:01,469 INFO [train.py:1046] (1/4) Epoch 49, batch 4950, loss[loss=0.151, simple_loss=0.2388, pruned_loss=0.03156, over 24396.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2326, pruned_loss=0.03521, over 4738126.44 frames. ], batch size: 77, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:20:01,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:20:03,161 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:20:03,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:20:07,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 17:20:07,689 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 17:20:07,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:20:09,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 17:20:09,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:20:09,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:20:10,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:20:10,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:11,691 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.75 vs. limit=22.5 2023-10-04 17:20:12,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:20:13,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:20:15,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:20:16,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:20:17,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:17,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:20:21,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 17:20:22,906 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.99 vs. limit=15.0 2023-10-04 17:20:26,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1732946.6666666667, ans=0.0 2023-10-04 17:20:27,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:28,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:20:30,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:31,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:32,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:20:32,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1733013.3333333333, ans=0.125 2023-10-04 17:20:34,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 17:20:35,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 17:20:36,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:20:38,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:20:38,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:20:40,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:20:41,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:20:41,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:20:42,385 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=11.09 vs. limit=15.0 2023-10-04 17:20:43,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:20:44,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:20:46,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:20:47,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:49,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:51,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-10-04 17:20:51,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:20:52,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:20:57,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:20:59,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:20:59,962 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:20:59,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:21:00,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:21:01,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:21:03,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1733146.6666666667, ans=0.125 2023-10-04 17:21:04,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:21:04,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:21:04,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:21:05,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 17:21:08,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:14,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 17:21:14,910 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 17:21:16,193 INFO [train.py:1046] (1/4) Epoch 49, batch 5000, loss[loss=0.1492, simple_loss=0.2074, pruned_loss=0.04554, over 19301.00 frames. ], tot_loss[loss=0.1509, simple_loss=0.232, pruned_loss=0.03492, over 4731523.75 frames. ], batch size: 388, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:21:21,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:21:21,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:21:23,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 17:21:24,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 17:21:25,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:21:28,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-10-04 17:21:28,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:21:28,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:21:30,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 17:21:30,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:21:31,239 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:21:31,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 17:21:31,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:31,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:21:34,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 17:21:34,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 17:21:35,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:21:36,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.64 vs. 
limit=10.0 2023-10-04 17:21:37,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 17:21:37,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:21:37,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:38,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:21:38,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 17:21:38,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 17:21:39,928 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.001e+02 2.201e+02 2.789e+02 6.311e+02, threshold=4.402e+02, percent-clipped=4.0 2023-10-04 17:21:40,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 17:21:40,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:21:41,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:46,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 17:21:46,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 17:21:46,776 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:47,974 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:50,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 17:21:52,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 17:21:54,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:21:55,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:21:55,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1733346.6666666667, ans=0.1 2023-10-04 17:21:58,733 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 17:22:01,852 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:22:03,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:22:03,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:06,054 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-10-04 17:22:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:22:06,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:22:06,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.32 vs. limit=15.0 2023-10-04 17:22:07,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:22:10,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 17:22:10,694 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:22:13,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:22:14,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:22:20,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 17:22:23,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:33,225 INFO [train.py:1046] (1/4) Epoch 49, batch 5050, loss[loss=0.148, simple_loss=0.2305, pruned_loss=0.03278, over 23270.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2326, pruned_loss=0.03487, over 4732392.98 frames. 
], batch size: 93, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:22:33,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:22:34,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:34,767 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:22:34,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:22:34,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:22:34,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:22:34,892 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:35,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-10-04 17:22:38,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:38,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 17:22:40,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 17:22:42,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:22:43,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:22:43,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 17:22:46,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:22:46,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:22:48,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:22:50,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:22:50,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:22:59,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 17:22:59,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:22:59,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:23:00,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-10-04 17:23:00,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1733613.3333333333, ans=0.125 2023-10-04 17:23:01,567 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:23:02,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:02,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:04,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:23:04,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 17:23:05,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 17:23:07,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:08,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:11,761 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:12,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1733680.0, ans=0.025 2023-10-04 17:23:13,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. 
Duration: 0.87 2023-10-04 17:23:14,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:23:17,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 17:23:18,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:23:18,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:23:18,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:23:19,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:23:20,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:23:22,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:23:24,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:24,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:23:24,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:23:24,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-10-04 17:23:25,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:23:26,507 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.52 vs. limit=10.0 2023-10-04 17:23:27,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:23:31,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:23:31,299 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 17:23:31,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:23:33,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:23:33,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:33,787 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 17:23:37,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:37,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 17:23:37,816 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:23:40,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:23:40,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:40,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 17:23:44,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 17:23:46,732 INFO [train.py:1046] (1/4) Epoch 49, batch 5100, loss[loss=0.1476, simple_loss=0.2325, pruned_loss=0.03141, over 23470.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2331, pruned_loss=0.03501, over 4736718.27 frames. ], batch size: 134, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:23:46,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:46,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:23:46,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:23:50,860 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 17:23:52,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:55,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-10-04 17:23:55,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 17:23:57,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:58,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:23:59,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:24:01,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 17:24:01,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 17:24:06,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:24:06,762 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:24:10,750 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.155e+02 2.470e+02 3.044e+02 5.202e+02, threshold=4.940e+02, percent-clipped=2.0 2023-10-04 17:24:10,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:24:14,052 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:24:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-10-04 17:24:15,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:24:18,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:24:18,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 17:24:20,845 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:22,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:22,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 17:24:24,989 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 17:24:25,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.07 vs. limit=15.0 2023-10-04 17:24:26,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:26,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 17:24:26,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 17:24:29,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:24:39,167 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:24:40,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 17:24:41,868 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 17:24:41,885 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 17:24:43,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 17:24:43,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:46,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 17:24:49,540 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 17:24:52,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:24:53,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:24:55,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 17:24:57,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 17:24:57,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 17:25:00,570 INFO [train.py:1046] (1/4) Epoch 49, batch 5150, loss[loss=0.1469, simple_loss=0.2277, pruned_loss=0.03304, over 24587.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2335, pruned_loss=0.03522, over 4730063.97 frames. ], batch size: 60, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:25:03,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:25:03,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:25:03,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:25:04,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:25:05,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:25:06,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:25:07,052 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:25:08,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 17:25:08,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. 
Duration: 0.73 2023-10-04 17:25:08,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 17:25:08,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:25:08,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 17:25:10,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:11,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 17:25:13,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:14,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:19,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:25:19,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 17:25:22,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:22,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:25:23,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. 
Duration: 0.711125 2023-10-04 17:25:23,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:25:23,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:25:23,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:25:23,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:25:23,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 17:25:25,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:25:25,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:25:28,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:25:28,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1734280.0, ans=0.125 2023-10-04 17:25:29,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 17:25:29,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:25:33,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. 
Duration: 0.911125 2023-10-04 17:25:35,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 17:25:40,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:25:43,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:25:45,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:49,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:25:49,179 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:25:51,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 17:25:55,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:56,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:25:56,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:26:00,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:00,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:26:01,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 17:26:07,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:26:09,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:26:10,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:26:10,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:26:12,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:26:12,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:26:12,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:26:12,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:26:15,746 INFO [train.py:1046] (1/4) Epoch 49, batch 5200, loss[loss=0.1565, simple_loss=0.2265, pruned_loss=0.04322, over 23762.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.234, pruned_loss=0.03576, over 4724430.48 frames. ], batch size: 164, lr: 2.06e-03, grad_scale: 32.0 2023-10-04 17:26:17,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 17:26:18,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:26:21,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:24,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 17:26:24,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.47 vs. limit=15.0 2023-10-04 17:26:25,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:26:27,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:28,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:29,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:26:29,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:32,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 17:26:37,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:26:37,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:26:39,111 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.105e+02 2.384e+02 2.750e+02 4.154e+02, threshold=4.767e+02, percent-clipped=0.0 2023-10-04 17:26:41,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 17:26:42,293 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.61 vs. limit=15.0 2023-10-04 17:26:42,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:26:44,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:26:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 17:26:45,687 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 17:26:48,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 17:26:48,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:48,967 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 17:26:48,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:51,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:26:51,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:26:51,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 17:26:51,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:26:54,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:55,958 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 17:26:55,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 17:26:57,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 17:26:58,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1734746.6666666667, ans=0.125 2023-10-04 17:27:00,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 17:27:01,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:27:07,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:27:07,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:27:09,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 17:27:10,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:27:11,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 17:27:11,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:12,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:27:15,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:27:16,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:27:19,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:27:21,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:21,425 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:22,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.99 vs. 
limit=15.0 2023-10-04 17:27:25,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:25,613 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 17:27:26,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:27:26,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:27:26,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:28,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:27:29,644 INFO [train.py:1046] (1/4) Epoch 49, batch 5250, loss[loss=0.1448, simple_loss=0.2329, pruned_loss=0.02838, over 24481.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2337, pruned_loss=0.03565, over 4724528.46 frames. ], batch size: 63, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:27:29,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:27:31,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:27:34,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:35,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:27:37,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:27:39,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.01 vs. limit=10.0 2023-10-04 17:27:42,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:44,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:27:47,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:27:48,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:27:50,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 17:27:51,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:51,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1734946.6666666667, ans=0.05 2023-10-04 17:27:52,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:27:53,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1734946.6666666667, ans=0.1 2023-10-04 17:27:56,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1734946.6666666667, ans=0.1 2023-10-04 17:28:17,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1735080.0, ans=0.0 2023-10-04 17:28:17,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1735080.0, ans=0.125 2023-10-04 17:28:38,915 INFO [train.py:1046] (1/4) Epoch 49, batch 5300, loss[loss=0.1537, simple_loss=0.226, pruned_loss=0.04073, over 23899.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2324, pruned_loss=0.03528, over 4710627.33 frames. ], batch size: 195, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:28:45,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1735213.3333333333, ans=0.125 2023-10-04 17:28:47,345 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.70 vs. limit=10.0 2023-10-04 17:28:47,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.08 vs. limit=15.0 2023-10-04 17:28:53,061 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:28:53,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-10-04 17:28:53,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 17:28:53,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:53,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:53,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:53,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:53,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:53,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:28:53,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:53,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:28:53,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:28:53,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 17:28:54,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-10-04 17:28:54,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 17:28:54,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:28:54,205 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 17:28:54,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 17:28:54,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:55,069 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:55,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:55,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:28:55,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:28:55,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:28:55,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:55,637 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 17:28:55,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:55,747 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:55,751 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:28:55,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:55,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:28:56,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 17:28:56,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:28:56,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:56,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 17:28:56,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 17:28:56,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:28:56,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:28:56,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 17:28:57,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 17:28:57,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:28:57,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:28:58,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:28:58,160 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 17:28:58,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 17:28:58,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:28:58,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:58,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 17:28:58,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 17:28:58,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-10-04 17:28:58,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:29:05,612 INFO [train.py:1046] (1/4) Epoch 50, batch 0, loss[loss=0.1456, simple_loss=0.2306, pruned_loss=0.03027, over 23438.00 frames. ], tot_loss[loss=0.1456, simple_loss=0.2306, pruned_loss=0.03027, over 23438.00 frames. ], batch size: 119, lr: 2.04e-03, grad_scale: 32.0 2023-10-04 17:29:05,613 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 17:29:16,879 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.7903, 3.3161, 4.4954, 4.0137], device='cuda:1') 2023-10-04 17:29:18,994 INFO [train.py:1078] (1/4) Epoch 50, validation: loss=0.3435, simple_loss=0.2762, pruned_loss=0.2054, over 1125622.00 frames. 2023-10-04 17:29:18,995 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 17:29:21,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 17:29:21,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:29:23,178 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:29:25,816 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.206e+02 2.557e+02 2.974e+02 5.162e+02, threshold=5.113e+02, percent-clipped=2.0 2023-10-04 17:29:27,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1735293.3333333333, ans=0.125 2023-10-04 17:29:28,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:29:28,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:29:30,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:31,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 17:29:32,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 17:29:34,253 WARNING [train.py:1204] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:35,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:40,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:40,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:40,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:29:40,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:29:41,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 17:29:43,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 17:29:51,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:29:51,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:54,014 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 17:29:55,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1735426.6666666667, ans=0.125 2023-10-04 17:29:56,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:29:56,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:29:59,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:00,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-04 17:30:03,764 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:30:08,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:14,042 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-10-04 17:30:15,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1735560.0, ans=0.0 2023-10-04 17:30:17,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 17:30:17,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:30:17,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:17,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1735560.0, ans=0.0 2023-10-04 17:30:18,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:30:20,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:30:21,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 17:30:23,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:23,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:27,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:30:29,277 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0620W0113-306765 from training. 
Duration: 0.768 2023-10-04 17:30:29,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:30:32,019 INFO [train.py:1046] (1/4) Epoch 50, batch 50, loss[loss=0.1502, simple_loss=0.2268, pruned_loss=0.03683, over 23824.00 frames. ], tot_loss[loss=0.1492, simple_loss=0.2308, pruned_loss=0.03381, over 1068744.07 frames. ], batch size: 212, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:30:33,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:30:34,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:30:36,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 17:30:36,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:30:37,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:30:39,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:30:40,829 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:30:42,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:30:45,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. 
Duration: 0.68 2023-10-04 17:30:46,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:52,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:30:54,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 17:30:56,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 17:30:57,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:30:59,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:30:59,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:31:00,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:31:00,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:31:00,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:31:00,526 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:31:02,549 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.40 vs. 
limit=22.5 2023-10-04 17:31:07,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:31:09,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:09,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:31:10,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 17:31:12,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:31:13,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:31:13,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 17:31:13,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:31:15,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 17:31:21,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:31:21,555 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:31:24,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:31:24,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:31:26,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:31:28,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 17:31:28,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 17:31:31,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:31,678 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:31:33,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:31:34,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:31:34,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 17:31:34,477 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 17:31:35,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 17:31:37,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 17:31:38,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:31:38,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 17:31:38,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 17:31:39,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:31:39,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:41,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:31:41,876 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:31:43,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=1735893.3333333333, ans=0.2 2023-10-04 17:31:45,925 INFO [train.py:1046] (1/4) Epoch 50, batch 100, loss[loss=0.1622, simple_loss=0.2389, pruned_loss=0.04272, over 23810.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.03604, over 1871547.80 frames. ], batch size: 195, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:31:45,986 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:31:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:31:50,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:31:51,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1735960.0, ans=0.0 2023-10-04 17:31:52,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 17:31:52,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:55,785 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.084e+02 2.344e+02 2.870e+02 5.145e+02, threshold=4.687e+02, percent-clipped=1.0 2023-10-04 17:31:57,275 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:31:57,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:31:58,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:58,605 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:31:58,642 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:32:00,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 17:32:03,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 17:32:03,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1736026.6666666667, ans=0.05 2023-10-04 17:32:03,805 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.53 vs. limit=6.0 2023-10-04 17:32:04,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:04,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:04,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:32:08,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 17:32:10,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:10,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:11,671 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:32:13,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:32:17,646 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 17:32:17,667 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0433-509820 from training. 
Duration: 0.896 2023-10-04 17:32:17,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1736093.3333333333, ans=0.125 2023-10-04 17:32:19,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1736093.3333333333, ans=0.125 2023-10-04 17:32:20,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:32:20,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:32:25,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:32:26,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:28,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:31,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:33,031 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 17:32:34,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 17:32:36,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1736160.0, ans=0.125 2023-10-04 17:32:38,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. 
Duration: 0.62725 2023-10-04 17:32:38,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:32:42,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:44,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:45,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:32:49,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:32:49,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1736226.6666666667, ans=0.0 2023-10-04 17:32:49,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1736226.6666666667, ans=0.0 2023-10-04 17:32:50,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:50,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:50,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1736226.6666666667, ans=0.125 2023-10-04 17:32:53,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:32:53,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:32:53,508 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:53,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 17:32:55,356 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 17:32:55,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:55,440 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:32:55,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:32:55,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:32:56,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 17:32:56,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:32:57,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:32:57,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:32:57,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1736226.6666666667, ans=10.0 2023-10-04 17:32:58,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:00,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:00,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:33:01,412 INFO [train.py:1046] (1/4) Epoch 50, batch 150, loss[loss=0.1348, simple_loss=0.215, pruned_loss=0.02727, over 23694.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03628, over 2499671.17 frames. ], batch size: 149, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:33:01,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:33:04,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:05,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.56 vs. limit=22.5 2023-10-04 17:33:07,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:33:07,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:07,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:33:10,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:33:11,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:11,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1736293.3333333333, ans=0.125 2023-10-04 17:33:12,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:33:14,268 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:17,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 17:33:17,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 17:33:17,073 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 17:33:19,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:33:19,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:33:20,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:33:21,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:33:21,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:24,961 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:26,358 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 17:33:27,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:35,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:39,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:33:41,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 17:33:42,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1736426.6666666667, ans=0.0 2023-10-04 17:33:45,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:33:45,250 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:45,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 17:33:46,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:33:48,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:33:49,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:33:51,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:52,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 17:33:55,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:57,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:33:57,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:33:57,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:33:57,313 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:34:00,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:01,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-10-04 17:34:03,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:34:05,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:34:06,472 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:34:07,854 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:34:09,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 17:34:09,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:34:09,189 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 17:34:12,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:34:15,183 INFO [train.py:1046] (1/4) Epoch 50, batch 200, loss[loss=0.1543, simple_loss=0.232, pruned_loss=0.03824, over 23305.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2362, pruned_loss=0.03644, over 3000905.82 frames. ], batch size: 119, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:34:16,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:34:16,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 17:34:19,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 17:34:19,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:34:21,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:24,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 17:34:25,500 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.066e+02 2.212e+02 2.512e+02 4.565e+02, threshold=4.424e+02, percent-clipped=0.0 2023-10-04 17:34:25,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:34:26,065 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.17 vs. limit=15.0 2023-10-04 17:34:28,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:28,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1736693.3333333333, ans=0.09899494936611666 2023-10-04 17:34:29,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:32,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 17:34:34,158 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:34:34,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:48,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1736760.0, ans=0.1 2023-10-04 17:34:54,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:34:54,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:34:56,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:34:56,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:34:57,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 17:34:57,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:34:57,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:58,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.42 vs. 
limit=15.0 2023-10-04 17:34:59,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:35:00,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:35:00,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:01,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 17:35:03,150 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:35:03,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:08,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:35:12,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:35:15,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1736893.3333333333, ans=0.125 2023-10-04 17:35:19,999 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:20,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 17:35:21,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1736893.3333333333, ans=0.125 2023-10-04 17:35:28,253 INFO [train.py:1046] (1/4) Epoch 50, batch 250, loss[loss=0.1591, simple_loss=0.233, pruned_loss=0.04259, over 23761.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2365, pruned_loss=0.03679, over 3379625.66 frames. ], batch size: 212, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:35:28,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:30,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 17:35:30,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:35:30,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:31,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:35:31,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 17:35:33,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:35:33,430 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-10-04 17:35:35,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:36,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:35:37,994 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:38,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:41,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:35:42,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:42,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1737026.6666666667, ans=0.0 2023-10-04 17:35:43,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:35:46,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:35:50,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1737026.6666666667, ans=0.1 2023-10-04 17:35:55,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:35:56,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1737026.6666666667, ans=0.125 2023-10-04 17:35:56,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.29 vs. limit=22.5 2023-10-04 17:35:58,528 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:58,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:36:05,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:36:05,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:36:06,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:36:07,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:36:07,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:36:07,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:36:08,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:36:08,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1737093.3333333333, ans=0.0 2023-10-04 17:36:08,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1737093.3333333333, ans=0.125 2023-10-04 17:36:11,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:36:14,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 17:36:14,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:36:16,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:36:16,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:36:16,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:36:18,199 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:36:20,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:36:20,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-10-04 17:36:21,556 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:22,959 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:36:24,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:25,904 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:36:31,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:35,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:36:39,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:40,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:36:40,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1737226.6666666667, ans=0.0 2023-10-04 17:36:43,257 INFO [train.py:1046] (1/4) Epoch 50, batch 300, loss[loss=0.1435, simple_loss=0.2159, pruned_loss=0.0356, over 23731.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2354, pruned_loss=0.03619, over 3692439.00 frames. ], batch size: 179, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:36:43,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. 
Duration: 0.91 2023-10-04 17:36:45,443 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:36:45,461 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:36:46,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 17:36:46,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:36:48,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:36:48,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 17:36:53,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:53,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:36:54,339 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.135e+02 2.414e+02 2.952e+02 4.730e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-04 17:36:57,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:36:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 17:36:58,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:36:58,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:36:58,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 17:36:58,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:03,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:37:03,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1737360.0, ans=0.05 2023-10-04 17:37:05,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1737360.0, ans=10.0 2023-10-04 17:37:08,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:37:08,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 17:37:12,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 17:37:12,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:13,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1737426.6666666667, ans=0.07 2023-10-04 17:37:15,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 17:37:15,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:15,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 17:37:15,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:37:19,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:37:21,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:37:23,023 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:37:25,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 17:37:25,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 17:37:27,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:37:28,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:30,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 17:37:30,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:37:33,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1737493.3333333333, ans=0.0 2023-10-04 17:37:36,750 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:37:38,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:37:38,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 17:37:41,086 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:41,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:37:44,438 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:45,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:37:45,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 17:37:45,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:37:47,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:37:49,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-10-04 17:37:50,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:50,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:37:51,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:51,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:37:53,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:37:58,826 INFO [train.py:1046] (1/4) Epoch 50, batch 350, loss[loss=0.1397, simple_loss=0.2176, pruned_loss=0.03088, over 23679.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03598, over 3902568.92 frames. ], batch size: 149, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:37:58,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:37:58,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 17:38:00,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:06,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 17:38:06,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1737626.6666666667, ans=0.1 2023-10-04 17:38:09,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:10,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:12,864 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 17:38:14,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:38:14,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 17:38:18,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:18,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 17:38:19,611 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:38:22,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 17:38:23,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:38:24,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:38:25,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:38:27,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:38:28,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:38:28,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:38:28,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:30,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:38:32,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:38:32,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:35,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1737760.0, ans=0.125 2023-10-04 17:38:39,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:38:39,873 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:38:41,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 17:38:41,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:47,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 17:38:47,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:49,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1737826.6666666667, ans=0.125 2023-10-04 17:38:51,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:51,984 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:38:51,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:38:53,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 17:38:55,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:38:56,738 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 17:38:58,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1737893.3333333333, ans=0.04949747468305833 2023-10-04 17:38:59,409 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. 
Duration: 0.81 2023-10-04 17:38:59,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:02,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:39:02,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 17:39:04,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1737893.3333333333, ans=0.0 2023-10-04 17:39:05,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.89 vs. limit=22.5 2023-10-04 17:39:05,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:06,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:39:08,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:09,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:09,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:39:11,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1737960.0, ans=0.0 2023-10-04 17:39:13,041 INFO [train.py:1046] (1/4) Epoch 50, batch 400, loss[loss=0.143, simple_loss=0.201, pruned_loss=0.04254, over 19175.00 frames. 
], tot_loss[loss=0.1522, simple_loss=0.2329, pruned_loss=0.03578, over 4096632.47 frames. ], batch size: 388, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:39:13,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:39:15,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:39:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:39:19,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 17:39:19,191 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:19,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:20,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:39:20,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:23,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.98 vs. limit=15.0 2023-10-04 17:39:23,325 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 2.204e+02 2.503e+02 2.955e+02 6.313e+02, threshold=5.007e+02, percent-clipped=5.0 2023-10-04 17:39:23,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:39:24,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:27,997 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 17:39:29,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 17:39:29,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:30,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 17:39:32,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:35,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:39:35,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:39:35,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 17:39:37,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:39:37,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:37,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:39:37,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:39,958 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 17:39:40,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 17:39:44,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:46,211 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:47,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 17:39:49,552 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 17:39:50,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:39:52,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:39:59,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 17:40:01,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1738160.0, ans=0.09899494936611666 2023-10-04 17:40:02,770 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 17:40:04,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 17:40:06,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:40:08,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:40:08,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 17:40:09,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.18 vs. limit=15.0 2023-10-04 17:40:11,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:40:14,514 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:40:15,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:40:19,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:19,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 17:40:20,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1738226.6666666667, ans=0.1 2023-10-04 17:40:21,739 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. 
Duration: 0.563625 2023-10-04 17:40:22,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 17:40:24,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:40:24,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:40:25,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 17:40:27,285 INFO [train.py:1046] (1/4) Epoch 50, batch 450, loss[loss=0.1476, simple_loss=0.2269, pruned_loss=0.03418, over 23342.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2321, pruned_loss=0.03541, over 4227267.57 frames. ], batch size: 105, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:40:27,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:40:27,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:40:27,524 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:40:28,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 17:40:30,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:40:30,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:40:30,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:40:32,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 17:40:32,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:40:34,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:40:36,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:40:43,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1738360.0, ans=0.5 2023-10-04 17:40:46,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:46,902 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:40:49,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 17:40:51,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 17:40:53,163 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:40:55,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:40:57,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:00,363 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.40 vs. limit=15.0 2023-10-04 17:41:02,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:41:02,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:41:05,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 17:41:05,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 17:41:07,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 17:41:07,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:08,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:08,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:41:11,660 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 17:41:11,668 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-10-04 17:41:11,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:41:13,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:41:14,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:41:17,756 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:41:17,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:41:19,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 17:41:19,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 17:41:22,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:41:23,963 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:41:25,283 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:41:26,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 17:41:30,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 17:41:30,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 17:41:30,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 17:41:32,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:41:37,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:41:38,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:41:40,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:41:40,235 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 17:41:41,557 INFO [train.py:1046] (1/4) Epoch 50, batch 500, loss[loss=0.1744, simple_loss=0.2478, pruned_loss=0.05054, over 23478.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2327, pruned_loss=0.03554, over 4337092.00 frames. ], batch size: 285, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:41:44,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:45,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.48 vs. limit=22.5 2023-10-04 17:41:45,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-10-04 17:41:45,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:45,802 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 17:41:47,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 17:41:47,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:50,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:41:52,338 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.023e+02 2.217e+02 2.721e+02 3.663e+02, threshold=4.434e+02, percent-clipped=0.0 2023-10-04 17:41:55,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1738693.3333333333, ans=0.125 2023-10-04 17:41:56,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:41:57,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:41:59,397 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:41:59,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:41:59,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1738693.3333333333, ans=0.1 2023-10-04 17:42:00,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:11,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:11,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:42:13,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:42:13,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:13,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 17:42:13,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:42:17,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:42:17,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:42:17,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 17:42:18,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:20,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 17:42:23,464 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 17:42:26,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:27,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:28,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:28,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:29,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:42:31,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 17:42:35,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:42:36,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:39,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:42:43,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:48,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:51,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 17:42:52,835 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:52,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:54,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 17:42:54,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1738960.0, ans=0.125 2023-10-04 17:42:56,136 INFO [train.py:1046] (1/4) Epoch 50, batch 550, loss[loss=0.1492, simple_loss=0.233, pruned_loss=0.0327, over 23205.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2341, pruned_loss=0.03605, over 4422912.44 frames. ], batch size: 119, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:42:56,238 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:42:57,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:43:02,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. 
Duration: 0.95 2023-10-04 17:43:04,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 17:43:05,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:05,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 17:43:05,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:43:05,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:06,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:07,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:07,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:43:07,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:43:10,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:43:12,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 17:43:12,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 17:43:17,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:17,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:17,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1739026.6666666667, ans=0.0 2023-10-04 17:43:19,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:43:19,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:22,622 WARNING [train.py:1204] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 17:43:24,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 17:43:27,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:43:30,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:43:30,355 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:43:30,635 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:43:31,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 17:43:34,573 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:34,578 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 17:43:35,928 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:37,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 17:43:41,981 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:43:42,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:43:42,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:43:43,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:44,790 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 17:43:46,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 17:43:46,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:43:46,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:43:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:43:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:49,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:43:50,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:43:53,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:43:53,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:53,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:43:56,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:43:56,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:43:57,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:43:57,849 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:00,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 17:44:00,446 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 17:44:04,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 17:44:09,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 17:44:10,499 INFO [train.py:1046] (1/4) Epoch 50, batch 600, loss[loss=0.1707, simple_loss=0.2367, pruned_loss=0.05233, over 23899.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2346, pruned_loss=0.03645, over 4472513.54 frames. ], batch size: 164, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:44:10,608 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:44:10,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:44:11,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:17,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:44:19,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:44:19,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-10-04 17:44:22,521 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.089e+02 2.281e+02 2.529e+02 5.225e+02, threshold=4.562e+02, percent-clipped=2.0 2023-10-04 17:44:22,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:44:24,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:44:27,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:28,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 17:44:28,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:44:31,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1739360.0, ans=0.1 2023-10-04 17:44:35,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 17:44:39,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:44:39,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:39,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:44:44,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 17:44:44,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:44:46,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:53,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:44:58,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:59,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:44:59,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:45:06,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 17:45:11,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:45:11,585 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:45:15,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 17:45:15,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:45:17,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. 
Duration: 0.92 2023-10-04 17:45:19,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:45:19,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:45:25,063 INFO [train.py:1046] (1/4) Epoch 50, batch 650, loss[loss=0.1439, simple_loss=0.2169, pruned_loss=0.0355, over 23706.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.03614, over 4524318.63 frames. ], batch size: 232, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:45:25,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 17:45:26,563 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:45:28,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:45:29,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:45:30,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:45:34,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 17:45:35,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:45:39,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 17:45:39,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:45:42,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:45:44,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1739693.3333333333, ans=0.125 2023-10-04 17:45:47,310 WARNING [train.py:1204] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 17:45:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:45:50,064 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:45:50,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1739693.3333333333, ans=0.125 2023-10-04 17:45:53,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:45:53,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 17:45:55,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:45:56,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:45:57,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:45:59,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:00,401 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:46:03,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:46:03,048 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 17:46:03,058 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:46:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:46:04,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:05,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:46:07,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:08,617 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:46:09,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. 
Duration: 0.87 2023-10-04 17:46:09,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:46:09,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:46:11,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:46:11,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:46:12,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:46:14,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 17:46:15,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 17:46:15,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:15,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:46:17,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:46:17,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:46:19,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 17:46:21,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.56 vs. limit=15.0 2023-10-04 17:46:23,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:24,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:46:24,530 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:46:27,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:28,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 17:46:28,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:31,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1739893.3333333333, ans=0.125 2023-10-04 17:46:35,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:46:35,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:46:35,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 17:46:35,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:46:38,254 INFO [train.py:1046] (1/4) Epoch 50, batch 700, loss[loss=0.1528, simple_loss=0.2276, pruned_loss=0.03903, over 23282.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2325, pruned_loss=0.03586, over 4570724.60 frames. ], batch size: 105, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:46:39,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 17:46:41,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 17:46:44,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 17:46:44,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:45,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:46:47,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 17:46:50,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.097e+02 2.406e+02 2.781e+02 3.851e+02, threshold=4.811e+02, percent-clipped=0.0 2023-10-04 17:46:54,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 17:46:54,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1740026.6666666667, ans=0.0 2023-10-04 17:46:55,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:46:57,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:59,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:46:59,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:47:01,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-10-04 17:47:02,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:47:05,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 17:47:05,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:47:06,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 17:47:09,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-10-04 17:47:09,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1740093.3333333333, ans=0.125 2023-10-04 17:47:12,086 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:47:13,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:47:15,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:47:18,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:47:20,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 17:47:24,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:26,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:47:26,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 17:47:27,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:47:30,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:31,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:47:35,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:47:36,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.01 vs. limit=22.5 2023-10-04 17:47:37,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 17:47:38,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1740226.6666666667, ans=0.125 2023-10-04 17:47:40,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 17:47:40,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 17:47:40,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1740226.6666666667, ans=0.2 2023-10-04 17:47:41,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:43,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:47:44,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:47:46,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:46,233 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. 
Duration: 0.95 2023-10-04 17:47:46,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1740226.6666666667, ans=0.125 2023-10-04 17:47:50,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 17:47:50,998 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 17:47:51,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 17:47:52,330 INFO [train.py:1046] (1/4) Epoch 50, batch 750, loss[loss=0.1695, simple_loss=0.2529, pruned_loss=0.04303, over 23294.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2317, pruned_loss=0.03581, over 4584690.06 frames. ], batch size: 105, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:47:52,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 17:47:53,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 17:47:53,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:47:55,136 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 17:47:55,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 17:47:57,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:47:59,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:59,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:47:59,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1740293.3333333333, ans=0.125 2023-10-04 17:48:00,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:48:03,428 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:48:04,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:48:06,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:48:07,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1740360.0, ans=0.09899494936611666 2023-10-04 17:48:08,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:48:10,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:48:11,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 17:48:11,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:48:13,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1740360.0, ans=0.125 2023-10-04 17:48:14,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:48:16,317 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:48:17,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:48:17,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 17:48:17,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:48:21,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 17:48:21,885 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 17:48:23,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 17:48:23,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. 
Duration: 0.9 2023-10-04 17:48:23,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:48:26,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:48:30,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:48:30,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:48:30,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:48:33,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:48:34,823 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:48:34,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 17:48:34,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:48:36,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 17:48:37,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:48:40,327 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 17:48:41,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 17:48:41,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:48:46,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:48:47,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:48:47,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:48:51,266 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:48:54,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 17:48:55,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:48:55,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:48:55,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1740560.0, ans=0.125 2023-10-04 17:48:56,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:48:58,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:49:00,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:01,006 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:49:05,283 INFO [train.py:1046] (1/4) Epoch 50, batch 800, loss[loss=0.1588, simple_loss=0.2522, pruned_loss=0.03267, over 24539.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2322, pruned_loss=0.03579, over 4603934.70 frames. ], batch size: 71, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:49:09,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:09,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:10,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:49:10,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:12,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:12,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:14,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 17:49:15,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1740626.6666666667, ans=0.1 2023-10-04 17:49:16,759 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.060e+02 2.355e+02 2.799e+02 5.307e+02, threshold=4.709e+02, percent-clipped=2.0 2023-10-04 17:49:20,100 WARNING [train.py:1204] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:20,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:49:23,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 17:49:24,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:26,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:26,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:49:26,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:49:27,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 17:49:27,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:49:29,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 17:49:30,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:30,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1740693.3333333333, ans=0.0 2023-10-04 17:49:33,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:35,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:49:35,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:49:38,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:38,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:43,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:49:44,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:49:44,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 17:49:46,022 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-10-04 17:49:46,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 17:49:47,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:49:47,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:48,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:48,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:49:51,820 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 17:49:53,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 17:49:55,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:49:58,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:50:01,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:50:05,699 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:50:07,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. 
Duration: 0.91 2023-10-04 17:50:07,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:50:10,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 17:50:14,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:50:17,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:50:17,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 17:50:17,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:50:18,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.38 vs. limit=15.0 2023-10-04 17:50:19,244 INFO [train.py:1046] (1/4) Epoch 50, batch 850, loss[loss=0.1566, simple_loss=0.2372, pruned_loss=0.03802, over 24463.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2324, pruned_loss=0.03548, over 4645196.23 frames. ], batch size: 63, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:50:19,386 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:50:20,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 17:50:20,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 17:50:22,114 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:50:22,210 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:24,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:50:25,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:50:29,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 17:50:29,241 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 17:50:29,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 17:50:31,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:50:31,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:50:33,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:33,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:50:34,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-10-04 17:50:38,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:40,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:50:40,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 17:50:43,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 17:50:46,388 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:47,666 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 17:50:50,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 17:50:51,948 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 17:50:54,687 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 17:50:55,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:50:55,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:50:55,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-10-04 17:50:57,880 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:57,976 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:58,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 17:51:01,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:51:01,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:02,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:51:02,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:51:04,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:51:05,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:51:05,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 17:51:10,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:51:10,665 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 17:51:10,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:51:10,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:51:11,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.57 vs. limit=15.0 2023-10-04 17:51:12,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:15,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:51:16,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:51:19,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:51:20,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:20,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:51:22,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1741226.6666666667, ans=0.125 2023-10-04 17:51:28,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 17:51:30,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:51:30,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 17:51:30,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:51:30,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:51:32,665 INFO [train.py:1046] (1/4) Epoch 50, batch 900, loss[loss=0.161, simple_loss=0.2323, pruned_loss=0.04489, over 23836.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2331, pruned_loss=0.03562, over 4665312.65 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:51:32,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 17:51:38,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:51:41,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:41,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 17:51:41,477 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:51:42,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 17:51:42,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 17:51:44,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:51:44,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:51:44,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:51:45,743 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.794e+02 2.220e+02 2.526e+02 3.144e+02 5.127e+02, threshold=5.052e+02, percent-clipped=1.0 2023-10-04 17:51:45,821 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:51:45,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:51:50,579 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.85 vs. limit=15.0 2023-10-04 17:51:51,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1741360.0, ans=0.125 2023-10-04 17:51:56,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:56,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 17:51:56,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:52:01,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:52:07,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 17:52:07,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:52:13,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:52:14,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:52:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 17:52:16,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 17:52:20,493 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:52:20,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:52:21,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:52:23,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.41 vs. 
limit=15.0 2023-10-04 17:52:28,067 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:28,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:52:29,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 17:52:29,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1741560.0, ans=0.2 2023-10-04 17:52:30,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:52:31,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1741560.0, ans=0.0 2023-10-04 17:52:34,560 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 17:52:35,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:52:35,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:37,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:52:38,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:52:42,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-10-04 17:52:42,955 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 17:52:44,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 17:52:44,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 17:52:45,558 INFO [train.py:1046] (1/4) Epoch 50, batch 950, loss[loss=0.1517, simple_loss=0.2278, pruned_loss=0.03784, over 23513.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2337, pruned_loss=0.03587, over 4678635.00 frames. ], batch size: 134, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:52:47,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:50,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 17:52:54,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:52:56,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:52:56,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1741626.6666666667, ans=0.2 2023-10-04 17:52:57,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:52:57,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. 
Duration: 0.7555625 2023-10-04 17:52:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 17:53:04,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:05,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:53:06,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:53:06,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:53:06,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 17:53:08,217 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 17:53:09,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:12,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 17:53:12,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:53:15,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:15,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 17:53:15,285 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:53:18,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 17:53:19,966 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:53:22,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:53:22,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:53:27,035 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:53:27,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:53:29,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 17:53:31,670 WARNING [train.py:1204] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 17:53:31,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:53:33,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:53:34,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 17:53:34,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:53:37,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 17:53:37,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:53:38,704 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=12.0 2023-10-04 17:53:40,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:53:40,536 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:40,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 17:53:41,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:41,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:53:41,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 17:53:46,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:53:49,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:53:49,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1741893.3333333333, ans=0.1 2023-10-04 17:53:53,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:53:56,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 17:53:56,350 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 17:53:59,221 INFO [train.py:1046] (1/4) Epoch 50, batch 1000, loss[loss=0.164, simple_loss=0.2501, pruned_loss=0.03892, over 23940.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2335, pruned_loss=0.03581, over 4692203.62 frames. ], batch size: 80, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:53:59,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:54:03,322 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 17:54:03,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:09,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:54:10,688 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 17:54:10,691 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. 
Duration: 0.93 2023-10-04 17:54:13,322 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.086e+02 2.281e+02 2.987e+02 4.437e+02, threshold=4.562e+02, percent-clipped=0.0 2023-10-04 17:54:15,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1742026.6666666667, ans=0.0 2023-10-04 17:54:16,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:16,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:54:18,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:21,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 17:54:23,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 17:54:26,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 17:54:26,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:54:27,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 17:54:29,341 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 17:54:29,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. 
Duration: 0.64 2023-10-04 17:54:31,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:32,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:32,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1742093.3333333333, ans=0.125 2023-10-04 17:54:40,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:41,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:54:41,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:42,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:42,826 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 17:54:42,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:54:44,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:54:44,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:44,275 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0124W0002-61308 from training. 
Duration: 0.982 2023-10-04 17:54:47,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 17:54:47,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 17:54:48,469 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=12.0 2023-10-04 17:54:48,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 17:54:51,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:54:53,752 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.62 vs. limit=15.0 2023-10-04 17:54:57,274 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:57,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:54:57,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:58,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:55:01,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 17:55:03,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 17:55:05,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 17:55:05,153 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 17:55:05,255 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:55:05,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:55:09,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:55:11,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:55:12,692 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:55:13,947 INFO [train.py:1046] (1/4) Epoch 50, batch 1050, loss[loss=0.1547, simple_loss=0.2273, pruned_loss=0.04099, over 23830.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2323, pruned_loss=0.0358, over 4689358.34 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:55:15,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:55:17,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:55:18,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-04 17:55:18,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:55:21,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:55:23,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:55:23,155 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:55:27,268 WARNING [train.py:1204] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:55:28,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:55:28,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:55:30,408 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:55:30,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 17:55:32,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:55:33,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 17:55:36,229 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 17:55:36,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 17:55:36,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 17:55:36,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1742360.0, ans=0.0 2023-10-04 17:55:37,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1742360.0, ans=0.0 2023-10-04 17:55:39,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1742360.0, ans=0.125 2023-10-04 17:55:42,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:55:43,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:55:44,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:55:46,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 17:55:46,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 17:55:46,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:55:50,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.95 vs. 
limit=15.0 2023-10-04 17:55:51,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 17:55:52,908 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:55:55,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 17:55:55,276 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:55:58,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 17:55:58,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1742493.3333333333, ans=0.1 2023-10-04 17:55:59,497 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 17:55:59,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:56:00,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:56:04,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:56:07,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 17:56:07,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-10-04 17:56:08,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 17:56:08,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:56:08,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:56:10,487 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 17:56:13,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:56:14,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:56:14,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:56:16,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:56:16,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:56:20,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:56:20,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 17:56:20,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 17:56:20,812 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 17:56:22,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 17:56:23,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:56:26,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:56:27,631 INFO [train.py:1046] (1/4) Epoch 50, batch 1100, loss[loss=0.1536, simple_loss=0.2146, pruned_loss=0.04633, over 19224.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2315, pruned_loss=0.0357, over 4695569.44 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:56:30,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:56:36,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:56:38,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:56:39,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:56:39,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 17:56:39,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:56:41,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.164e+02 2.522e+02 3.230e+02 4.448e+02, threshold=5.045e+02, percent-clipped=0.0 2023-10-04 17:56:43,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:56:45,838 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:56:48,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:56:48,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 17:56:49,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 17:56:51,769 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:56:51,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:56:54,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:56:55,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:56:58,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.27 vs. 
limit=12.0 2023-10-04 17:56:59,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1742760.0, ans=0.125 2023-10-04 17:57:00,616 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:57:03,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 17:57:05,247 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 17:57:05,302 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:08,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:09,459 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:57:10,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:57:10,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 17:57:10,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:57:12,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:57:12,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:57:13,559 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:13,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 17:57:18,462 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:57:18,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 17:57:19,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:57:23,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1742826.6666666667, ans=0.0 2023-10-04 17:57:24,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:57:25,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 17:57:25,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:57:27,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:29,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:57:29,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:57:32,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 17:57:32,510 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:57:32,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1742893.3333333333, ans=0.09899494936611666 2023-10-04 17:57:33,888 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:57:35,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 17:57:35,127 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:57:35,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 17:57:36,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:57:36,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:57:36,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1742893.3333333333, ans=0.07 2023-10-04 17:57:37,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1742893.3333333333, ans=0.125 2023-10-04 17:57:39,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 17:57:41,898 INFO [train.py:1046] (1/4) Epoch 50, batch 1150, loss[loss=0.1641, simple_loss=0.2416, pruned_loss=0.04328, over 23364.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2321, pruned_loss=0.03582, over 4687471.39 frames. ], batch size: 285, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:57:44,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:57:48,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:57:48,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1742960.0, ans=0.125 2023-10-04 17:57:49,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:57:49,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:57:51,468 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 17:57:51,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:57:51,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1742960.0, ans=0.125 2023-10-04 17:57:54,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 17:57:55,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 17:57:56,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:58:01,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 17:58:04,424 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:58:07,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:58:09,256 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:10,546 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 17:58:10,553 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:58:10,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:58:14,839 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 17:58:16,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:58:17,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:58:25,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 17:58:31,323 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:31,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 17:58:32,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:32,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:38,702 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 17:58:41,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:46,985 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 17:58:49,855 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:58:51,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:58:51,248 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:58:53,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:58:54,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.22 vs. 
limit=15.0 2023-10-04 17:58:56,185 INFO [train.py:1046] (1/4) Epoch 50, batch 1200, loss[loss=0.1465, simple_loss=0.2274, pruned_loss=0.03284, over 20348.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.233, pruned_loss=0.03575, over 4694447.18 frames. ], batch size: 44, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 17:58:57,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:01,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:59:01,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:59:05,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:05,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:06,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:59:07,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:59:09,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.143e+02 2.355e+02 2.830e+02 4.818e+02, threshold=4.710e+02, percent-clipped=0.0 2023-10-04 17:59:09,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:59:11,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 17:59:11,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:59:11,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1743360.0, ans=0.0 2023-10-04 17:59:12,799 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 17:59:15,025 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.54 vs. limit=12.0 2023-10-04 17:59:15,571 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 17:59:15,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1743360.0, ans=0.125 2023-10-04 17:59:19,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:59:22,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:59:23,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:23,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1743426.6666666667, ans=0.125 2023-10-04 17:59:25,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:59:25,194 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9203W0126-595623 from training. 
Duration: 0.896 2023-10-04 17:59:27,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:36,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:59:36,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:59:36,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 17:59:36,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:59:41,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 17:59:45,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 17:59:45,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:46,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 17:59:46,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1743493.3333333333, ans=0.125 2023-10-04 17:59:46,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1743493.3333333333, ans=0.1 2023-10-04 17:59:47,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.18 vs. limit=15.0 2023-10-04 17:59:49,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:59:49,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:59:50,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:50,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:59:52,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:59:52,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 17:59:52,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.49 vs. limit=12.0 2023-10-04 17:59:53,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 17:59:53,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:59:53,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 17:59:56,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:56,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:59:59,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:00:01,126 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:00:04,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 18:00:07,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 18:00:09,743 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:00:10,953 INFO [train.py:1046] (1/4) Epoch 50, batch 1250, loss[loss=0.1355, simple_loss=0.2114, pruned_loss=0.0298, over 24297.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.03617, over 4687756.54 frames. ], batch size: 56, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:00:12,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 18:00:13,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:00:15,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:00:15,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1743626.6666666667, ans=0.125 2023-10-04 18:00:16,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 18:00:18,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1743626.6666666667, ans=0.0 2023-10-04 18:00:19,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1743626.6666666667, ans=0.04949747468305833 2023-10-04 18:00:20,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:00:20,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:22,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 18:00:24,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.30 vs. limit=15.0 2023-10-04 18:00:24,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 18:00:26,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:00:26,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1743693.3333333333, ans=0.125 2023-10-04 18:00:30,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:00:31,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:32,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:00:32,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:00:35,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:00:38,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:00:38,304 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:00:38,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:00:39,742 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.53 vs. 
limit=15.0 2023-10-04 18:00:40,219 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:00:41,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:44,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:00:46,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:00:51,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 18:00:51,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:00:54,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:00:54,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 18:00:54,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:54,141 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 18:00:54,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:55,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 18:00:58,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:01:00,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1743826.6666666667, ans=0.2 2023-10-04 18:01:01,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:01:01,828 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:01:03,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 18:01:03,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 18:01:03,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 18:01:03,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1743826.6666666667, ans=0.125 2023-10-04 18:01:05,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:09,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 18:01:09,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:01:12,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-10-04 18:01:12,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:01:13,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 18:01:13,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:01:14,002 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:01:14,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:01:14,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:01:16,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 18:01:19,562 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:01:21,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:01:22,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:01:23,775 INFO [train.py:1046] (1/4) Epoch 50, batch 1300, loss[loss=0.1341, simple_loss=0.2115, pruned_loss=0.02828, over 23296.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2342, pruned_loss=0.0362, over 4681540.83 frames. 
], batch size: 105, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:01:25,149 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:01:27,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:01:28,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 18:01:32,935 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:34,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:01:34,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:01:35,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:01:37,116 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.152e+02 2.609e+02 3.024e+02 4.878e+02, threshold=5.218e+02, percent-clipped=1.0 2023-10-04 18:01:37,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:01:38,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. 
Duration: 0.81 2023-10-04 18:01:42,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1744026.6666666667, ans=0.015 2023-10-04 18:01:43,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:01:45,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1744026.6666666667, ans=0.125 2023-10-04 18:01:46,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:01:47,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 18:01:49,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1744026.6666666667, ans=0.125 2023-10-04 18:01:50,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:01:53,389 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:01:54,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:01:56,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:56,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:01:57,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 18:01:57,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:01:57,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 18:01:57,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1744093.3333333333, ans=0.125 2023-10-04 18:02:04,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:02:04,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:02:05,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 18:02:07,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:02:08,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:02:09,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:02:11,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 18:02:13,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:02:13,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-10-04 18:02:15,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:02:15,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1744160.0, ans=0.0 2023-10-04 18:02:19,176 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:02:19,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:02:23,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 18:02:24,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 18:02:24,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 18:02:27,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1744226.6666666667, ans=0.2 2023-10-04 18:02:29,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:02:29,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1744226.6666666667, ans=0.125 2023-10-04 18:02:32,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 18:02:32,145 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:02:38,144 INFO [train.py:1046] (1/4) Epoch 50, batch 1350, loss[loss=0.1383, simple_loss=0.2259, pruned_loss=0.02535, over 24597.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2334, pruned_loss=0.03603, over 4697159.36 frames. ], batch size: 60, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:02:39,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 18:02:43,082 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:02:44,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:02:47,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:02:47,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:02:47,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:02:48,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:02:50,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1744293.3333333333, ans=0.2 2023-10-04 18:02:50,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1744293.3333333333, ans=0.125 2023-10-04 18:02:52,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. 
Duration: 0.811125 2023-10-04 18:02:52,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1744360.0, ans=0.125 2023-10-04 18:02:53,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 18:02:55,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:02:55,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:02:59,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 18:03:01,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:03:01,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:03:03,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 18:03:04,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 18:03:05,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 18:03:07,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:07,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-10-04 18:03:17,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:26,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:28,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:28,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 18:03:29,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:31,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 18:03:31,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:03:32,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:03:35,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:03:37,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 18:03:38,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:03:44,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. 
Duration: 0.69 2023-10-04 18:03:47,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 18:03:51,535 INFO [train.py:1046] (1/4) Epoch 50, batch 1400, loss[loss=0.1513, simple_loss=0.2374, pruned_loss=0.03255, over 24667.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.232, pruned_loss=0.03567, over 4684526.80 frames. ], batch size: 73, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:03:52,892 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 18:03:53,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:54,489 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:03:56,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:03:59,953 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 18:04:00,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.68 vs. limit=15.0 2023-10-04 18:04:01,222 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 18:04:05,072 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.179e+02 2.424e+02 2.840e+02 4.477e+02, threshold=4.849e+02, percent-clipped=0.0 2023-10-04 18:04:09,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 18:04:11,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:04:14,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:04:14,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:04:16,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:04:18,147 WARNING [train.py:1204] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 18:04:29,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:29,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:29,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1744760.0, ans=0.125 2023-10-04 18:04:33,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1744760.0, ans=0.2 2023-10-04 18:04:34,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 18:04:34,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1744826.6666666667, ans=0.125 2023-10-04 18:04:35,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 18:04:35,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:04:37,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:04:37,336 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:04:37,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1744826.6666666667, ans=0.1 2023-10-04 18:04:38,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:04:38,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:04:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:04:40,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1744826.6666666667, ans=0.0 2023-10-04 18:04:43,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 18:04:43,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:04:46,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:04:50,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:04:55,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1744893.3333333333, ans=0.2 2023-10-04 18:04:56,516 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 18:04:57,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 18:04:57,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1744893.3333333333, ans=0.125 2023-10-04 18:05:00,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:05:02,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 18:05:02,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:05,570 INFO [train.py:1046] (1/4) Epoch 50, batch 1450, loss[loss=0.1558, simple_loss=0.2325, pruned_loss=0.03952, over 23745.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2316, pruned_loss=0.03529, over 4690904.36 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:05:05,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:05:07,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. 
Duration: 0.82225 2023-10-04 18:05:08,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:05:08,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:08,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 18:05:14,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:15,286 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:05:16,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:05:16,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 18:05:18,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:05:18,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 18:05:18,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:19,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:19,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. 
Duration: 0.83 2023-10-04 18:05:21,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:05:22,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:05:23,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 18:05:23,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:25,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:05:25,135 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:27,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:30,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:05:30,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:05:33,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:33,191 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:35,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 18:05:35,971 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:05:35,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:36,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:05:42,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 18:05:45,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:05:47,978 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 18:05:50,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:05:52,164 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:05:53,662 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:05:55,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 18:05:59,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:05:59,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1745160.0, ans=0.125 2023-10-04 18:06:00,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 18:06:01,368 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.64 vs. limit=12.0 2023-10-04 18:06:02,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 18:06:03,414 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:06,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:06:06,866 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:06:09,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 18:06:12,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 18:06:13,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 18:06:15,700 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:15,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-10-04 18:06:20,488 INFO [train.py:1046] (1/4) Epoch 50, batch 1500, loss[loss=0.1546, simple_loss=0.2269, pruned_loss=0.04112, over 23852.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2323, pruned_loss=0.03549, over 4691341.64 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:06:23,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 18:06:23,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:06:23,607 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:06:24,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:24,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:06:26,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:06:26,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 18:06:27,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:06:29,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:06:29,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:06:30,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:06:32,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1745293.3333333333, ans=0.0 2023-10-04 18:06:33,289 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.060e+02 2.266e+02 2.721e+02 4.158e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 18:06:33,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:06:34,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:06:37,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.63 vs. limit=15.0 2023-10-04 18:06:40,214 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:06:40,224 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 18:06:41,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:06:41,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:06:42,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:06:46,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 18:06:46,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1745360.0, ans=0.1 2023-10-04 18:06:49,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 18:06:50,972 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:52,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 18:06:54,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:06:55,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1745426.6666666667, ans=0.0 2023-10-04 18:06:56,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:06:57,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:57,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:06:59,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 18:07:00,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 18:07:00,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:07:00,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 18:07:01,879 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:07:02,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.73 vs. limit=15.0 2023-10-04 18:07:04,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:07:04,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 18:07:09,312 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:07:13,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:07:17,504 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 18:07:17,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:17,560 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 18:07:18,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:07:20,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:07:20,412 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 18:07:21,725 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:07:24,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 18:07:25,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:27,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:07:27,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:28,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:07:28,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:28,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:07:30,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 18:07:31,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. 
Duration: 0.97 2023-10-04 18:07:31,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:07:32,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 18:07:32,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 18:07:34,227 INFO [train.py:1046] (1/4) Epoch 50, batch 1550, loss[loss=0.1761, simple_loss=0.2511, pruned_loss=0.05057, over 19169.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2335, pruned_loss=0.03611, over 4689479.44 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:07:34,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1745626.6666666667, ans=0.1 2023-10-04 18:07:35,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:07:37,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:37,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:07:38,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:07:38,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:07:39,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1745626.6666666667, ans=0.125 2023-10-04 18:07:40,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:45,783 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 18:07:45,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:46,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.04 vs. limit=10.0 2023-10-04 18:07:47,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:07:47,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:07:48,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:07:48,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 18:07:51,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:07:51,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 18:07:52,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. 
Duration: 0.88 2023-10-04 18:07:52,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 18:07:52,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:54,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:07:55,933 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:07:57,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:07:59,752 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 18:07:59,754 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 18:08:05,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.73 vs. limit=15.0 2023-10-04 18:08:08,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:08:12,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:08:13,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:08:13,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. 
Duration: 0.8 2023-10-04 18:08:13,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 18:08:19,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:08:21,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:23,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.78 vs. limit=15.0 2023-10-04 18:08:23,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:08:24,293 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-10-04 18:08:25,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:08:26,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:08:26,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 18:08:26,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:08:26,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 18:08:27,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:28,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 18:08:29,320 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 18:08:32,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:08:36,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 18:08:42,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:08:44,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.57 vs. limit=15.0 2023-10-04 18:08:45,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:45,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 18:08:47,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1745960.0, ans=0.125 2023-10-04 18:08:49,081 INFO [train.py:1046] (1/4) Epoch 50, batch 1600, loss[loss=0.1495, simple_loss=0.2344, pruned_loss=0.03234, over 24458.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2343, pruned_loss=0.03629, over 4688250.20 frames. 
], batch size: 63, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:08:49,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:08:49,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:08:49,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:08:51,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:08:52,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:08:54,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:08:55,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 18:08:56,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 18:08:59,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 18:09:00,926 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:09:02,118 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.102e+02 2.370e+02 2.617e+02 3.812e+02, threshold=4.739e+02, percent-clipped=0.0 2023-10-04 18:09:02,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 18:09:03,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:09:06,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:09:12,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:09:14,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 18:09:16,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:09:18,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 18:09:19,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:20,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 18:09:22,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1746093.3333333333, ans=0.0 2023-10-04 18:09:26,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-10-04 18:09:28,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1746093.3333333333, ans=0.125 2023-10-04 18:09:32,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:09:32,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 18:09:33,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:09:33,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:09:33,442 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:09:34,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 18:09:38,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:09:40,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:09:40,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:41,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:43,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. 
Duration: 0.888875 2023-10-04 18:09:44,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:09:46,310 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:09:46,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:09:52,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:52,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:09:54,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 18:09:54,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:09:56,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 18:10:02,473 INFO [train.py:1046] (1/4) Epoch 50, batch 1650, loss[loss=0.1471, simple_loss=0.2347, pruned_loss=0.02972, over 24307.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2353, pruned_loss=0.03672, over 4682509.54 frames. ], batch size: 61, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:10:02,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:03,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. 
Duration: 0.7818125 2023-10-04 18:10:05,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:10:05,352 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 18:10:05,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 18:10:05,381 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 18:10:06,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 18:10:09,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:10:09,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:10:09,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:10:10,929 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:10:12,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:13,741 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 18:10:15,641 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:10:15,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:10:15,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:10:15,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:10:17,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 18:10:18,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 18:10:18,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1746360.0, ans=0.1 2023-10-04 18:10:19,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-10-04 18:10:23,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:10:24,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:10:32,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 18:10:32,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:35,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-10-04 18:10:37,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:10:39,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:10:39,942 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:10:41,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:10:41,348 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:10:42,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:44,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:45,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:45,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:10:47,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:10:49,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:10:51,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 18:10:54,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:10:55,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 18:10:58,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:10:58,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 18:11:00,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 18:11:00,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 18:11:01,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:01,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:11:03,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:11:03,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:11:03,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 18:11:06,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:11:07,349 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:11:08,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.33 vs. limit=12.0 2023-10-04 18:11:08,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:11:10,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 18:11:15,456 INFO [train.py:1046] (1/4) Epoch 50, batch 1700, loss[loss=0.1268, simple_loss=0.192, pruned_loss=0.03083, over 23391.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2329, pruned_loss=0.03598, over 4681500.63 frames. ], batch size: 285, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:11:15,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:11:15,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:11:15,523 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 18:11:16,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:11:16,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:11:16,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:11:18,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:11:18,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:11:18,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 18:11:22,702 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:11:26,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1746626.6666666667, ans=0.2 2023-10-04 18:11:31,435 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.062e+02 2.401e+02 2.672e+02 3.684e+02, threshold=4.801e+02, percent-clipped=0.0 2023-10-04 18:11:32,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:11:34,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:11:38,768 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:11:40,040 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:11:40,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. 
Duration: 0.8090625 2023-10-04 18:11:40,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:11:42,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 18:11:44,172 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:11:44,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:45,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:11:47,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:11:50,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 18:11:52,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 18:11:53,878 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:55,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 18:11:57,232 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:12:04,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:12:05,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:05,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:12:07,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:12:08,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 18:12:08,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:12:11,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:11,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 18:12:11,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:12:11,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:12,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:12,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:13,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 18:12:13,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:12:15,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:12:16,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:19,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:12:21,345 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 18:12:24,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:24,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:12:29,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 18:12:30,770 INFO [train.py:1046] (1/4) Epoch 50, batch 1750, loss[loss=0.1561, simple_loss=0.2455, pruned_loss=0.03336, over 24083.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2319, pruned_loss=0.03566, over 4687784.36 frames. ], batch size: 80, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:12:33,619 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:12:35,141 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:35,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:12:36,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 18:12:36,510 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:39,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:12:39,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:43,400 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 18:12:44,871 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:47,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 18:12:47,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:49,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:12:51,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. 
Duration: 0.4818125 2023-10-04 18:12:53,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 18:12:56,209 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:12:56,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1747026.6666666667, ans=0.0 2023-10-04 18:12:57,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 18:12:58,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.10 vs. limit=15.0 2023-10-04 18:13:05,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:13:06,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1747093.3333333333, ans=0.1 2023-10-04 18:13:07,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:07,512 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:13:10,313 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:10,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:13:12,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:13:14,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:15,881 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:13:16,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1747160.0, ans=0.0 2023-10-04 18:13:17,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:13:18,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1747160.0, ans=0.0 2023-10-04 18:13:19,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 18:13:19,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:13:20,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1747160.0, ans=0.0 2023-10-04 18:13:22,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 18:13:23,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:13:25,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:13:26,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:13:30,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:13:31,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 18:13:31,557 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:33,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:13:37,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:13:40,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:13:41,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:13:42,918 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 18:13:42,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:44,239 INFO [train.py:1046] (1/4) Epoch 50, batch 1800, loss[loss=0.1534, simple_loss=0.2371, pruned_loss=0.03491, over 23438.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2318, pruned_loss=0.03543, over 4690213.98 frames. 
], batch size: 93, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:13:44,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:13:44,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:13:44,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:13:44,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:13:45,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:13:48,918 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:13:49,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:50,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:13:54,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:56,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:13:57,877 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:13:59,697 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.181e+02 2.519e+02 2.906e+02 5.254e+02, threshold=5.039e+02, percent-clipped=1.0 2023-10-04 18:14:01,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:04,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:04,595 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:05,973 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:14:07,320 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:14:08,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 18:14:08,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:12,938 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:17,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 18:14:17,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.44 vs. limit=15.0 2023-10-04 18:14:18,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-10-04 18:14:18,528 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 18:14:18,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:19,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:19,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:14:21,285 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:14:24,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1747426.6666666667, ans=0.1 2023-10-04 18:14:29,811 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 18:14:31,164 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:14:32,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:32,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 18:14:32,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 18:14:34,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 18:14:35,852 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:14:35,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:14:40,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 18:14:44,272 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:14:45,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 18:14:45,614 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:14:45,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:47,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:14:47,037 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 18:14:47,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-10-04 18:14:48,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:14:48,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:14:53,688 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 18:14:53,689 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:53,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1747560.0, ans=0.035 2023-10-04 18:14:54,944 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:14:54,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:14:54,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:58,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:58,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:14:59,463 INFO [train.py:1046] (1/4) Epoch 50, batch 1850, loss[loss=0.153, simple_loss=0.2449, pruned_loss=0.0305, over 24588.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2325, pruned_loss=0.03583, over 4686018.20 frames. ], batch size: 71, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:14:59,569 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:14:59,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:15:02,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:15:02,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:15:09,779 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:15:09,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 18:15:11,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1747626.6666666667, ans=0.125 2023-10-04 18:15:12,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 18:15:15,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 18:15:18,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:15:19,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 18:15:19,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 18:15:29,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:15:29,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. 
Duration: 0.98 2023-10-04 18:15:32,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:15:32,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:15:35,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 18:15:35,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:15:35,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:15:38,257 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:15:39,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:15:42,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:15:45,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1747826.6666666667, ans=0.2 2023-10-04 18:15:46,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:15:46,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:15:46,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1747826.6666666667, ans=0.1 2023-10-04 18:15:47,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:15:47,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:15:49,765 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:15:50,001 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:15:51,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:15:54,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 18:15:54,541 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:15:54,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1747826.6666666667, ans=0.0 2023-10-04 18:15:59,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:15:59,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:15:59,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. 
Duration: 0.8 2023-10-04 18:15:59,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 18:16:00,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1747893.3333333333, ans=0.1 2023-10-04 18:16:01,764 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 18:16:03,178 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 18:16:04,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:16:04,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:16:04,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:16:06,371 WARNING [train.py:1204] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:06,445 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 18:16:07,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:16:07,805 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:09,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 18:16:09,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:16:09,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:16:09,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 18:16:12,146 WARNING [train.py:1204] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:12,163 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 18:16:13,472 INFO [train.py:1046] (1/4) Epoch 50, batch 1900, loss[loss=0.1611, simple_loss=0.237, pruned_loss=0.04261, over 23703.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03632, over 4684991.32 frames. ], batch size: 232, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:16:13,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:16:13,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:16:17,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:16:20,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:16:21,839 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-10-04 18:16:22,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 18:16:23,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:16:23,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:16:23,879 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 18:16:25,166 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 18:16:27,857 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.104e+02 2.393e+02 2.783e+02 5.984e+02, threshold=4.786e+02, percent-clipped=4.0 2023-10-04 18:16:29,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 18:16:32,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:16:35,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 18:16:37,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 18:16:41,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.26 vs. limit=15.0 2023-10-04 18:16:47,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. 
Duration: 0.78 2023-10-04 18:16:49,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 18:16:49,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:50,020 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 18:16:50,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 18:16:50,050 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 18:16:51,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 18:16:51,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:16:56,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 18:16:56,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1748160.0, ans=0.0 2023-10-04 18:16:59,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:17:00,885 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:17:00,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. 
Duration: 0.92 2023-10-04 18:17:03,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:17:08,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 18:17:10,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:17:13,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1748226.6666666667, ans=0.0 2023-10-04 18:17:14,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:17:14,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:17:16,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:17:16,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:17:18,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:17:18,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:17:20,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:17:23,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:17:23,130 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:17:23,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1748226.6666666667, ans=0.125 2023-10-04 18:17:26,377 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:17:26,380 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:17:27,667 INFO [train.py:1046] (1/4) Epoch 50, batch 1950, loss[loss=0.1472, simple_loss=0.2363, pruned_loss=0.02908, over 24665.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.234, pruned_loss=0.0363, over 4697537.50 frames. ], batch size: 68, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:17:27,745 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:17:29,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:17:32,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:17:34,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1748293.3333333333, ans=0.1 2023-10-04 18:17:34,125 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:17:35,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 18:17:35,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:35,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:17:38,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 18:17:39,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 18:17:39,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:40,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:43,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:17:43,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:17:43,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:44,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:17:47,177 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:17:47,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 18:17:47,217 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:17:48,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:48,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1748360.0, ans=0.125 2023-10-04 18:17:51,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:53,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:17:53,981 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:17:53,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:17:53,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 18:17:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:17:55,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:17:55,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:00,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:18:01,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:18:07,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:18:09,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:18:11,036 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:18:11,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 18:18:12,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:18:14,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1748493.3333333333, ans=10.0 2023-10-04 18:18:15,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:18:17,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:18:17,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:18:24,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:18:26,293 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:27,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:30,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:33,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:18:33,760 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:35,060 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 18:18:35,065 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:18:35,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:18:36,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 18:18:36,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1748560.0, ans=0.125 2023-10-04 18:18:37,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:18:41,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 18:18:43,193 INFO [train.py:1046] (1/4) Epoch 50, batch 2000, loss[loss=0.1343, simple_loss=0.205, pruned_loss=0.03175, over 23766.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2344, pruned_loss=0.03623, over 4689761.78 frames. ], batch size: 232, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:18:43,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:18:44,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:18:45,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:18:47,281 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:50,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 18:18:51,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:18:54,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:18:55,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 18:18:57,499 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.194e+02 2.589e+02 3.041e+02 4.664e+02, threshold=5.178e+02, percent-clipped=0.0 2023-10-04 18:18:58,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 18:18:58,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:19:01,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:19:03,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 18:19:06,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:06,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:06,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1748693.3333333333, ans=0.1 2023-10-04 18:19:07,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:07,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 18:19:07,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:19:10,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 18:19:10,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 18:19:12,462 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.37 vs. limit=15.0 2023-10-04 18:19:13,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:19:13,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:19:13,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:14,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:19:14,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:19:16,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 18:19:19,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 18:19:19,651 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:19:20,338 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.04 vs. limit=22.5 2023-10-04 18:19:20,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:19:23,242 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.75 vs. limit=22.5 2023-10-04 18:19:26,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:26,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:19:26,680 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:19:28,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:19:31,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:19:31,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:31,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:19:31,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:34,071 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:37,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 18:19:37,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 18:19:41,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:19:43,077 WARNING [train.py:1204] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:43,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1748893.3333333333, ans=0.1 2023-10-04 18:19:47,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:47,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:19:50,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:52,326 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:19:52,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:52,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:19:52,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 18:19:55,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:57,075 INFO [train.py:1046] (1/4) Epoch 50, batch 2050, loss[loss=0.1521, simple_loss=0.2054, pruned_loss=0.04941, over 19810.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03607, over 4698612.52 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:19:57,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:01,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:20:01,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:05,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:20:07,619 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:20:08,836 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:08,917 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:20:09,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1748960.0, ans=0.125 2023-10-04 18:20:11,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-10-04 18:20:11,476 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:20:11,588 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:20:11,623 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:20:22,162 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:20:22,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:20:24,096 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 18:20:24,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1749026.6666666667, ans=0.125 2023-10-04 18:20:24,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1749026.6666666667, ans=0.0 2023-10-04 18:20:25,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:20:26,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 18:20:28,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:20:31,132 WARNING [train.py:1204] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:20:32,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:20:32,657 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:20:34,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:20:34,128 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:20:37,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:20:37,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:20:39,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:20:41,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:20:44,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:20:44,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:20:45,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.65 vs. 
limit=22.5 2023-10-04 18:20:48,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:20:53,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:20:55,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 18:21:00,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:21:02,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:21:03,750 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:21:07,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 18:21:10,000 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 18:21:10,001 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:10,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:21:11,261 INFO [train.py:1046] (1/4) Epoch 50, batch 2100, loss[loss=0.1223, simple_loss=0.1807, pruned_loss=0.03192, over 19175.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2316, pruned_loss=0.03556, over 4686894.97 frames. 
], batch size: 388, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:21:11,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:21:11,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:21:12,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 18:21:12,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 18:21:14,198 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:21:16,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:21:18,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:21:22,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:22,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:21:23,629 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 18:21:25,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:21:26,315 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-10-04 18:21:26,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 18:21:27,633 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.172e+02 2.598e+02 3.189e+02 5.506e+02, threshold=5.196e+02, percent-clipped=2.0 2023-10-04 18:21:27,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:21:27,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:21:27,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 18:21:27,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 18:21:35,107 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 18:21:35,108 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:21:36,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:21:36,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:21:36,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1749360.0, ans=0.125 2023-10-04 18:21:41,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 18:21:41,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 18:21:42,569 WARNING [train.py:1204] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:42,572 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 18:21:44,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 18:21:44,085 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:44,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 18:21:45,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 18:21:45,436 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 18:21:48,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:21:49,480 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:21:51,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1749426.6666666667, ans=0.125 2023-10-04 18:21:52,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 18:21:55,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:21:56,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:21:59,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:59,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 18:21:59,513 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:59,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:59,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:00,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 18:22:01,025 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 18:22:02,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 18:22:06,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:22:09,232 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 18:22:11,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 18:22:13,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:16,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:22:16,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:22:16,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:22:16,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 18:22:17,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:22:20,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:20,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:22:22,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:22:22,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:23,527 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-10-04 18:22:25,411 INFO [train.py:1046] (1/4) Epoch 50, batch 2150, loss[loss=0.1728, simple_loss=0.2432, pruned_loss=0.05124, over 23804.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2312, pruned_loss=0.03534, over 4695053.58 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:22:25,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 18:22:25,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:25,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1749626.6666666667, ans=0.04949747468305833 2023-10-04 18:22:28,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:22:28,324 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:22:28,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:22:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:22:30,318 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.52 vs. limit=15.0 2023-10-04 18:22:33,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 18:22:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:22:37,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:37,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1749626.6666666667, ans=0.125 2023-10-04 18:22:38,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:22:38,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:39,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:22:41,911 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:43,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:22:43,266 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:22:47,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:48,866 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 18:22:51,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:22:51,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:22:53,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:53,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:22:53,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1749760.0, ans=0.0 2023-10-04 18:22:55,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:55,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:22:55,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:55,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:22:57,075 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:58,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 18:23:01,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:23:01,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 18:23:01,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:03,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:23:04,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:23:07,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:07,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:23:08,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:08,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 18:23:09,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:23:10,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1749826.6666666667, ans=0.1 2023-10-04 18:23:12,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:23:13,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:23:14,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:23:16,024 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:23:17,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:18,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:18,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 18:23:20,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 18:23:20,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.84 vs. limit=10.0 2023-10-04 18:23:21,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:23:21,409 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 18:23:21,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:21,493 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:23:22,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. 
Duration: 0.64 2023-10-04 18:23:22,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:23:22,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 18:23:22,871 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 18:23:22,871 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 18:23:24,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 18:23:26,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:26,214 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:23:26,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:23:27,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:27,651 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:23:29,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:23:38,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:23:39,926 INFO [train.py:1046] (1/4) Epoch 50, batch 2200, loss[loss=0.1399, simple_loss=0.2289, pruned_loss=0.02543, over 24286.00 frames. ], tot_loss[loss=0.1508, simple_loss=0.2313, pruned_loss=0.03516, over 4698226.70 frames. ], batch size: 61, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:23:40,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 18:23:43,367 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:23:47,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:47,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:23:48,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:50,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:23:52,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:54,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:54,230 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. 
Duration: 0.88 2023-10-04 18:23:55,572 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.726e+02 2.058e+02 2.327e+02 2.714e+02 4.351e+02, threshold=4.654e+02, percent-clipped=0.0 2023-10-04 18:23:58,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 18:24:01,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:24:06,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 18:24:08,757 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.22 vs. limit=22.5 2023-10-04 18:24:09,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:09,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:24:11,104 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:24:14,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:24:14,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 18:24:17,245 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:24:18,579 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:24:18,640 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 18:24:22,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:24:23,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:24:25,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:24:26,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:26,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1750160.0, ans=0.125 2023-10-04 18:24:29,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 18:24:31,242 WARNING [train.py:1204] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:32,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 18:24:34,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:34,643 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:24:36,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 18:24:39,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:24:39,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:24:39,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:39,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:40,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:24:40,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:24:40,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1750226.6666666667, ans=0.07 2023-10-04 18:24:43,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:24:44,144 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:24:46,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:24:47,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:24:49,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. 
Duration: 0.5 2023-10-04 18:24:50,056 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.34 vs. limit=6.0 2023-10-04 18:24:50,798 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 18:24:53,515 INFO [train.py:1046] (1/4) Epoch 50, batch 2250, loss[loss=0.1464, simple_loss=0.2268, pruned_loss=0.03297, over 24458.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.232, pruned_loss=0.03552, over 4701635.55 frames. ], batch size: 58, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:24:53,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:24:53,637 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 18:24:55,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:24:56,275 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 18:24:56,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:57,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:25:00,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:25:00,512 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 18:25:03,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:25:07,071 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:25:11,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:25:11,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:25:14,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:14,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:25:14,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:25:17,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 18:25:17,946 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:25:17,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:25:19,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1750360.0, ans=0.0 2023-10-04 18:25:20,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 18:25:21,793 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:25:21,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:23,175 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:25:28,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:25:30,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:25:30,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:25:31,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 18:25:34,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:34,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:25:38,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:25:40,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:25:41,088 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:25:41,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1750493.3333333333, ans=0.0 2023-10-04 18:25:42,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:25:45,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:25:45,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:25:49,951 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:25:52,831 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:25:58,374 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:25:58,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:25:59,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:26:05,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:26:06,943 INFO [train.py:1046] (1/4) Epoch 50, batch 2300, loss[loss=0.1433, simple_loss=0.2316, pruned_loss=0.02748, over 24464.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2326, pruned_loss=0.03553, over 4699818.14 frames. 
], batch size: 66, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:26:07,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:26:07,023 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 18:26:07,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:07,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:26:07,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1750626.6666666667, ans=0.125 2023-10-04 18:26:09,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 18:26:09,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1750626.6666666667, ans=0.125 2023-10-04 18:26:11,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:26:13,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:13,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1750626.6666666667, ans=0.2 2023-10-04 18:26:18,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:26:18,961 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.98 vs. limit=12.0 2023-10-04 18:26:19,346 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:26:22,135 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 18:26:23,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:24,817 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.106e+02 2.353e+02 2.933e+02 4.045e+02, threshold=4.706e+02, percent-clipped=0.0 2023-10-04 18:26:29,168 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:26:30,471 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:26:30,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:26:30,540 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:30,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 18:26:30,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:26:33,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 18:26:33,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:26:39,093 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:26:41,819 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:26:44,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:26:48,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:26:49,351 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:52,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:26:54,873 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:57,791 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:26:57,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:26:59,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:26:59,236 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. 
Duration: 0.47 2023-10-04 18:27:02,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:27:02,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:03,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:03,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:27:03,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:27:04,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 18:27:04,752 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:27:04,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 18:27:04,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:27:04,820 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:06,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-10-04 18:27:08,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1750893.3333333333, ans=0.0 2023-10-04 18:27:11,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:27:14,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:27:17,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:27:17,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:27:17,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:27:20,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:27:20,452 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:27:21,684 INFO [train.py:1046] (1/4) Epoch 50, batch 2350, loss[loss=0.1511, simple_loss=0.2317, pruned_loss=0.03527, over 23803.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2333, pruned_loss=0.03584, over 4700929.52 frames. ], batch size: 212, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:27:21,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:27:21,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-10-04 18:27:27,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.24 vs. limit=12.0 2023-10-04 18:27:28,561 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:27:28,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 18:27:32,908 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 18:27:35,278 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:27:36,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:37,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:37,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:37,743 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:27:39,660 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:27:41,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-10-04 18:27:42,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1751026.6666666667, ans=0.125 2023-10-04 18:27:43,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:27:49,902 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 18:27:52,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:27:55,454 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:27:55,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:27:56,942 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:27:58,461 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 18:27:59,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:28:00,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:28:00,027 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:28:01,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:28:04,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:28:06,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 18:28:06,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:28:10,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:28:10,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:28:12,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 18:28:13,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:28:15,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 18:28:16,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:28:18,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 18:28:22,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 18:28:24,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:28:24,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 18:28:24,116 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 18:28:24,140 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 18:28:26,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 18:28:31,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:28:35,676 INFO [train.py:1046] (1/4) Epoch 50, batch 2400, loss[loss=0.1603, simple_loss=0.2526, pruned_loss=0.03402, over 24528.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2336, pruned_loss=0.03587, over 4709166.52 frames. ], batch size: 71, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:28:35,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:28:41,724 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:28:42,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1751293.3333333333, ans=0.2 2023-10-04 18:28:43,481 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:28:43,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-10-04 18:28:44,911 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 18:28:49,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1751360.0, ans=0.125 2023-10-04 18:28:50,786 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:28:50,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:28:54,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 18:28:54,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:28:55,233 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.154e+02 2.535e+02 3.094e+02 5.336e+02, threshold=5.070e+02, percent-clipped=5.0 2023-10-04 18:28:55,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:28:55,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 18:29:01,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:02,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 18:29:05,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. 
Duration: 0.62225 2023-10-04 18:29:05,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1751426.6666666667, ans=0.125 2023-10-04 18:29:09,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 18:29:11,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1751426.6666666667, ans=0.1 2023-10-04 18:29:13,083 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:29:14,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:15,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1751426.6666666667, ans=0.2 2023-10-04 18:29:16,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1751426.6666666667, ans=0.2 2023-10-04 18:29:17,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:29:17,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 18:29:17,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 18:29:25,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1751493.3333333333, ans=0.125 2023-10-04 18:29:26,664 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:29,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:29:30,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:29:32,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:29:32,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:29:32,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:29:32,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:33,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:29:33,625 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:29:36,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:29:38,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:29:39,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 18:29:40,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 18:29:41,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:29:41,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:42,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 18:29:42,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 18:29:42,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 18:29:42,893 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 18:29:44,566 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 18:29:45,943 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:29:46,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 18:29:46,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:29:48,700 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 18:29:49,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.56 vs. limit=15.0 2023-10-04 18:29:49,963 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.06 vs. limit=15.0 2023-10-04 18:29:50,522 INFO [train.py:1046] (1/4) Epoch 50, batch 2450, loss[loss=0.1548, simple_loss=0.2401, pruned_loss=0.03471, over 24349.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2317, pruned_loss=0.03563, over 4708925.33 frames. ], batch size: 77, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:29:50,584 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:50,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:29:53,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:29:53,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:29:56,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:29:56,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:29:57,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 18:29:59,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1751626.6666666667, ans=0.125 2023-10-04 18:30:04,518 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:30:04,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:08,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:30:09,646 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:30:09,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:30:09,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 18:30:13,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:14,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1751693.3333333333, ans=0.0 2023-10-04 18:30:16,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:30:18,003 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:30:22,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1751760.0, ans=0.0 2023-10-04 18:30:23,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:30:24,475 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:24,580 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:24,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:30:26,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 18:30:27,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:30:28,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1751760.0, ans=0.1 2023-10-04 18:30:34,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:30:35,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:35,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:30:35,677 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. 
Duration: 0.9666875 2023-10-04 18:30:35,724 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:30:37,102 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:30:37,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 18:30:42,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:42,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:30:44,842 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:30:44,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:30:48,133 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.73 vs. limit=15.0 2023-10-04 18:30:49,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:30:49,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 18:30:50,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 18:30:52,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:30:52,720 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 18:30:52,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:30:54,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:30:58,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:30:59,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:31:01,131 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:31:01,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.58 vs. limit=15.0 2023-10-04 18:31:03,747 INFO [train.py:1046] (1/4) Epoch 50, batch 2500, loss[loss=0.1422, simple_loss=0.2133, pruned_loss=0.03558, over 23654.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2311, pruned_loss=0.03541, over 4705892.24 frames. ], batch size: 232, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:31:03,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 18:31:05,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. 
Duration: 0.77275 2023-10-04 18:31:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:31:18,336 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.49 vs. limit=15.0 2023-10-04 18:31:20,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:31:20,777 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:31:21,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1752026.6666666667, ans=0.0 2023-10-04 18:31:22,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:31:22,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 18:31:23,686 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.188e+02 2.534e+02 3.108e+02 6.481e+02, threshold=5.068e+02, percent-clipped=2.0 2023-10-04 18:31:28,048 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:31:29,437 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:31:29,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. 
Duration: 0.57275 2023-10-04 18:31:29,539 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:31:30,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 18:31:32,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:33,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:31:33,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 18:31:33,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:34,973 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 18:31:35,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:36,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1752093.3333333333, ans=0.125 2023-10-04 18:31:37,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1752093.3333333333, ans=0.125 2023-10-04 18:31:40,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:31:41,080 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:31:44,328 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:31:45,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 18:31:45,687 WARNING [train.py:1204] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:31:45,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1752093.3333333333, ans=0.125 2023-10-04 18:31:47,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:51,171 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:53,771 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:55,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1752160.0, ans=0.125 2023-10-04 18:31:57,636 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:32:01,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:32:04,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 18:32:04,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:32:04,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:32:06,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:32:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:32:08,718 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 18:32:08,719 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 18:32:08,734 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 18:32:11,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.53 vs. limit=15.0 2023-10-04 18:32:12,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:32:13,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 18:32:13,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 18:32:15,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:32:15,492 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. 
Duration: 0.76 2023-10-04 18:32:17,975 INFO [train.py:1046] (1/4) Epoch 50, batch 2550, loss[loss=0.1588, simple_loss=0.2328, pruned_loss=0.0424, over 22953.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2317, pruned_loss=0.03547, over 4710591.34 frames. ], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:32:18,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1752293.3333333333, ans=0.0 2023-10-04 18:32:19,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 18:32:22,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:32:24,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:32:25,387 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:32:27,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:32:27,362 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 18:32:27,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:32:30,274 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 18:32:32,923 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 18:32:34,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:36,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:32:36,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 18:32:36,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:32:36,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:32:37,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:32:39,753 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:32:39,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 18:32:39,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1752360.0, ans=0.0 2023-10-04 18:32:41,612 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:32:41,618 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:41,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-10-04 18:32:43,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1752360.0, ans=0.0 2023-10-04 18:32:53,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:32:56,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1752426.6666666667, ans=0.125 2023-10-04 18:32:59,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:32:59,432 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:59,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:33:00,808 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:33:09,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:33:12,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:33:12,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:33:12,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-10-04 18:33:13,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:33:13,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:33:16,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:33:16,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:33:21,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:33:21,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 18:33:21,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:33:21,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:33:22,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:33:23,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:33:23,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:33:30,369 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:33:31,752 INFO [train.py:1046] (1/4) Epoch 50, batch 2600, loss[loss=0.1375, simple_loss=0.2169, pruned_loss=0.02904, over 21504.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2326, pruned_loss=0.03553, over 4718344.17 frames. ], batch size: 47, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:33:32,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1752626.6666666667, ans=0.1 2023-10-04 18:33:32,070 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:33:33,154 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:33:34,565 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 18:33:37,355 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 18:33:37,376 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:33:37,403 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 18:33:38,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 18:33:38,770 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 18:33:43,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:33:43,243 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0011-515617 from training. 
Duration: 0.768 2023-10-04 18:33:43,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1752626.6666666667, ans=0.035 2023-10-04 18:33:45,192 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 18:33:45,283 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 18:33:46,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:33:48,387 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 18:33:49,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1752693.3333333333, ans=0.125 2023-10-04 18:33:51,031 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 2.019e+02 2.212e+02 2.492e+02 4.115e+02, threshold=4.424e+02, percent-clipped=0.0 2023-10-04 18:33:51,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 18:33:52,476 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:33:52,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 18:33:55,232 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 18:33:55,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. 
Duration: 0.55 2023-10-04 18:34:02,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1752760.0, ans=0.0 2023-10-04 18:34:04,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:04,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:04,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:34:04,204 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 18:34:04,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:34:11,176 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 18:34:14,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:14,811 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:16,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 18:34:16,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:34:16,807 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 18:34:18,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 18:34:18,439 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:34:20,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:34:20,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:34:23,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:34:27,661 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 18:34:27,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:34:27,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:34:31,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1752893.3333333333, ans=10.0 2023-10-04 18:34:33,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:34:33,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:34:33,895 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-10-04 18:34:35,175 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:36,632 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:34:38,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:34:43,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 18:34:44,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1752960.0, ans=0.0 2023-10-04 18:34:45,689 INFO [train.py:1046] (1/4) Epoch 50, batch 2650, loss[loss=0.1535, simple_loss=0.2249, pruned_loss=0.04102, over 23790.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2332, pruned_loss=0.03565, over 4719931.95 frames. ], batch size: 212, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:34:45,729 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:48,564 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:34:48,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1752960.0, ans=0.5 2023-10-04 18:34:52,668 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 18:34:52,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:34:54,033 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:34:54,105 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 18:34:55,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:34:56,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:59,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:35:00,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:35:02,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:35:04,303 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 18:35:04,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:35:04,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:35:07,120 WARNING [train.py:1204] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 18:35:08,501 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-10-04 18:35:09,904 WARNING [train.py:1204] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:12,721 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 18:35:12,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:12,782 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 18:35:15,028 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.80 vs. limit=15.0 2023-10-04 18:35:17,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:17,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:35:18,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:18,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:19,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1753093.3333333333, ans=0.125 2023-10-04 18:35:19,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1753093.3333333333, ans=0.0 2023-10-04 18:35:22,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. 
Duration: 0.44 2023-10-04 18:35:22,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 18:35:23,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:35:29,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 18:35:29,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:31,032 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:31,072 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:35:31,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:35:31,147 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:31,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1753160.0, ans=0.125 2023-10-04 18:35:32,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:35:34,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:35:35,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:35:35,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:35:37,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:35:39,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:39,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:35:39,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:43,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:35:43,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:35:43,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1753226.6666666667, ans=0.0 2023-10-04 18:35:47,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:47,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. 
Duration: 0.92225 2023-10-04 18:35:47,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1753226.6666666667, ans=0.125 2023-10-04 18:35:48,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:48,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 18:35:52,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:53,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:53,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:54,903 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:35:55,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.86 vs. limit=15.0 2023-10-04 18:35:56,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:35:56,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:35:58,863 INFO [train.py:1046] (1/4) Epoch 50, batch 2700, loss[loss=0.1458, simple_loss=0.2155, pruned_loss=0.03809, over 23511.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2348, pruned_loss=0.03622, over 4710614.53 frames. 
], batch size: 256, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:35:58,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:35:58,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 18:36:02,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:36:03,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 18:36:05,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:36:06,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:06,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:08,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:36:08,641 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:36:08,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:36:09,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. 
Duration: 0.6 2023-10-04 18:36:09,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 18:36:11,371 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:36:14,610 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:36:14,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:36:14,749 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:36:18,703 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.205e+02 2.516e+02 3.164e+02 5.488e+02, threshold=5.032e+02, percent-clipped=3.0 2023-10-04 18:36:18,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:36:20,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 18:36:20,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:36:26,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:36:26,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:36:30,715 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. 
Duration: 0.836375 2023-10-04 18:36:30,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:36:30,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:36:30,759 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:36:33,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:36:36,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:36:36,898 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:36:36,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:36:37,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1753426.6666666667, ans=0.125 2023-10-04 18:36:41,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:41,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 18:36:44,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1753493.3333333333, ans=0.1 2023-10-04 18:36:48,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:36:50,221 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:36:53,597 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:36:53,599 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:36:55,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1753493.3333333333, ans=0.2 2023-10-04 18:36:57,632 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:59,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:00,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:37:00,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:01,861 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:37:03,152 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:37:05,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:37:07,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:37:07,907 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:37:11,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 18:37:12,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:13,958 INFO [train.py:1046] (1/4) Epoch 50, batch 2750, loss[loss=0.1492, simple_loss=0.2401, pruned_loss=0.02916, over 24440.00 frames. ], tot_loss[loss=0.153, simple_loss=0.234, pruned_loss=0.03596, over 4719019.11 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:37:14,082 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:37:14,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 18:37:15,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 18:37:16,010 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 18:37:17,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1753626.6666666667, ans=0.125 2023-10-04 18:37:20,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:20,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:22,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:22,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:37:23,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:26,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:37:26,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:37:27,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:37:27,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:27,437 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-10-04 18:37:27,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:37:27,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:31,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 18:37:33,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:37:34,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:34,964 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:37:36,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:37:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:37,171 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.02 vs. limit=10.0 2023-10-04 18:37:38,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:37:38,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:37:39,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:42,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:37:42,427 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:37:44,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:37:44,220 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:45,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:37:48,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1753760.0, ans=0.5 2023-10-04 18:37:51,068 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:52,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:37:54,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:37:55,137 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.88 vs. 
limit=15.0 2023-10-04 18:37:58,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:58,277 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:37:58,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:38:02,789 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:38:04,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:38:04,115 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 18:38:08,805 WARNING [train.py:1204] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:10,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 18:38:13,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1753893.3333333333, ans=0.0 2023-10-04 18:38:16,227 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:38:18,863 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:38:18,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. 
Duration: 0.52 2023-10-04 18:38:18,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:38:21,783 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:38:23,133 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 18:38:23,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:38:26,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 18:38:26,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:26,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:38:27,753 INFO [train.py:1046] (1/4) Epoch 50, batch 2800, loss[loss=0.1505, simple_loss=0.2425, pruned_loss=0.02926, over 24290.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2327, pruned_loss=0.0355, over 4709671.17 frames. ], batch size: 74, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:38:27,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 18:38:29,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:38:29,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:38:30,592 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:38:30,631 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 18:38:30,631 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 18:38:33,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:35,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:38:35,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:38:35,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1753960.0, ans=0.0 2023-10-04 18:38:39,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:38:41,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1754026.6666666667, ans=0.125 2023-10-04 18:38:42,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 18:38:44,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 18:38:45,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-10-04 18:38:47,177 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.767e+02 2.165e+02 2.479e+02 3.124e+02 4.666e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-04 18:38:47,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:47,302 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:38:47,305 WARNING [train.py:1204] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:38:47,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1754026.6666666667, ans=0.125 2023-10-04 18:38:50,360 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:38:51,670 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:51,674 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:38:53,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:39:00,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:39:02,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:39:03,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:04,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:39:06,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:12,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:39:12,105 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 18:39:12,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:12,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:39:12,243 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:39:16,889 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:18,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:18,948 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.76 vs. 
limit=15.0 2023-10-04 18:39:20,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:39:23,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:39:23,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:23,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:39:23,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:39:25,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:39:26,480 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:39:26,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 18:39:26,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:39:28,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:39:28,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:39:29,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. 
Duration: 0.86 2023-10-04 18:39:30,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:31,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:39:31,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:39:31,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1754226.6666666667, ans=0.125 2023-10-04 18:39:32,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 18:39:40,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:39:40,027 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:39:40,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:39:41,404 INFO [train.py:1046] (1/4) Epoch 50, batch 2850, loss[loss=0.1654, simple_loss=0.2484, pruned_loss=0.04118, over 23809.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2319, pruned_loss=0.03562, over 4712665.59 frames. ], batch size: 85, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:39:42,850 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:39:47,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. 
Duration: 0.97775 2023-10-04 18:39:47,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:39:47,542 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:49,062 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:49,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:50,563 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:39:51,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 18:39:57,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 18:39:57,538 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:39:59,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 18:40:00,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:03,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 18:40:03,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-10-04 18:40:06,896 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:15,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1754426.6666666667, ans=0.125 2023-10-04 18:40:15,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1754426.6666666667, ans=0.125 2023-10-04 18:40:16,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1754426.6666666667, ans=0.0 2023-10-04 18:40:19,261 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:40:19,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:40:19,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:40:21,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:40:22,012 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:40:22,043 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:40:22,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 18:40:23,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 18:40:26,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:40:26,331 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:40:27,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:40:27,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:30,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:40:30,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:40:32,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:33,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:40:35,434 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:40:35,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:36,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:40:38,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:40:38,951 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:40:43,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:40:45,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 18:40:45,089 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 18:40:47,784 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:40:47,834 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:40:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 18:40:49,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:40:49,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:40:50,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:40:50,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 18:40:50,472 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 18:40:50,531 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 18:40:50,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:40:51,882 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:54,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:40:54,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:40:56,077 INFO [train.py:1046] (1/4) Epoch 50, batch 2900, loss[loss=0.1578, simple_loss=0.2463, pruned_loss=0.03465, over 23354.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2314, pruned_loss=0.03555, over 4705222.72 frames. ], batch size: 93, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:40:56,176 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:40:57,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 18:41:00,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:41:00,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. 
Duration: 0.63 2023-10-04 18:41:01,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 18:41:03,594 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:41:03,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:41:05,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:41:07,062 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:41:11,675 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:41:11,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:41:13,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:41:14,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 18:41:14,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:41:16,327 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.129e+02 2.392e+02 2.896e+02 5.102e+02, threshold=4.784e+02, percent-clipped=2.0 2023-10-04 18:41:16,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:41:19,200 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 18:41:20,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 18:41:23,269 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:41:23,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 18:41:23,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:41:26,052 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:41:26,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:41:28,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:41:30,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:32,872 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:41:34,694 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:41:36,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. 
Duration: 0.71 2023-10-04 18:41:36,158 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 18:41:36,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:41:36,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1754760.0, ans=0.125 2023-10-04 18:41:41,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:41:44,064 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 18:41:44,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1754826.6666666667, ans=0.125 2023-10-04 18:41:45,574 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:41:45,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1754826.6666666667, ans=0.0 2023-10-04 18:41:47,997 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.39 vs. limit=15.0 2023-10-04 18:41:51,467 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:59,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 18:41:59,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:41:59,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1754893.3333333333, ans=0.0 2023-10-04 18:42:00,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1754893.3333333333, ans=0.125 2023-10-04 18:42:01,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 18:42:03,950 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:03,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 18:42:05,321 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:42:05,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:42:08,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.87 vs. limit=15.0 2023-10-04 18:42:10,540 INFO [train.py:1046] (1/4) Epoch 50, batch 2950, loss[loss=0.1614, simple_loss=0.2341, pruned_loss=0.04432, over 23805.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2332, pruned_loss=0.03591, over 4715828.18 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:42:11,961 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 18:42:13,319 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 18:42:15,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:42:15,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:16,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:18,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:42:19,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 18:42:19,889 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 18:42:21,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:42:21,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:42:25,646 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:42:26,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:42:28,378 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:42:29,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:42:33,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:42:33,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:42:35,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:37,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:37,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:42:40,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 18:42:44,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 18:42:44,477 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 18:42:45,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:42:47,419 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 18:42:49,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. 
Duration: 0.93 2023-10-04 18:42:50,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:42:50,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:42:50,608 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 18:42:50,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:42:53,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 18:42:55,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:42:55,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:42:56,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:57,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1755160.0, ans=0.125 2023-10-04 18:42:58,128 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:42:58,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:42:58,188 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-10-04 18:42:58,223 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:59,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 18:43:03,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:43:05,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:43:06,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 18:43:06,601 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:43:09,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 18:43:12,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:43:13,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.63 vs. limit=15.0 2023-10-04 18:43:14,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:43:14,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:43:16,978 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:43:16,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:43:17,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:43:18,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:18,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:43:19,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:43:19,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:43:21,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:43:21,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1755226.6666666667, ans=0.07 2023-10-04 18:43:24,330 INFO [train.py:1046] (1/4) Epoch 50, batch 3000, loss[loss=0.1817, simple_loss=0.2537, pruned_loss=0.05486, over 22784.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.234, pruned_loss=0.03628, over 4709414.98 frames. 
], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:43:24,331 INFO [train.py:1069] (1/4) Computing validation loss 2023-10-04 18:43:31,993 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.8345, 2.4655, 3.5005, 3.4227], device='cuda:1') 2023-10-04 18:43:36,831 INFO [train.py:1078] (1/4) Epoch 50, validation: loss=0.3701, simple_loss=0.2758, pruned_loss=0.2322, over 1125622.00 frames. 2023-10-04 18:43:36,832 INFO [train.py:1079] (1/4) Maximum memory allocated so far is 21041MB 2023-10-04 18:43:36,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:36,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 18:43:37,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:39,884 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:43:41,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:43:45,966 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 18:43:46,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 18:43:48,730 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:43:48,792 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 18:43:50,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 18:43:50,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:43:57,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.159e+02 2.457e+02 2.749e+02 3.497e+02, threshold=4.915e+02, percent-clipped=0.0 2023-10-04 18:43:57,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:44:00,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1755360.0, ans=0.125 2023-10-04 18:44:06,247 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:44:06,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1755426.6666666667, ans=0.125 2023-10-04 18:44:12,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 18:44:12,705 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:44:16,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:44:16,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:44:16,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:44:18,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:44:18,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 18:44:20,218 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 18:44:21,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:44:22,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.33 vs. limit=22.5 2023-10-04 18:44:22,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:44:24,954 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:44:25,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:44:25,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:25,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:44:29,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. 
Duration: 0.6454375 2023-10-04 18:44:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:44:29,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:44:31,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:44:33,806 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 18:44:35,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:44:35,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:44:35,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:44:39,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:39,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:41,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 18:44:41,262 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-10-04 18:44:41,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1755560.0, ans=0.125 2023-10-04 18:44:42,509 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:44:42,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 18:44:42,590 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:44:45,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 18:44:47,458 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:44:48,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 18:44:50,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 18:44:51,417 INFO [train.py:1046] (1/4) Epoch 50, batch 3050, loss[loss=0.1485, simple_loss=0.2364, pruned_loss=0.03028, over 24629.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2344, pruned_loss=0.03613, over 4720893.07 frames. ], batch size: 68, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:44:51,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 18:44:51,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. 
Duration: 0.5444375 2023-10-04 18:44:51,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:44:52,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:52,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:44:54,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:44:54,139 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:44:59,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 18:44:59,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1755626.6666666667, ans=0.1 2023-10-04 18:45:00,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:45:03,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:04,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:45:06,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:09,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. 
Duration: 0.94 2023-10-04 18:45:15,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 18:45:15,543 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 18:45:15,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:18,290 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:45:21,123 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:21,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:22,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:23,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:45:25,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:45:26,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:26,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:26,473 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 18:45:28,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:31,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:34,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:34,307 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 18:45:34,357 WARNING [train.py:1204] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:34,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:45:37,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:45:38,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:45:38,577 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:45:38,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1755826.6666666667, ans=0.125 2023-10-04 18:45:39,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 18:45:43,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.97 vs. limit=10.0 2023-10-04 18:45:44,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:44,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:45:51,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:51,919 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:45:51,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:54,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:45:56,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:45:56,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:45:56,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 18:45:57,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 18:45:57,914 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:59,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 18:46:01,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:05,209 INFO [train.py:1046] (1/4) Epoch 50, batch 3100, loss[loss=0.1595, simple_loss=0.2502, pruned_loss=0.03442, over 24450.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2343, pruned_loss=0.03612, over 4716752.82 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:46:05,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:06,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:46:09,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:46:12,224 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 18:46:14,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 18:46:15,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 18:46:16,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 18:46:20,011 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:46:20,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:22,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:46:25,458 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.135e+02 2.495e+02 2.950e+02 6.010e+02, threshold=4.989e+02, percent-clipped=3.0 2023-10-04 18:46:25,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:25,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1756026.6666666667, ans=0.1 2023-10-04 18:46:32,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 18:46:35,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 18:46:35,103 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:36,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:46:36,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:46:37,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 18:46:39,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:46:39,206 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 18:46:39,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:46:40,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:42,570 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 18:46:43,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:46:44,595 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.94 vs. limit=22.5 2023-10-04 18:46:45,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=1756093.3333333333, ans=0.025 2023-10-04 18:46:46,745 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:46:48,695 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 18:46:48,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-10-04 18:46:49,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:49,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:54,106 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:46:54,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:54,144 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:46:55,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:46:55,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:57,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:46:57,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:46:57,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:57,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 18:47:02,156 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:47:03,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 18:47:04,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:47:06,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 18:47:07,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:07,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:47:07,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 18:47:18,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 18:47:19,634 INFO [train.py:1046] (1/4) Epoch 50, batch 3150, loss[loss=0.1357, simple_loss=0.1931, pruned_loss=0.03918, over 19259.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2329, pruned_loss=0.03577, over 4721685.86 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:47:21,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:21,180 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:47:21,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1756293.3333333333, ans=0.125 2023-10-04 18:47:22,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:47:22,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:47:23,933 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 18:47:25,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:25,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:47:28,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 18:47:29,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:31,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1756293.3333333333, ans=0.125 2023-10-04 18:47:33,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1756360.0, ans=0.1 2023-10-04 18:47:34,479 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 18:47:34,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. 
Duration: 0.53 2023-10-04 18:47:36,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:47:36,185 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 18:47:37,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 18:47:40,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 18:47:40,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 18:47:40,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 18:47:40,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:40,301 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:47:41,697 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:43,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 18:47:44,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:47:44,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1756360.0, ans=0.07 2023-10-04 18:47:46,140 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:46,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:47:48,291 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:47:52,264 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 18:47:52,333 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:47:53,833 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:47:53,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:47:55,213 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 18:47:56,753 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 18:47:58,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:47:58,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. 
Duration: 0.6555625 2023-10-04 18:47:58,758 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 18:48:01,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:48:01,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:48:01,647 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:48:01,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:48:03,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 18:48:03,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:48:03,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:04,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:48:04,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:48:06,267 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 18:48:06,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:48:07,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 18:48:07,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:09,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 18:48:10,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 18:48:11,827 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:48:11,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:13,663 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 18:48:13,733 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 18:48:13,981 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:48:15,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:48:19,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:48:20,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1756560.0, ans=0.125 2023-10-04 18:48:21,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:21,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:48:24,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1756560.0, ans=0.125 2023-10-04 18:48:25,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:48:26,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:28,571 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 18:48:33,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:48:33,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:48:34,625 INFO [train.py:1046] (1/4) Epoch 50, batch 3200, loss[loss=0.1461, simple_loss=0.2248, pruned_loss=0.03369, over 23533.00 frames. ], tot_loss[loss=0.1507, simple_loss=0.231, pruned_loss=0.03525, over 4712258.23 frames. ], batch size: 134, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:48:36,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:48:38,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:48:38,932 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 18:48:40,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:41,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1756626.6666666667, ans=0.125 2023-10-04 18:48:43,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:48:49,368 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:54,867 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.049e+02 2.254e+02 2.662e+02 4.145e+02, threshold=4.508e+02, percent-clipped=0.0 2023-10-04 18:48:57,000 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:49:06,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 18:49:06,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:49:09,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 18:49:11,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. 
Duration: 0.6444375 2023-10-04 18:49:16,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:49:16,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:49:16,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:49:20,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 18:49:21,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 18:49:22,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1756826.6666666667, ans=0.125 2023-10-04 18:49:24,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 18:49:27,656 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 18:49:29,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:49:35,717 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:49:35,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:49:36,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:49:37,049 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 18:49:37,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 18:49:39,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:49:41,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 18:49:41,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 18:49:42,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 18:49:44,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1756893.3333333333, ans=0.125 2023-10-04 18:49:45,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 18:49:46,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1756893.3333333333, ans=0.125 2023-10-04 18:49:47,727 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:49:48,988 INFO [train.py:1046] (1/4) Epoch 50, batch 3250, loss[loss=0.1472, simple_loss=0.2211, pruned_loss=0.03663, over 23479.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2312, pruned_loss=0.03539, over 4713371.28 frames. 
], batch size: 120, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:49:50,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:49:50,456 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 18:49:50,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:49:50,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:49:52,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1756960.0, ans=0.1 2023-10-04 18:49:53,234 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 18:49:57,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:50:00,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:50:07,865 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:07,874 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 18:50:07,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:09,738 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:50:09,739 WARNING [train.py:1204] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:50:09,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:50:11,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:50:11,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1757026.6666666667, ans=0.0 2023-10-04 18:50:13,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:13,901 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:50:13,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:14,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1757026.6666666667, ans=0.0 2023-10-04 18:50:15,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:15,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:15,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 18:50:15,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1757026.6666666667, ans=0.035 2023-10-04 18:50:15,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1757026.6666666667, ans=0.125 2023-10-04 18:50:19,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:20,596 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=22.5 2023-10-04 18:50:21,101 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:50:23,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:23,143 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:24,607 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:24,638 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:50:24,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:50:28,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. 
Duration: 0.56 2023-10-04 18:50:28,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:50:28,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:50:30,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:30,816 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:50:36,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1757160.0, ans=0.0 2023-10-04 18:50:37,394 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:50:44,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:50:44,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:44,965 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 18:50:44,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:50:44,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-10-04 18:50:45,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1757160.0, ans=0.0 2023-10-04 18:50:46,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:49,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 18:50:49,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 18:50:50,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.35 vs. limit=15.0 2023-10-04 18:50:51,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:50:52,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:52,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:52,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:50:54,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:57,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 18:50:57,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:50:58,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 18:50:58,435 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:50:59,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:50:59,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 18:51:02,934 INFO [train.py:1046] (1/4) Epoch 50, batch 3300, loss[loss=0.1538, simple_loss=0.2364, pruned_loss=0.03554, over 23639.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2319, pruned_loss=0.03552, over 4716161.34 frames. ], batch size: 149, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:51:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:51:03,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 18:51:05,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 18:51:05,869 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 18:51:05,888 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 18:51:10,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:51:10,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1757293.3333333333, ans=0.025 2023-10-04 18:51:12,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:51:12,150 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:14,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:51:14,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:51:16,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:18,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:51:19,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1757360.0, ans=0.2 2023-10-04 18:51:24,178 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.069e+02 2.302e+02 2.665e+02 3.368e+02, threshold=4.603e+02, percent-clipped=0.0 2023-10-04 18:51:24,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. 
Duration: 0.79 2023-10-04 18:51:25,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:51:25,578 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:27,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:28,444 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 18:51:29,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:51:29,867 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:51:31,229 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:51:31,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:51:31,264 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 18:51:35,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:35,421 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:51:36,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 18:51:36,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 18:51:40,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 18:51:40,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:40,237 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:51:41,798 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 18:51:44,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 18:51:44,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:51:47,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 18:51:51,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:51:52,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:51:52,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:51:55,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:51:55,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:55,399 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:57,111 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:51:58,545 WARNING [train.py:1204] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:51:58,562 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:59,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:52:00,011 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 18:52:01,385 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 18:52:04,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:52:04,098 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:52:04,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 18:52:05,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1757560.0, ans=0.125 2023-10-04 18:52:06,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:52:06,700 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:10,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:52:11,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:11,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:52:11,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:52:12,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:52:14,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 18:52:15,751 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:15,832 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:17,878 INFO [train.py:1046] (1/4) Epoch 50, batch 3350, loss[loss=0.1508, simple_loss=0.2499, pruned_loss=0.02588, over 24311.00 frames. 
], tot_loss[loss=0.1525, simple_loss=0.2334, pruned_loss=0.03585, over 4722840.78 frames. ], batch size: 74, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:52:17,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:52:17,990 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:52:19,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:21,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:21,463 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:24,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:52:27,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:28,662 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:52:30,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:31,573 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:52:32,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 18:52:34,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:52:35,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 18:52:37,023 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 18:52:37,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:37,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1757693.3333333333, ans=0.0 2023-10-04 18:52:41,671 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 18:52:41,683 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 18:52:43,053 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:52:43,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:52:43,171 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:52:44,530 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 18:52:44,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:52:44,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:52:47,272 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:49,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:50,582 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:50,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:52:50,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1757760.0, ans=0.0 2023-10-04 18:52:52,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1757760.0, ans=0.125 2023-10-04 18:52:54,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:52:55,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:56,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:01,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:53:01,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 18:53:04,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:53:04,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:05,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.14 vs. limit=15.0 2023-10-04 18:53:05,702 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:08,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 18:53:08,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:53:08,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 18:53:08,451 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:53:09,848 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 18:53:11,614 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:12,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:53:20,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 18:53:20,742 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 18:53:20,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:53:22,094 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:53:23,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:53:27,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:53:31,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 18:53:32,605 INFO [train.py:1046] (1/4) Epoch 50, batch 3400, loss[loss=0.1818, simple_loss=0.2584, pruned_loss=0.05259, over 19437.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.03613, over 4718203.01 frames. ], batch size: 389, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:53:32,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:53:32,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:53:32,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:34,154 WARNING [train.py:1204] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. 
Duration: 0.94 2023-10-04 18:53:34,216 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:34,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 18:53:35,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:53:37,123 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:53:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:53:38,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:53:38,576 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 18:53:43,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 18:53:44,499 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 18:53:44,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:53:48,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:53:48,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 18:53:50,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:53:50,730 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:53:54,665 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.748e+02 2.083e+02 2.309e+02 2.717e+02 5.994e+02, threshold=4.619e+02, percent-clipped=1.0 2023-10-04 18:53:55,549 WARNING [train.py:1204] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:53:56,701 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 18:54:01,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:54:04,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:54:05,423 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:54:06,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:54:10,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:54:13,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1758093.3333333333, ans=0.0 2023-10-04 18:54:15,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-10-04 18:54:19,859 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:54:19,913 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:54:21,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 18:54:21,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:54:23,018 WARNING [train.py:1204] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:54:23,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:54:24,376 WARNING [train.py:1204] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:54:26,415 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:54:30,517 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:54:30,523 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:54:35,412 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:54:36,803 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. 
Duration: 0.95 2023-10-04 18:54:40,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:54:45,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 18:54:46,861 INFO [train.py:1046] (1/4) Epoch 50, batch 3450, loss[loss=0.1553, simple_loss=0.2262, pruned_loss=0.04217, over 23825.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03602, over 4723377.37 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:54:47,762 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.56 vs. limit=10.0 2023-10-04 18:54:48,404 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 18:54:50,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:54:51,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:54:51,472 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 18:54:52,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:54:56,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:55:02,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 18:55:02,289 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:03,613 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:55:03,622 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:06,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:08,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1758360.0, ans=0.125 2023-10-04 18:55:12,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 18:55:12,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1758360.0, ans=0.125 2023-10-04 18:55:16,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 18:55:16,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:55:16,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:55:19,391 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:24,013 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. 
Duration: 0.99 2023-10-04 18:55:24,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:55:28,721 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:55:28,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:55:30,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:55:30,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1758493.3333333333, ans=0.0 2023-10-04 18:55:31,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:55:33,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 18:55:34,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:55:35,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:37,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:55:39,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 18:55:43,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 18:55:47,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:55:49,181 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:50,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:55:54,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:54,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:55:55,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:55:55,995 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:55:59,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:56:01,205 INFO [train.py:1046] (1/4) Epoch 50, batch 3500, loss[loss=0.1399, simple_loss=0.2236, pruned_loss=0.02808, over 24448.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2319, pruned_loss=0.03585, over 4710322.98 frames. ], batch size: 63, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 18:56:04,520 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:56:04,599 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. 
Duration: 0.68 2023-10-04 18:56:07,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:56:10,100 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 18:56:10,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1758626.6666666667, ans=0.2 2023-10-04 18:56:11,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:56:11,623 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 18:56:15,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1758693.3333333333, ans=0.2 2023-10-04 18:56:16,465 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:56:17,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:56:20,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:56:20,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:56:20,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:56:20,591 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:56:20,629 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:56:20,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 18:56:24,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:24,374 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:56:25,599 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.089e+02 2.430e+02 2.862e+02 4.477e+02, threshold=4.860e+02, percent-clipped=0.0 2023-10-04 18:56:27,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:56:28,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1758693.3333333333, ans=0.0 2023-10-04 18:56:30,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:31,677 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 18:56:31,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:56:34,522 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 18:56:35,760 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:56:37,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:38,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:56:38,547 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:56:41,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 18:56:42,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 18:56:42,627 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 18:56:42,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:56:45,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:47,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:56:47,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. 
Duration: 0.8666875 2023-10-04 18:56:50,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1758826.6666666667, ans=0.125 2023-10-04 18:56:50,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.24 vs. limit=22.5 2023-10-04 18:56:51,223 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:56:51,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:56:56,157 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:56:58,714 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 18:56:58,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 18:56:58,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:01,401 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:57:03,309 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:57:04,667 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:57:05,544 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.29 vs. limit=15.0 2023-10-04 18:57:06,151 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 18:57:07,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:57:07,645 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:57:09,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 18:57:10,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 18:57:13,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:13,639 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:57:13,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:14,913 INFO [train.py:1046] (1/4) Epoch 50, batch 3550, loss[loss=0.1466, simple_loss=0.2239, pruned_loss=0.0347, over 23294.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2312, pruned_loss=0.03543, over 4710464.42 frames. ], batch size: 105, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 18:57:14,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:57:17,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:57:26,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:28,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 18:57:28,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1759026.6666666667, ans=0.0 2023-10-04 18:57:30,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:57:33,986 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:57:35,368 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:35,453 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:57:35,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:57:39,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:39,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:57:39,764 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 18:57:39,779 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:57:41,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:57:48,335 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:57:48,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:49,693 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:57:49,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:49,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:57:49,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 18:57:49,791 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:51,282 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:51,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:57:57,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:57:57,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:57:58,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:00,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 18:58:01,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:58:03,367 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 18:58:03,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:58:05,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1759160.0, ans=0.125 2023-10-04 18:58:06,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:58:06,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:58:07,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1759160.0, ans=0.125 2023-10-04 18:58:09,869 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 18:58:09,959 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 18:58:16,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:58:16,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1759226.6666666667, ans=0.1 2023-10-04 18:58:17,439 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 18:58:18,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:21,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:58:21,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 18:58:29,014 INFO [train.py:1046] (1/4) Epoch 50, batch 3600, loss[loss=0.1459, simple_loss=0.224, pruned_loss=0.03387, over 23707.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.2308, pruned_loss=0.03515, over 4707184.18 frames. ], batch size: 232, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 18:58:29,074 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 18:58:29,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:58:30,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 18:58:30,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1759293.3333333333, ans=0.125 2023-10-04 18:58:31,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:33,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:33,822 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:58:38,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:58:39,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:41,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:58:42,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:58:43,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:43,924 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 18:58:47,196 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:58:48,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 18:58:49,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:58:52,622 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.097e+02 2.364e+02 2.970e+02 4.964e+02, threshold=4.728e+02, percent-clipped=1.0 2023-10-04 18:58:54,091 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:58:55,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:58:55,502 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:58:55,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 18:58:56,930 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:58:58,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:59:00,200 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:59:01,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:04,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 18:59:05,300 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.06 vs. limit=15.0 2023-10-04 18:59:06,182 WARNING [train.py:1204] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:59:06,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 18:59:06,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1759426.6666666667, ans=0.125 2023-10-04 18:59:12,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:59:15,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:59:15,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 18:59:18,626 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:59:24,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:28,190 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:32,983 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-10-04 18:59:33,005 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:59:33,011 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 18:59:35,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 18:59:37,022 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 18:59:40,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:59:42,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:59:42,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 18:59:43,484 INFO [train.py:1046] (1/4) Epoch 50, batch 3650, loss[loss=0.1399, simple_loss=0.2267, pruned_loss=0.02657, over 24661.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2319, pruned_loss=0.03545, over 4712686.42 frames. ], batch size: 65, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 18:59:43,548 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:59:43,572 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:59:43,578 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 18:59:43,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 18:59:46,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 18:59:47,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:47,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 18:59:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 18:59:53,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:59:58,122 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 18:59:59,105 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.17 vs. limit=15.0 2023-10-04 18:59:59,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 19:00:03,676 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:03,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:00:03,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 19:00:08,267 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 19:00:08,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:00:08,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 19:00:10,192 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:00:10,215 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:10,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 19:00:10,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:00:11,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:00:11,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:11,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:00:13,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 19:00:14,611 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. 
Duration: 0.87 2023-10-04 19:00:14,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:00:17,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 19:00:20,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:00:20,222 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:00:24,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:00:27,159 WARNING [train.py:1204] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:27,168 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:00:27,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:00:28,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:00:30,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:00:34,716 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:36,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:00:36,075 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:00:38,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:00:38,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:39,410 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:00:45,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1759893.3333333333, ans=0.025 2023-10-04 19:00:46,414 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 19:00:50,894 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:00:50,905 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:00:50,996 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:00:51,041 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:00:52,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 19:00:52,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:00:53,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.45 vs. limit=15.0 2023-10-04 19:00:53,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 19:00:53,887 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:00:56,535 INFO [train.py:1046] (1/4) Epoch 50, batch 3700, loss[loss=0.1438, simple_loss=0.2242, pruned_loss=0.03172, over 24402.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2328, pruned_loss=0.03572, over 4720208.04 frames. ], batch size: 58, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:00:57,851 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:00:59,246 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:59,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:01:01,578 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.55 vs. limit=22.5 2023-10-04 19:01:02,034 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:01:02,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 19:01:02,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 19:01:03,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:01:03,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:01:11,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:01:14,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:01:15,491 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:15,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:01:16,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:01:17,024 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:01:18,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:18,620 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 19:01:22,813 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.080e+02 2.404e+02 2.897e+02 4.526e+02, threshold=4.809e+02, percent-clipped=0.0 2023-10-04 19:01:27,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 19:01:27,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:01:28,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:01:28,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 19:01:28,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:01:32,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:32,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 19:01:34,235 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:34,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1760093.3333333333, ans=0.125 2023-10-04 19:01:35,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1760093.3333333333, ans=0.0 2023-10-04 19:01:37,442 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:01:39,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1760093.3333333333, ans=0.0 2023-10-04 19:01:40,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 19:01:40,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:01:40,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1760093.3333333333, ans=0.0 2023-10-04 19:01:42,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:01:48,087 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:01:48,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 19:01:48,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:48,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 19:01:55,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:01:55,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:01:57,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:01:57,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 19:01:59,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:01:59,954 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 19:01:59,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:01:59,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:02:05,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:02:05,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 19:02:06,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1760226.6666666667, ans=0.0 2023-10-04 19:02:08,130 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 19:02:08,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:02:08,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:09,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:02:09,638 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:02:12,771 INFO [train.py:1046] (1/4) Epoch 50, batch 3750, loss[loss=0.1697, simple_loss=0.2427, pruned_loss=0.04834, over 23809.00 frames. 
], tot_loss[loss=0.153, simple_loss=0.2338, pruned_loss=0.03611, over 4728208.11 frames. ], batch size: 179, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:02:12,934 WARNING [train.py:1204] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:02:16,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:02:16,686 WARNING [train.py:1204] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:02:18,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 19:02:19,468 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 19:02:22,201 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 19:02:22,258 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 19:02:25,362 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:02:26,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:26,837 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:28,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 19:02:29,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:02:32,616 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:02:33,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:02:36,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:02:38,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:02:39,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 19:02:40,969 WARNING [train.py:1204] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:02:42,941 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:02:44,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:02:49,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. 
Duration: 0.55 2023-10-04 19:02:49,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1760426.6666666667, ans=0.1 2023-10-04 19:02:51,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.50 vs. limit=15.0 2023-10-04 19:02:53,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 19:02:53,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:02:55,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:02:55,682 WARNING [train.py:1204] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:01,006 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:01,105 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 19:03:04,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 19:03:06,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:10,817 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 19:03:10,886 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:03:14,187 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:03:18,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:03:20,212 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:03:21,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:03:22,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:03:23,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1760560.0, ans=0.1 2023-10-04 19:03:26,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 19:03:27,428 INFO [train.py:1046] (1/4) Epoch 50, batch 3800, loss[loss=0.1595, simple_loss=0.2378, pruned_loss=0.04058, over 23635.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2329, pruned_loss=0.0354, over 4746651.62 frames. ], batch size: 149, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:03:27,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1760626.6666666667, ans=0.0 2023-10-04 19:03:33,290 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-10-04 19:03:36,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:37,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 19:03:37,654 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 19:03:39,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:40,471 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:03:40,554 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 19:03:41,890 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 19:03:41,891 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:43,797 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:03:46,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:46,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:03:47,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 19:03:47,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 19:03:50,491 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.844e+02 2.154e+02 2.528e+02 3.068e+02 4.826e+02, threshold=5.056e+02, percent-clipped=1.0 2023-10-04 19:03:51,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 19:03:52,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1760693.3333333333, ans=0.125 2023-10-04 19:03:53,834 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:03:53,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:03:55,492 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:03:56,803 WARNING [train.py:1204] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:03:58,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 19:03:58,209 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:04:02,187 WARNING [train.py:1204] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:04:02,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:04:06,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 19:04:06,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 19:04:07,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:04:16,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:04:18,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.44 vs. limit=15.0 2023-10-04 19:04:20,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:04:22,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 19:04:25,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 19:04:25,390 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:04:27,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:04:28,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1760893.3333333333, ans=0.125 2023-10-04 19:04:29,323 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:30,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 19:04:34,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.10 vs. limit=22.5 2023-10-04 19:04:34,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 19:04:34,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 19:04:34,864 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:34,947 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:04:36,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1760893.3333333333, ans=0.0 2023-10-04 19:04:39,170 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:04:40,394 INFO [train.py:1046] (1/4) Epoch 50, batch 3850, loss[loss=0.1617, simple_loss=0.2527, pruned_loss=0.03532, over 24351.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2322, pruned_loss=0.03521, over 4742632.72 frames. 
], batch size: 77, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:04:41,824 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:04:45,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:04:47,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 19:04:48,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:04:50,081 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:52,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.30 vs. limit=15.0 2023-10-04 19:04:53,359 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:04:53,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1760960.0, ans=0.07 2023-10-04 19:04:56,112 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:04:57,483 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 19:04:58,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-10-04 19:05:03,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:04,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:05:07,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:07,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:05:10,255 WARNING [train.py:1204] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:11,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:05:11,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:11,602 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:05:12,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:14,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:16,283 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 19:05:16,306 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:05:16,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 19:05:17,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 19:05:18,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:18,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:22,133 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:22,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:22,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 19:05:25,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 19:05:26,581 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:27,968 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 19:05:29,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-10-04 19:05:33,642 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:35,043 WARNING [train.py:1204] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:39,021 WARNING [train.py:1204] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:40,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 19:05:41,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 19:05:45,106 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:45,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:48,393 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:05:48,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:05:49,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:51,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 19:05:51,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 19:05:51,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:52,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 19:05:52,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:54,187 INFO [train.py:1046] (1/4) Epoch 50, batch 3900, loss[loss=0.1575, simple_loss=0.249, pruned_loss=0.03306, over 24440.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2319, pruned_loss=0.03508, over 4718580.12 frames. ], batch size: 69, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:05:54,235 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:55,649 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:05:55,707 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:57,039 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:05:57,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:57,097 WARNING [train.py:1204] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:05:58,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:05:58,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 19:05:59,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:02,612 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:02,704 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:06:04,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:06:05,583 WARNING [train.py:1204] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:08,344 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:06:08,355 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:09,781 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:06:11,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1761360.0, ans=0.125 2023-10-04 19:06:12,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. 
Duration: 0.81 2023-10-04 19:06:12,512 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:06:13,972 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 19:06:14,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:15,916 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 19:06:17,802 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.032e+02 2.234e+02 2.595e+02 4.358e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 19:06:17,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 19:06:22,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:06:23,930 WARNING [train.py:1204] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:06:23,943 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:06:23,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:06:24,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1761426.6666666667, ans=0.0 2023-10-04 19:06:26,824 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. 
Duration: 0.7545625 2023-10-04 19:06:29,555 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:06:30,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:06:30,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:06:32,360 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:06:39,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:06:39,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:06:46,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:06:46,844 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:06:57,477 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:58,962 WARNING [train.py:1204] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:07:00,260 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 19:07:00,289 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. 
Duration: 0.6 2023-10-04 19:07:00,303 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:07:01,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 19:07:02,989 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:07:04,279 WARNING [train.py:1204] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 19:07:06,886 INFO [train.py:1046] (1/4) Epoch 50, batch 3950, loss[loss=0.1476, simple_loss=0.2196, pruned_loss=0.03783, over 23416.00 frames. ], tot_loss[loss=0.1508, simple_loss=0.2312, pruned_loss=0.03515, over 4704514.59 frames. ], batch size: 285, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:07:07,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=1761626.6666666667, ans=15.0 2023-10-04 19:07:09,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:07:11,296 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 19:07:12,680 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:07:15,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:07:17,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:07:19,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1761626.6666666667, ans=0.125 2023-10-04 19:07:21,079 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 19:07:22,399 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:07:22,418 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 19:07:22,481 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 19:07:22,514 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:07:25,708 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:07:25,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:07:25,722 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:07:26,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1761693.3333333333, ans=0.1 2023-10-04 19:07:28,396 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 19:07:31,104 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:07:31,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:07:32,431 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:07:33,849 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:07:35,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:07:44,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:07:45,597 WARNING [train.py:1204] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:07:50,228 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 19:07:54,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 19:07:54,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 19:07:54,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:07:56,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:08:02,099 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 19:08:02,113 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:08:03,447 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:08:03,488 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:08:04,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 19:08:08,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:08:10,288 WARNING [train.py:1204] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:08:10,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.50 vs. limit=15.0 2023-10-04 19:08:15,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 19:08:17,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1761893.3333333333, ans=0.1 2023-10-04 19:08:21,571 INFO [train.py:1046] (1/4) Epoch 50, batch 4000, loss[loss=0.1555, simple_loss=0.2373, pruned_loss=0.03683, over 18673.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2319, pruned_loss=0.03542, over 4703233.82 frames. ], batch size: 40, lr: 2.02e-03, grad_scale: 32.0 2023-10-04 19:08:25,713 WARNING [train.py:1204] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 19:08:27,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1761960.0, ans=0.0 2023-10-04 19:08:32,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:38,553 WARNING [train.py:1204] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:08:38,608 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:08:38,634 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:38,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 19:08:40,015 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:08:40,081 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 19:08:40,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:08:40,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 19:08:42,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:08:45,964 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.094e+02 2.391e+02 2.911e+02 5.164e+02, threshold=4.782e+02, percent-clipped=3.0 2023-10-04 19:08:46,073 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:08:46,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:08:46,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:08:46,121 WARNING [train.py:1204] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:08:46,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:08:48,063 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:08:49,549 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 19:08:50,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:08:50,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:08:55,576 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 19:08:55,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 19:08:55,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:08:58,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1762093.3333333333, ans=0.125 2023-10-04 19:09:03,116 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 19:09:03,166 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:09:05,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:09:06,529 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=15.0 2023-10-04 19:09:07,137 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 19:09:08,551 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:09:08,621 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 19:09:09,851 WARNING [train.py:1204] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:09:11,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:09:12,503 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-10-04 19:09:13,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:09:13,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:09:14,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1762160.0, ans=0.07 2023-10-04 19:09:15,208 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:09:16,667 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 19:09:16,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:09:18,708 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 19:09:22,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:09:25,696 WARNING [train.py:1204] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 19:09:27,603 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:09:27,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:09:28,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:09:30,821 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:09:34,806 INFO [train.py:1046] (1/4) Epoch 50, batch 4050, loss[loss=0.1391, simple_loss=0.2226, pruned_loss=0.02776, over 24484.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2331, pruned_loss=0.03601, over 4703300.34 frames. ], batch size: 63, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:09:34,950 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:09:39,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:09:39,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 19:09:40,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:09:42,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:09:42,102 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:09:43,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:09:44,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:09:47,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. 
Duration: 0.8909375 2023-10-04 19:09:51,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:09:51,486 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 19:09:52,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:09:52,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:09:56,995 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:09:58,225 WARNING [train.py:1204] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:10:01,444 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 19:10:01,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1762360.0, ans=0.0 2023-10-04 19:10:03,405 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 19:10:03,439 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 19:10:06,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:10:10,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-10-04 19:10:11,812 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:10:15,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:10:17,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1762493.3333333333, ans=0.125 2023-10-04 19:10:19,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:10:20,701 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:10:20,723 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:10:23,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:10:26,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 19:10:26,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:10:27,719 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:10:29,035 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 19:10:34,501 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 19:10:40,166 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 19:10:40,240 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:10:40,244 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:10:41,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 19:10:41,682 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 19:10:41,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:10:43,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1762560.0, ans=0.0 2023-10-04 19:10:44,998 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:10:46,353 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:10:46,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:10:48,838 INFO [train.py:1046] (1/4) Epoch 50, batch 4100, loss[loss=0.1377, simple_loss=0.2138, pruned_loss=0.03084, over 23774.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2336, pruned_loss=0.03673, over 4682095.58 frames. 
], batch size: 149, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:10:55,017 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 19:10:56,416 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 19:10:58,093 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 19:10:59,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 19:10:59,455 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:00,774 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:00,804 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:00,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:11:02,124 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 19:11:05,794 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:11:07,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:11:07,273 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 19:11:07,333 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:11:12,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:11:12,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1762693.3333333333, ans=0.125 2023-10-04 19:11:13,954 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.059e+02 2.320e+02 2.888e+02 5.611e+02, threshold=4.640e+02, percent-clipped=1.0 2023-10-04 19:11:14,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:11:14,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:11:14,110 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 19:11:16,092 WARNING [train.py:1204] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:16,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:11:16,108 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:11:16,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 19:11:16,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 19:11:20,381 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:21,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 19:11:23,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:11:25,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:11:25,814 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 19:11:26,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1762760.0, ans=0.2 2023-10-04 19:11:27,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:11:27,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:11:27,832 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:11:27,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1762760.0, ans=0.125 2023-10-04 19:11:30,624 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. 
Duration: 0.96 2023-10-04 19:11:32,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:11:32,090 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:11:35,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 19:11:35,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:35,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:11:38,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:42,921 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:11:45,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:11:47,370 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:54,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:11:54,262 WARNING [train.py:1204] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:57,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 19:11:58,940 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:12:03,063 INFO [train.py:1046] (1/4) Epoch 50, batch 4150, loss[loss=0.1582, simple_loss=0.2402, pruned_loss=0.03811, over 24010.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2344, pruned_loss=0.03693, over 4677267.92 frames. ], batch size: 86, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:12:03,226 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:12:06,426 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:12:06,500 WARNING [train.py:1204] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:12:06,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:12:09,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 19:12:09,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:12:10,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 19:12:10,855 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 19:12:10,870 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. 
Duration: 0.89 2023-10-04 19:12:11,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-10-04 19:12:12,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:12:15,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:12:15,386 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:12:20,095 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:12:21,338 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:12:21,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:12:22,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:12:24,157 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:12:25,485 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. 
Duration: 0.688875 2023-10-04 19:12:27,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1763026.6666666667, ans=0.1 2023-10-04 19:12:28,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:12:31,840 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:12:33,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 19:12:35,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1763093.3333333333, ans=0.125 2023-10-04 19:12:36,766 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 19:12:36,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:12:36,854 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 19:12:38,149 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:12:38,169 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:12:40,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:12:40,979 WARNING [train.py:1204] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:12:44,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1763093.3333333333, ans=0.2 2023-10-04 19:12:46,372 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 19:12:49,534 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:12:50,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:12:51,018 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 19:12:52,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:12:53,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 19:12:55,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:12:56,541 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:12:57,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.53 vs. limit=15.0 2023-10-04 19:12:58,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:12:59,809 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. 
Duration: 0.74 2023-10-04 19:12:59,809 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:12:59,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 19:12:59,948 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:13:01,820 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 19:13:01,841 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:13:01,845 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:13:01,861 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:13:03,284 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 19:13:03,322 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:13:04,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:13:04,606 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:13:07,630 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 19:13:07,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 19:13:07,678 WARNING [train.py:1204] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:13:13,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:13:16,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 19:13:17,310 INFO [train.py:1046] (1/4) Epoch 50, batch 4200, loss[loss=0.16, simple_loss=0.2395, pruned_loss=0.04023, over 24478.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2329, pruned_loss=0.03661, over 4690598.11 frames. ], batch size: 63, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:13:17,417 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:13:19,398 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:13:19,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1763293.3333333333, ans=0.125 2023-10-04 19:13:20,841 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:13:20,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:13:20,909 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 19:13:22,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1763293.3333333333, ans=0.125 2023-10-04 19:13:23,600 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 19:13:23,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1763293.3333333333, ans=0.125 2023-10-04 19:13:26,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 19:13:28,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:30,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:13:31,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1763360.0, ans=0.0 2023-10-04 19:13:32,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1763360.0, ans=0.125 2023-10-04 19:13:33,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:13:35,479 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:13:36,887 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:13:38,525 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:13:38,579 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 19:13:38,589 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:13:41,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:42,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:13:42,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:13:43,864 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 2.065e+02 2.338e+02 2.693e+02 5.755e+02, threshold=4.677e+02, percent-clipped=2.0 2023-10-04 19:13:44,037 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:13:45,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 19:13:45,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:47,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1763426.6666666667, ans=0.125 2023-10-04 19:13:50,297 WARNING [train.py:1204] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 19:13:51,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. 
Duration: 0.6909375 2023-10-04 19:13:53,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:13:54,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:13:58,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:13:58,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 19:13:58,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:14:00,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:14:05,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:14:06,686 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:14:08,648 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.96 vs. limit=10.0 2023-10-04 19:14:10,857 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. 
Duration: 0.7 2023-10-04 19:14:11,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1763493.3333333333, ans=0.09899494936611666 2023-10-04 19:14:12,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1763493.3333333333, ans=0.1 2023-10-04 19:14:14,131 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 19:14:16,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:14:21,601 WARNING [train.py:1204] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:14:21,673 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:23,057 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 19:14:26,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1763560.0, ans=0.0 2023-10-04 19:14:27,330 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:14:31,955 INFO [train.py:1046] (1/4) Epoch 50, batch 4250, loss[loss=0.1422, simple_loss=0.2174, pruned_loss=0.03345, over 23801.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2317, pruned_loss=0.03628, over 4696641.99 frames. ], batch size: 212, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:14:32,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. 
Duration: 0.82725 2023-10-04 19:14:32,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 19:14:35,254 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:39,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1763626.6666666667, ans=0.1 2023-10-04 19:14:39,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=1763626.6666666667, ans=0.1 2023-10-04 19:14:40,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:14:40,768 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 19:14:40,800 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:14:42,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1763626.6666666667, ans=0.1 2023-10-04 19:14:43,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:48,436 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:14:52,577 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:14:52,591 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:14:54,028 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:14:54,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:14:55,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1763693.3333333333, ans=0.125 2023-10-04 19:14:56,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:14:56,785 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:14:56,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1763693.3333333333, ans=0.125 2023-10-04 19:14:58,198 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:15:00,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:15:00,877 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:02,711 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. 
Duration: 0.98 2023-10-04 19:15:07,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 19:15:07,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:15:08,648 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:15:08,666 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:15:09,463 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.73 vs. limit=22.5 2023-10-04 19:15:10,014 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:15:10,017 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:10,069 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:15:13,370 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:15:14,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:15:19,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:15:20,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 19:15:21,672 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.53 vs. limit=15.0 2023-10-04 19:15:22,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 19:15:22,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:15:23,456 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 19:15:24,931 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:15:27,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:15:28,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:29,007 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:15:30,435 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 19:15:31,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:15:31,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-10-04 19:15:32,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1763893.3333333333, ans=0.125 2023-10-04 19:15:35,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:38,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:38,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:15:41,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:15:42,790 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:15:44,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:15:45,512 INFO [train.py:1046] (1/4) Epoch 50, batch 4300, loss[loss=0.1513, simple_loss=0.2279, pruned_loss=0.03736, over 22751.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2313, pruned_loss=0.03589, over 4697778.55 frames. ], batch size: 322, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:15:45,559 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:15:45,565 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 19:15:47,587 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 19:15:51,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:15:53,173 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:15:58,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:16:02,926 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:16:02,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 19:16:03,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1764026.6666666667, ans=0.1 2023-10-04 19:16:05,644 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:16:07,513 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:16:07,536 WARNING [train.py:1204] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:16:07,548 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 19:16:12,136 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.411e+02 2.835e+02 5.289e+02, threshold=4.821e+02, percent-clipped=1.0 2023-10-04 19:16:12,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. 
Duration: 0.5909375 2023-10-04 19:16:13,650 WARNING [train.py:1204] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:16:15,705 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 19:16:15,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:16:16,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.65 vs. limit=15.0 2023-10-04 19:16:17,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 19:16:18,547 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:16:20,375 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:16:23,183 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:16:23,185 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:16:23,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1764093.3333333333, ans=0.0 2023-10-04 19:16:24,490 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:16:24,593 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. 
Duration: 0.87275 2023-10-04 19:16:25,936 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:16:27,206 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 19:16:27,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 19:16:30,107 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:16:31,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1764160.0, ans=0.125 2023-10-04 19:16:32,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:32,906 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:16:32,921 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:32,964 WARNING [train.py:1204] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:16:32,980 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 19:16:32,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 19:16:34,391 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-10-04 19:16:35,771 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:16:35,796 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 19:16:35,833 WARNING [train.py:1204] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 19:16:39,758 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:16:42,454 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 19:16:42,519 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:16:43,970 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:16:43,982 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:16:47,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 19:16:48,568 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:16:48,586 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:48,633 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:16:48,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:16:49,969 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:16:51,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:16:53,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1764226.6666666667, ans=0.125 2023-10-04 19:16:54,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:16:55,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:55,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:17:00,008 INFO [train.py:1046] (1/4) Epoch 50, batch 4350, loss[loss=0.1613, simple_loss=0.2421, pruned_loss=0.0402, over 23242.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.232, pruned_loss=0.03566, over 4704299.45 frames. ], batch size: 105, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:17:00,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1764293.3333333333, ans=0.125 2023-10-04 19:17:01,504 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 19:17:01,544 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-10-04 19:17:05,882 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:07,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:17:10,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:17:10,997 WARNING [train.py:1204] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:17:18,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:17:20,941 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:17:22,999 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:17:24,248 WARNING [train.py:1204] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:17:25,726 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:17:27,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:17:28,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:17:35,286 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-10-04 19:17:35,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:35,432 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:17:37,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1764426.6666666667, ans=0.1 2023-10-04 19:17:41,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:17:44,655 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 19:17:47,434 WARNING [train.py:1204] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:17:47,548 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:17:47,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1764493.3333333333, ans=0.125 2023-10-04 19:17:52,268 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 19:17:52,377 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:17:54,327 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:17:55,690 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9084W0025-536862 from training. 
Duration: 0.896 2023-10-04 19:17:55,749 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 19:17:55,761 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:17:55,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:58,373 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:17:58,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:17:59,795 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:17:59,839 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:18:01,279 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 19:18:01,294 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:01,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:18:01,325 WARNING [train.py:1204] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:18:01,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1764560.0, ans=0.2 2023-10-04 19:18:02,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 19:18:02,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1764560.0, ans=0.125 2023-10-04 19:18:03,990 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 19:18:03,994 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 19:18:05,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 19:18:08,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:18:08,364 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:18:09,664 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:10,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:18:12,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 19:18:14,254 INFO [train.py:1046] (1/4) Epoch 50, batch 4400, loss[loss=0.1546, simple_loss=0.2365, pruned_loss=0.03636, over 24482.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2336, pruned_loss=0.03603, over 4720202.27 frames. 
], batch size: 66, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:18:14,366 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 19:18:14,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:17,413 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:18:17,429 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:18,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:18:20,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 19:18:22,038 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 19:18:22,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 19:18:22,087 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 19:18:23,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:18:23,520 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:18:26,587 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. 
Duration: 0.87 2023-10-04 19:18:29,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:30,654 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:30,664 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 19:18:33,444 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:33,445 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 19:18:34,943 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 19:18:36,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 19:18:37,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 19:18:37,799 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 19:18:37,825 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:39,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:18:40,858 WARNING [train.py:1204] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 19:18:40,928 WARNING [train.py:1204] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:18:42,813 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.185e+02 2.399e+02 2.720e+02 3.791e+02, threshold=4.798e+02, percent-clipped=0.0 2023-10-04 19:18:42,978 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 19:18:42,985 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 19:18:44,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:44,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1764760.0, ans=0.1 2023-10-04 19:18:45,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:18:45,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:47,292 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:47,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:47,340 WARNING [train.py:1204] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 19:18:48,713 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0190-640317 from training. 
Duration: 0.768 2023-10-04 19:18:49,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1764760.0, ans=0.0 2023-10-04 19:18:51,944 WARNING [train.py:1204] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:52,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1764760.0, ans=0.125 2023-10-04 19:18:59,084 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:19:00,593 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 19:19:06,055 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:19:08,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:19:10,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1764826.6666666667, ans=0.125 2023-10-04 19:19:12,181 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:19:12,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 19:19:12,249 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:19:12,260 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. 
Duration: 0.611125 2023-10-04 19:19:12,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:19:13,590 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:19:16,953 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 19:19:18,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1764893.3333333333, ans=0.125 2023-10-04 19:19:20,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 19:19:21,049 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 19:19:22,332 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:19:22,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 19:19:22,427 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:19:25,843 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:19:28,739 INFO [train.py:1046] (1/4) Epoch 50, batch 4450, loss[loss=0.1567, simple_loss=0.2386, pruned_loss=0.03741, over 23370.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2345, pruned_loss=0.03626, over 4719775.08 frames. 
], batch size: 93, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:19:29,424 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 19:19:32,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:19:34,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:34,807 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:19:41,681 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:19:41,699 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:19:43,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1765026.6666666667, ans=0.2 2023-10-04 19:19:45,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:47,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1765026.6666666667, ans=0.1 2023-10-04 19:19:48,281 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:19:49,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 19:19:51,029 WARNING [train.py:1204] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:19:51,112 WARNING [train.py:1204] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 19:19:51,114 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:19:52,482 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:52,524 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:19:52,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:19:55,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:20:00,388 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:01,673 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:03,021 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:20:03,060 WARNING [train.py:1204] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:04,390 WARNING [train.py:1204] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:20:04,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1765093.3333333333, ans=0.1 2023-10-04 19:20:07,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 19:20:08,550 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 19:20:09,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 19:20:09,963 WARNING [train.py:1204] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:20:12,747 WARNING [train.py:1204] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:20:12,818 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 19:20:16,321 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:20:19,109 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:20,446 WARNING [train.py:1204] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 19:20:20,469 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:20,478 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 19:20:20,498 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:20:21,847 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:20:21,967 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:24,728 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:20:24,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 19:20:28,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:20:30,004 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:20:31,406 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:20:32,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:32,786 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. 
Duration: 0.3909375 2023-10-04 19:20:32,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1765226.6666666667, ans=0.0 2023-10-04 19:20:35,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:20:35,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1765226.6666666667, ans=0.0 2023-10-04 19:20:37,091 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 19:20:38,426 WARNING [train.py:1204] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:20:39,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.48 vs. limit=6.0 2023-10-04 19:20:42,367 INFO [train.py:1046] (1/4) Epoch 50, batch 4500, loss[loss=0.1517, simple_loss=0.2397, pruned_loss=0.03189, over 24354.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2349, pruned_loss=0.03664, over 4709181.62 frames. ], batch size: 77, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:20:42,522 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:20:45,924 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 19:20:45,925 WARNING [train.py:1204] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-10-04 19:20:46,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1765293.3333333333, ans=0.1 2023-10-04 19:20:47,358 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:20:48,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1765293.3333333333, ans=0.2 2023-10-04 19:20:54,311 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:54,372 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:20:54,438 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:20:55,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:20:55,830 WARNING [train.py:1204] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:57,185 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:21:08,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 19:21:09,972 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.200e+02 2.403e+02 2.900e+02 5.127e+02, threshold=4.806e+02, percent-clipped=1.0 2023-10-04 19:21:10,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:21:11,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:21:12,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:21:12,822 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:21:18,956 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:21:21,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:21:26,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:21:26,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1765493.3333333333, ans=0.125 2023-10-04 19:21:27,197 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.28 vs. limit=15.0 2023-10-04 19:21:29,203 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 19:21:29,259 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 19:21:30,975 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:31,020 WARNING [train.py:1204] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:21:32,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:21:33,691 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:21:36,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:21:36,449 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 19:21:36,450 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:21:36,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:40,797 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:21:40,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:21:45,488 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:21:47,034 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:21:47,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:21:50,298 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 19:21:51,652 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 19:21:51,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 19:21:54,412 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 19:21:55,734 INFO [train.py:1046] (1/4) Epoch 50, batch 4550, loss[loss=0.1573, simple_loss=0.2443, pruned_loss=0.03519, over 24430.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.03631, over 4707289.95 frames. ], batch size: 69, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:21:57,271 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 19:21:59,379 WARNING [train.py:1204] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:22:01,982 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:22:02,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 19:22:04,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:09,525 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:22:10,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:22:12,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:12,774 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:22:12,775 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:15,511 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:15,561 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:22:20,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:22:21,479 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 19:22:21,532 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 19:22:22,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. 
Duration: 0.8545625 2023-10-04 19:22:24,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 19:22:24,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1765760.0, ans=0.2 2023-10-04 19:22:28,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 19:22:29,505 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:22:34,793 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 19:22:34,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:22:38,937 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:38,977 WARNING [train.py:1204] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:38,990 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:22:40,433 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 19:22:42,458 WARNING [train.py:1204] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:22:43,893 WARNING [train.py:1204] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:22:45,219 WARNING [train.py:1204] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:22:46,609 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:47,974 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 19:22:48,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 19:22:49,684 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:22:49,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 19:22:51,261 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 19:22:51,278 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:51,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1765826.6666666667, ans=0.0 2023-10-04 19:22:52,639 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:52,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:22:54,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:22:54,127 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:22:55,618 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:22:55,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1765893.3333333333, ans=0.125 2023-10-04 19:22:56,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 19:22:58,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:22:58,231 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 19:22:58,304 WARNING [train.py:1204] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 19:22:58,311 WARNING [train.py:1204] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:22:58,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 19:23:02,917 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:23:02,937 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 19:23:03,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1765893.3333333333, ans=0.0 2023-10-04 19:23:06,313 WARNING [train.py:1204] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:23:06,365 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:23:06,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 19:23:07,846 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:23:09,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:23:10,994 INFO [train.py:1046] (1/4) Epoch 50, batch 4600, loss[loss=0.1531, simple_loss=0.2404, pruned_loss=0.03288, over 24101.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2332, pruned_loss=0.03567, over 4720371.96 frames. ], batch size: 80, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:23:12,508 WARNING [train.py:1204] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:13,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:23:16,546 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:23:16,566 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-10-04 19:23:18,498 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:18,596 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 19:23:20,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:23:23,923 WARNING [train.py:1204] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:23:23,988 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:26,732 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:27,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1766026.6666666667, ans=0.0 2023-10-04 19:23:32,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 19:23:32,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:36,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:23:39,343 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.190e+02 2.507e+02 2.915e+02 5.152e+02, threshold=5.014e+02, percent-clipped=2.0 2023-10-04 19:23:39,486 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:23:39,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:42,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1766093.3333333333, ans=0.0 2023-10-04 19:23:42,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1766093.3333333333, ans=0.0 2023-10-04 19:23:44,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 19:23:44,949 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:23:46,422 WARNING [train.py:1204] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:23:52,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:52,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:23:53,780 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:23:57,897 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-10-04 19:23:57,992 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:24:02,238 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:04,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:05,473 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:05,474 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 19:24:07,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:07,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 19:24:08,757 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:08,813 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:10,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:12,207 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:24:12,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:24:13,661 WARNING [train.py:1204] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 19:24:13,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 19:24:13,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 19:24:13,740 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:15,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:24:16,419 WARNING [train.py:1204] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:18,194 WARNING [train.py:1204] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:25,040 INFO [train.py:1046] (1/4) Epoch 50, batch 4650, loss[loss=0.1619, simple_loss=0.2389, pruned_loss=0.04243, over 23787.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2327, pruned_loss=0.03542, over 4725910.37 frames. ], batch size: 179, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:24:26,551 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:24:29,308 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:24:29,345 WARNING [train.py:1204] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:24:29,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:24:29,425 WARNING [train.py:1204] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:30,692 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:24:30,783 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:35,352 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 19:24:40,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:24:41,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 19:24:43,293 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:24:44,635 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 19:24:44,663 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:24:46,051 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 19:24:46,070 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. 
Duration: 0.98 2023-10-04 19:24:46,078 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:47,405 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:24:48,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.24 vs. limit=15.0 2023-10-04 19:24:48,938 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:24:50,744 WARNING [train.py:1204] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:50,772 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 19:24:54,844 WARNING [train.py:1204] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:56,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 19:24:59,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:59,044 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 19:24:59,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1766426.6666666667, ans=0.0 2023-10-04 19:24:59,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1766426.6666666667, ans=0.0 2023-10-04 19:25:00,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 19:25:01,848 WARNING [train.py:1204] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:25:04,718 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:25:07,976 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:12,280 WARNING [train.py:1204] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:25:14,337 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:25:15,637 WARNING [train.py:1204] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:25:16,965 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:25:18,903 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.40 vs. 
limit=15.0 2023-10-04 19:25:19,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 19:25:19,719 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 19:25:21,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 19:25:21,058 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 19:25:23,004 WARNING [train.py:1204] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:28,604 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:25:28,615 WARNING [train.py:1204] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:25:29,932 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 19:25:29,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:30,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1766560.0, ans=0.125 2023-10-04 19:25:31,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:25:31,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. 
Duration: 0.7444375 2023-10-04 19:25:32,899 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:25:35,659 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:25:35,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:25:35,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:25:38,785 INFO [train.py:1046] (1/4) Epoch 50, batch 4700, loss[loss=0.1623, simple_loss=0.2402, pruned_loss=0.04217, over 22881.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.036, over 4718550.15 frames. ], batch size: 322, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:25:38,960 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:38,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:25:38,994 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:25:40,920 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 19:25:42,354 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:25:43,720 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. 
Duration: 0.72 2023-10-04 19:25:46,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1766626.6666666667, ans=0.0 2023-10-04 19:25:50,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:50,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.83 vs. limit=15.0 2023-10-04 19:25:51,421 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:51,470 WARNING [train.py:1204] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:25:53,392 WARNING [train.py:1204] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:25:56,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:25:57,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.81 vs. limit=15.0 2023-10-04 19:26:01,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 19:26:01,535 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 19:26:04,334 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:26:05,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.72 vs. limit=22.5 2023-10-04 19:26:05,626 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:26:05,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1766693.3333333333, ans=0.125 2023-10-04 19:26:07,001 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.151e+02 2.397e+02 2.969e+02 5.110e+02, threshold=4.793e+02, percent-clipped=1.0 2023-10-04 19:26:07,072 WARNING [train.py:1204] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:26:10,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:11,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.68 vs. limit=22.5 2023-10-04 19:26:11,371 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.08 vs. limit=12.0 2023-10-04 19:26:15,347 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:26:15,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:26:18,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:26:22,878 WARNING [train.py:1204] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 19:26:24,603 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:26:26,055 WARNING [train.py:1204] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:31,526 WARNING [train.py:1204] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 19:26:32,891 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:26:34,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1766826.6666666667, ans=0.0 2023-10-04 19:26:35,698 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:26:35,803 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 19:26:37,499 WARNING [train.py:1204] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 19:26:38,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:38,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:26:40,485 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:26:41,858 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:26:41,875 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 19:26:41,951 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 19:26:43,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:26:45,329 WARNING [train.py:1204] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:45,330 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:45,334 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 19:26:46,684 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:46,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1766893.3333333333, ans=0.1 2023-10-04 19:26:50,746 WARNING [train.py:1204] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 19:26:52,686 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.40 vs. limit=15.0 2023-10-04 19:26:54,058 INFO [train.py:1046] (1/4) Epoch 50, batch 4750, loss[loss=0.1624, simple_loss=0.253, pruned_loss=0.03588, over 24650.00 frames. 
], tot_loss[loss=0.1536, simple_loss=0.2347, pruned_loss=0.03622, over 4714762.76 frames. ], batch size: 73, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:26:54,143 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:26:56,008 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:26:56,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.91 vs. limit=15.0 2023-10-04 19:27:00,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:01,441 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:27:04,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 19:27:04,196 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:07,657 WARNING [train.py:1204] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 19:27:09,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:27:09,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:27:10,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:27:10,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1767026.6666666667, ans=0.2 2023-10-04 19:27:12,383 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.67 vs. limit=22.5 2023-10-04 19:27:14,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 19:27:20,772 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:27:23,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 19:27:23,410 WARNING [train.py:1204] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:27:25,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1767093.3333333333, ans=0.0 2023-10-04 19:27:26,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:27:26,736 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:27:26,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:28,100 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 19:27:28,103 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-10-04 19:27:34,309 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 19:27:34,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1767093.3333333333, ans=0.0 2023-10-04 19:27:36,984 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:39,634 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:27:41,756 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:27:41,757 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 19:27:41,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:27:44,363 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:27:45,881 WARNING [train.py:1204] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:27:48,575 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 19:27:48,606 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 19:27:49,971 WARNING [train.py:1204] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:27:49,993 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:27:50,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:51,395 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:27:51,411 WARNING [train.py:1204] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 19:27:53,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 19:27:56,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:27:58,357 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:27:58,359 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 19:27:59,672 WARNING [train.py:1204] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:28:01,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1767226.6666666667, ans=0.2 2023-10-04 19:28:02,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:04,138 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. 
Duration: 0.988875 2023-10-04 19:28:04,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:05,519 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:28:08,325 INFO [train.py:1046] (1/4) Epoch 50, batch 4800, loss[loss=0.1556, simple_loss=0.2365, pruned_loss=0.03741, over 23428.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2354, pruned_loss=0.03629, over 4717582.65 frames. ], batch size: 93, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:28:08,365 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:08,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 19:28:09,883 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 19:28:11,142 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 19:28:14,398 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:28:14,422 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:14,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. 
Duration: 0.85 2023-10-04 19:28:17,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1767293.3333333333, ans=0.0 2023-10-04 19:28:18,817 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:20,197 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:21,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1767360.0, ans=0.0 2023-10-04 19:28:24,419 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:28:26,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:28:26,319 WARNING [train.py:1204] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:28,186 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 19:28:28,252 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:28:28,301 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:28:29,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:28:34,337 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:28:35,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:35,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:28:37,018 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.867e+02 2.213e+02 2.544e+02 3.489e+02 5.983e+02, threshold=5.088e+02, percent-clipped=6.0 2023-10-04 19:28:37,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:37,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 19:28:37,182 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:38,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:28:39,996 WARNING [train.py:1204] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:43,383 WARNING [train.py:1204] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:46,078 WARNING [train.py:1204] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:46,098 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 19:28:47,443 WARNING [train.py:1204] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:28:48,798 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:50,190 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 19:28:50,208 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 19:28:51,668 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:51,681 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:28:51,728 WARNING [train.py:1204] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:28:51,734 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:28:51,742 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:28:51,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1767493.3333333333, ans=0.2 2023-10-04 19:28:53,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:28:53,218 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 19:28:56,369 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:58,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:00,916 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:01,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1767493.3333333333, ans=0.0 2023-10-04 19:29:05,066 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 19:29:05,095 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:29:05,134 WARNING [train.py:1204] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:05,178 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:29:06,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:29:10,640 WARNING [train.py:1204] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:29:11,343 WARNING [train.py:1204] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:29:11,350 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:29:11,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:29:12,737 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:29:14,067 WARNING [train.py:1204] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:29:16,945 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:16,952 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:18,161 WARNING [train.py:1204] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:29:18,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-10-04 19:29:19,596 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 19:29:21,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 19:29:21,052 WARNING [train.py:1204] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:29:21,056 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:29:22,392 INFO [train.py:1046] (1/4) Epoch 50, batch 4850, loss[loss=0.1407, simple_loss=0.2202, pruned_loss=0.03064, over 24331.00 frames. 
], tot_loss[loss=0.1538, simple_loss=0.2352, pruned_loss=0.03619, over 4713926.06 frames. ], batch size: 61, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:29:22,505 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:29:22,506 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:24,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1767626.6666666667, ans=0.125 2023-10-04 19:29:27,690 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:29:33,884 WARNING [train.py:1204] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 19:29:33,992 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:39,635 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:29:39,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1767693.3333333333, ans=0.2 2023-10-04 19:29:41,016 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:29:41,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:44,247 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. 
Duration: 0.92725 2023-10-04 19:29:45,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:29:45,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1767693.3333333333, ans=0.125 2023-10-04 19:29:48,195 WARNING [train.py:1204] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:29:48,205 WARNING [train.py:1204] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 19:29:49,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1767693.3333333333, ans=0.1 2023-10-04 19:29:51,039 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:29:52,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1767760.0, ans=0.125 2023-10-04 19:29:53,748 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:29:53,781 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:29:55,170 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:29:55,174 WARNING [train.py:1204] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 19:29:57,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-10-04 19:29:58,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:02,784 WARNING [train.py:1204] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:02,801 WARNING [train.py:1204] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 19:30:02,850 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 19:30:03,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1767760.0, ans=0.0 2023-10-04 19:30:04,269 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:30:05,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1767826.6666666667, ans=0.0 2023-10-04 19:30:08,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1767826.6666666667, ans=0.125 2023-10-04 19:30:11,167 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:30:12,456 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 19:30:13,754 WARNING [train.py:1204] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:30:13,767 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. 
Duration: 0.8181875 2023-10-04 19:30:15,856 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:30:17,320 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 19:30:17,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:17,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1767826.6666666667, ans=0.0 2023-10-04 19:30:18,782 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 19:30:18,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:30:20,237 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:30:20,299 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 19:30:21,014 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.26 vs. limit=15.0 2023-10-04 19:30:29,556 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:34,364 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:30:35,631 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:30:36,959 INFO [train.py:1046] (1/4) Epoch 50, batch 4900, loss[loss=0.1396, simple_loss=0.2197, pruned_loss=0.02978, over 24470.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2339, pruned_loss=0.03586, over 4712759.59 frames. ], batch size: 58, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:30:39,827 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 19:30:39,829 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:30:44,084 WARNING [train.py:1204] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:30:45,733 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:30:45,763 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:30:49,030 WARNING [train.py:1204] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 19:30:53,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 19:30:56,356 WARNING [train.py:1204] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 19:30:58,338 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 19:31:00,169 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. 
Duration: 0.72225 2023-10-04 19:31:00,207 WARNING [train.py:1204] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:31:02,172 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:31:02,193 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:31:02,202 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:31:02,280 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 19:31:05,019 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 19:31:05,074 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:31:06,262 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.161e+02 2.469e+02 3.032e+02 6.207e+02, threshold=4.938e+02, percent-clipped=2.0 2023-10-04 19:31:07,741 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:31:09,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:31:10,495 WARNING [train.py:1204] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:31:11,863 WARNING [train.py:1204] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:31:13,287 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:13,300 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 19:31:14,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.37 vs. limit=10.0 2023-10-04 19:31:16,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:31:17,382 WARNING [train.py:1204] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:31:17,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 19:31:17,402 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 19:31:20,529 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 19:31:22,025 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:31:22,109 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:31:22,151 WARNING [train.py:1204] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:31:23,464 WARNING [train.py:1204] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:31:23,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 19:31:23,521 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:31:23,560 WARNING [train.py:1204] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 19:31:25,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1768160.0, ans=0.07 2023-10-04 19:31:27,879 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:29,324 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:31:31,265 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:31:33,328 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 19:31:34,669 WARNING [train.py:1204] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:31:34,735 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 19:31:34,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 19:31:39,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. 
Duration: 0.936375 2023-10-04 19:31:40,361 WARNING [train.py:1204] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:31:41,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 19:31:41,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:31:41,722 WARNING [train.py:1204] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:31:43,125 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:46,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1768226.6666666667, ans=0.0 2023-10-04 19:31:47,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:31:47,853 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:31:47,880 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:31:47,905 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 19:31:49,312 WARNING [train.py:1204] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. 
Duration: 0.5181875 2023-10-04 19:31:51,958 INFO [train.py:1046] (1/4) Epoch 50, batch 4950, loss[loss=0.1486, simple_loss=0.2304, pruned_loss=0.0334, over 23610.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2322, pruned_loss=0.03541, over 4699497.37 frames. ], batch size: 120, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:31:52,042 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:31:52,065 WARNING [train.py:1204] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:31:57,124 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 19:31:57,153 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 19:31:57,179 WARNING [train.py:1204] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:31:57,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1768293.3333333333, ans=0.125 2023-10-04 19:31:58,534 WARNING [train.py:1204] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 19:31:58,558 WARNING [train.py:1204] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:58,567 WARNING [train.py:1204] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:32:00,490 WARNING [train.py:1204] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. 
Duration: 0.863625 2023-10-04 19:32:00,515 WARNING [train.py:1204] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:03,389 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:03,439 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:32:06,088 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:32:07,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:32:08,897 WARNING [train.py:1204] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:08,910 WARNING [train.py:1204] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:32:13,031 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:32:13,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1768360.0, ans=0.125 2023-10-04 19:32:15,927 WARNING [train.py:1204] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:18,001 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. 
Duration: 0.6545625 2023-10-04 19:32:19,393 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:19,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:20,757 WARNING [train.py:1204] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:32:23,463 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 19:32:23,531 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 19:32:26,706 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:28,645 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:32:28,665 WARNING [train.py:1204] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:32:30,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:32:30,032 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:32:31,358 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:32:32,769 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:32:34,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:32:36,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:32:37,396 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:37,428 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:38,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 19:32:38,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:32:38,909 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:32:43,096 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:32:43,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1768493.3333333333, ans=0.125 2023-10-04 19:32:45,135 WARNING [train.py:1204] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:32:46,242 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:32:46,287 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:32:47,554 WARNING [train.py:1204] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:32:47,620 WARNING [train.py:1204] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:32:50,295 WARNING [train.py:1204] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:32:50,349 WARNING [train.py:1204] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:32:50,384 WARNING [train.py:1204] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:51,795 WARNING [train.py:1204] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 19:32:56,460 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:03,059 WARNING [train.py:1204] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 19:33:03,076 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 19:33:07,077 INFO [train.py:1046] (1/4) Epoch 50, batch 5000, loss[loss=0.1646, simple_loss=0.2494, pruned_loss=0.03984, over 24383.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.232, pruned_loss=0.03539, over 4696625.56 frames. ], batch size: 77, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:33:08,744 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:33:08,755 WARNING [train.py:1204] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:33:10,117 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 19:33:11,467 WARNING [train.py:1204] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 19:33:12,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:33:14,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 19:33:14,931 WARNING [train.py:1204] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:33:14,946 WARNING [train.py:1204] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:33:16,263 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 19:33:16,299 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:17,649 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:33:18,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 19:33:18,993 WARNING [train.py:1204] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. 
Duration: 0.9090625 2023-10-04 19:33:19,046 WARNING [train.py:1204] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:33:19,165 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 19:33:20,397 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 19:33:21,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:33:21,831 WARNING [train.py:1204] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 19:33:21,838 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:33:23,184 WARNING [train.py:1204] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:24,535 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:33:24,537 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 19:33:24,543 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 19:33:25,955 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 19:33:25,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:33:27,915 WARNING [train.py:1204] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:29,315 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 19:33:29,335 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:33:31,292 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:32,610 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:34,507 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 19:33:35,936 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 19:33:37,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.194e+02 2.487e+02 3.099e+02 5.334e+02, threshold=4.974e+02, percent-clipped=1.0 2023-10-04 19:33:37,296 WARNING [train.py:1204] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:33:38,709 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:33:42,780 WARNING [train.py:1204] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 19:33:43,449 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.64 vs. 
limit=15.0 2023-10-04 19:33:44,316 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:33:46,346 WARNING [train.py:1204] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:46,348 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:33:49,193 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 19:33:50,414 WARNING [train.py:1204] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:50,448 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:33:51,725 WARNING [train.py:1204] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:33:51,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1768826.6666666667, ans=0.125 2023-10-04 19:33:53,197 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 19:33:53,246 WARNING [train.py:1204] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:33:57,462 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. 
Duration: 0.8818125 2023-10-04 19:33:57,552 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:03,656 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 19:34:07,022 WARNING [train.py:1204] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:17,241 WARNING [train.py:1204] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:34:18,648 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:18,655 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:34:18,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:34:19,958 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:34:19,987 WARNING [train.py:1204] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:34:20,029 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:21,301 INFO [train.py:1046] (1/4) Epoch 50, batch 5050, loss[loss=0.1554, simple_loss=0.2427, pruned_loss=0.03409, over 24473.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2325, pruned_loss=0.03538, over 4702097.07 frames. 
], batch size: 66, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:34:25,481 WARNING [train.py:1204] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:25,502 WARNING [train.py:1204] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 19:34:26,900 WARNING [train.py:1204] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:34:27,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1768960.0, ans=0.125 2023-10-04 19:34:28,342 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:34:28,429 WARNING [train.py:1204] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:34:29,778 WARNING [train.py:1204] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 19:34:31,138 WARNING [train.py:1204] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:33,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:34:35,690 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:34:37,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-10-04 19:34:37,674 WARNING [train.py:1204] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:34:46,423 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 19:34:47,706 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:34:47,785 WARNING [train.py:1204] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:34:49,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 19:34:49,252 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:34:50,625 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:50,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:51,351 WARNING [train.py:1204] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:34:51,354 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 19:34:54,005 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 19:34:54,094 WARNING [train.py:1204] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:34:56,047 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.94 vs. limit=10.0 2023-10-04 19:34:56,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:34:59,736 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:59,776 WARNING [train.py:1204] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 19:35:01,148 WARNING [train.py:1204] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:35:04,532 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 19:35:04,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1769160.0, ans=0.125 2023-10-04 19:35:05,804 WARNING [train.py:1204] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:35:05,836 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:35:07,233 WARNING [train.py:1204] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:07,317 WARNING [train.py:1204] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:35:09,273 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. 
Duration: 0.9 2023-10-04 19:35:12,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:35:13,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:13,731 WARNING [train.py:1204] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:35:13,738 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:35:15,051 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 19:35:16,417 WARNING [train.py:1204] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:35:17,835 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:35:20,683 WARNING [train.py:1204] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:35:20,689 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 19:35:20,703 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:35:22,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:35:22,713 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. 
Duration: 0.963625 2023-10-04 19:35:22,741 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 19:35:25,484 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:35:25,496 WARNING [train.py:1204] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 19:35:25,496 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:29,658 WARNING [train.py:1204] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:29,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1769226.6666666667, ans=0.0 2023-10-04 19:35:31,026 WARNING [train.py:1204] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:31,045 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 19:35:32,416 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 19:35:34,227 WARNING [train.py:1204] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:35:34,243 WARNING [train.py:1204] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:35:34,284 WARNING [train.py:1204] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. 
Duration: 0.8555625 2023-10-04 19:35:35,540 INFO [train.py:1046] (1/4) Epoch 50, batch 5100, loss[loss=0.1721, simple_loss=0.2576, pruned_loss=0.04335, over 23278.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.234, pruned_loss=0.03566, over 4706210.50 frames. ], batch size: 93, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:35:38,314 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 19:35:40,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1769293.3333333333, ans=0.125 2023-10-04 19:35:40,645 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.68 vs. limit=22.5 2023-10-04 19:35:41,550 WARNING [train.py:1204] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:35:44,788 WARNING [train.py:1204] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 19:35:44,847 WARNING [train.py:1204] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 19:35:44,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:35:47,598 WARNING [train.py:1204] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:35:50,291 WARNING [train.py:1204] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:35:50,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. 
Duration: 0.96 2023-10-04 19:35:50,363 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 19:35:54,862 WARNING [train.py:1204] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:54,907 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:35:59,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:36:00,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 19:36:01,956 WARNING [train.py:1204] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:36:02,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1769360.0, ans=0.125 2023-10-04 19:36:02,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.26 vs. limit=15.0 2023-10-04 19:36:04,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.080e+02 2.334e+02 2.937e+02 4.893e+02, threshold=4.667e+02, percent-clipped=0.0 2023-10-04 19:36:04,789 WARNING [train.py:1204] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:36:04,806 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. 
Duration: 0.588875 2023-10-04 19:36:08,142 WARNING [train.py:1204] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:08,216 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:08,221 WARNING [train.py:1204] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 19:36:11,452 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 19:36:12,810 WARNING [train.py:1204] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:12,859 WARNING [train.py:1204] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 19:36:12,868 WARNING [train.py:1204] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 19:36:16,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1769426.6666666667, ans=22.5 2023-10-04 19:36:16,957 WARNING [train.py:1204] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:36:24,602 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:36:27,939 WARNING [train.py:1204] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 19:36:27,973 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9309W0192-640319 from training. 
Duration: 0.512 2023-10-04 19:36:27,981 WARNING [train.py:1204] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 19:36:29,407 WARNING [train.py:1204] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 19:36:29,409 WARNING [train.py:1204] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:33,748 WARNING [train.py:1204] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 19:36:37,162 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 19:36:39,899 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:36:41,270 WARNING [train.py:1204] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:36:41,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1769560.0, ans=0.0 2023-10-04 19:36:44,494 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 19:36:46,068 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 19:36:46,113 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 19:36:50,140 INFO [train.py:1046] (1/4) Epoch 50, batch 5150, loss[loss=0.1674, simple_loss=0.2517, pruned_loss=0.04156, over 24310.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2349, pruned_loss=0.03579, over 4709295.06 frames. 
], batch size: 77, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:36:51,991 WARNING [train.py:1204] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:36:52,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:36:52,009 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:36:53,316 WARNING [train.py:1204] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:36:53,336 WARNING [train.py:1204] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:36:53,403 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:36:54,843 WARNING [train.py:1204] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 19:36:54,846 WARNING [train.py:1204] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 19:36:54,886 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 19:36:54,912 WARNING [train.py:1204] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:36:54,922 WARNING [train.py:1204] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 19:36:56,344 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:36:56,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1769626.6666666667, ans=0.0 2023-10-04 19:36:57,675 WARNING [train.py:1204] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 19:36:59,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1769626.6666666667, ans=0.125 2023-10-04 19:37:00,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:00,919 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:02,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1769626.6666666667, ans=0.125 2023-10-04 19:37:05,119 WARNING [train.py:1204] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:37:05,137 WARNING [train.py:1204] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 19:37:06,533 WARNING [train.py:1204] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:06,586 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:37:09,697 WARNING [train.py:1204] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:37:09,698 WARNING [train.py:1204] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:37:09,715 WARNING [train.py:1204] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:37:09,770 WARNING [train.py:1204] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:37:09,775 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:37:11,117 WARNING [train.py:1204] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 19:37:12,494 WARNING [train.py:1204] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:37:12,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:37:14,576 WARNING [train.py:1204] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:37:16,079 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 19:37:17,420 WARNING [train.py:1204] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:37:21,966 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:37:24,564 WARNING [train.py:1204] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 19:37:28,718 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:37:36,234 WARNING [train.py:1204] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:37:37,628 WARNING [train.py:1204] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:41,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:37:41,046 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:37:43,762 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 19:37:47,059 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:48,449 WARNING [train.py:1204] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:37:48,489 WARNING [train.py:1204] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:37:53,041 WARNING [train.py:1204] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:37:53,129 WARNING [train.py:1204] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:37:54,460 WARNING [train.py:1204] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 19:37:58,716 WARNING [train.py:1204] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-10-04 19:37:58,811 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:38:00,211 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:38:00,226 WARNING [train.py:1204] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:38:02,050 WARNING [train.py:1204] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:38:02,079 WARNING [train.py:1204] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:38:02,089 WARNING [train.py:1204] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:38:03,341 WARNING [train.py:1204] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:38:04,697 INFO [train.py:1046] (1/4) Epoch 50, batch 5200, loss[loss=0.1692, simple_loss=0.256, pruned_loss=0.04122, over 24660.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2347, pruned_loss=0.03581, over 4721362.51 frames. ], batch size: 68, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:38:06,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:38:08,773 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:38:12,028 WARNING [train.py:1204] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-10-04 19:38:13,544 WARNING [train.py:1204] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 19:38:15,516 WARNING [train.py:1204] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:38:15,574 WARNING [train.py:1204] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:16,898 WARNING [train.py:1204] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:18,318 WARNING [train.py:1204] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:38:18,339 WARNING [train.py:1204] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:19,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1770026.6666666667, ans=0.125 2023-10-04 19:38:20,968 WARNING [train.py:1204] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 19:38:21,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1770026.6666666667, ans=0.1 2023-10-04 19:38:24,087 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:38:24,145 WARNING [train.py:1204] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. 
Duration: 0.9454375 2023-10-04 19:38:25,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1770026.6666666667, ans=0.2 2023-10-04 19:38:28,314 WARNING [train.py:1204] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 19:38:29,772 WARNING [train.py:1204] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:38:31,653 WARNING [train.py:1204] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:38:31,707 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 19:38:33,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 19:38:34,487 WARNING [train.py:1204] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 19:38:34,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1770093.3333333333, ans=0.125 2023-10-04 19:38:35,662 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.266e+02 2.653e+02 3.411e+02 5.060e+02, threshold=5.305e+02, percent-clipped=4.0 2023-10-04 19:38:37,116 WARNING [train.py:1204] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:38:37,119 WARNING [train.py:1204] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 19:38:37,126 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-10-04 19:38:37,251 WARNING [train.py:1204] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:38:38,465 WARNING [train.py:1204] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:38:38,511 WARNING [train.py:1204] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 19:38:39,860 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:38:41,895 WARNING [train.py:1204] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:46,009 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 19:38:46,047 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 19:38:46,076 WARNING [train.py:1204] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 19:38:50,710 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 19:38:52,085 WARNING [train.py:1204] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:38:56,787 WARNING [train.py:1204] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:38:56,815 WARNING [train.py:1204] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-10-04 19:38:58,256 WARNING [train.py:1204] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 19:38:59,647 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:59,676 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 19:38:59,679 WARNING [train.py:1204] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:38:59,712 WARNING [train.py:1204] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:39:01,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1770160.0, ans=0.2 2023-10-04 19:39:04,177 WARNING [train.py:1204] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:39:04,275 WARNING [train.py:1204] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:39:07,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1770226.6666666667, ans=0.125 2023-10-04 19:39:08,366 WARNING [train.py:1204] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:39:08,451 WARNING [train.py:1204] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:39:08,452 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:13,038 WARNING [train.py:1204] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:39:14,402 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 19:39:15,749 WARNING [train.py:1204] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:39:15,765 WARNING [train.py:1204] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:39:17,189 WARNING [train.py:1204] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:18,829 INFO [train.py:1046] (1/4) Epoch 50, batch 5250, loss[loss=0.1565, simple_loss=0.2372, pruned_loss=0.03788, over 23428.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2336, pruned_loss=0.03565, over 4727086.60 frames. ], batch size: 105, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:39:18,896 WARNING [train.py:1204] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:39:19,002 WARNING [train.py:1204] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:39:21,717 WARNING [train.py:1204] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:39:25,160 WARNING [train.py:1204] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. 
Duration: 0.9909375 2023-10-04 19:39:25,203 WARNING [train.py:1204] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:39:27,818 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:39:32,118 WARNING [train.py:1204] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:39:34,090 WARNING [train.py:1204] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:39:36,794 WARNING [train.py:1204] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:39:36,894 WARNING [train.py:1204] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:39:38,239 WARNING [train.py:1204] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 19:39:38,254 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:40,929 WARNING [train.py:1204] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:50,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1770426.6666666667, ans=0.125 2023-10-04 19:39:55,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.58 vs. 
limit=6.0 2023-10-04 19:39:58,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1770426.6666666667, ans=0.125 2023-10-04 19:40:01,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1770493.3333333333, ans=0.5 2023-10-04 19:40:06,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1770493.3333333333, ans=0.2 2023-10-04 19:40:12,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1770493.3333333333, ans=0.125 2023-10-04 19:40:13,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1770560.0, ans=0.5 2023-10-04 19:40:27,584 INFO [train.py:1046] (1/4) Epoch 50, batch 5300, loss[loss=0.1435, simple_loss=0.2228, pruned_loss=0.03215, over 24467.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.233, pruned_loss=0.03558, over 4718702.30 frames. ], batch size: 58, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:40:29,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1770626.6666666667, ans=0.0 2023-10-04 19:40:37,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1770626.6666666667, ans=0.1 2023-10-04 19:40:38,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1770626.6666666667, ans=0.125 2023-10-04 19:40:42,135 INFO [train.py:1310] (1/4) Done!